How a Single Rust Panic Crashed Cloudflare: A Deep Dive into the 2025 Outage (2026)

Imagine a single, unnoticed mistake in a piece of code bringing a significant portion of the internet to its knees. That’s exactly what happened on November 18, 2025, when Cloudflare’s infrastructure suffered a catastrophic failure akin to a digital heart attack. After frantic efforts to restore service, engineers traced the problem to something seemingly minor: a malformed, dynamically generated input file produced an error that Cloudflare’s Rust-based FL2 proxy never handled. The proxy panicked, which led to widespread HTTP 5xx errors and halted customer traffic. (Rust has no exceptions in the usual sense; the failure was a panic raised by calling unwrap() on an error value.) Interestingly, customers on the older FL proxy did not see these errors, although according to Cloudflare’s post-mortem their bot scores were not generated correctly. That contrast raises questions about the stability of newer systems. But here’s where it gets controversial: was this a preventable oversight, or an inevitable consequence of modern software complexity?

The culprit was a features file, generated dynamically from customer settings such as bot traffic management. A recent change caused this file to contain duplicate rows, bloating it from around 60 features to more than 200. That mattered because the FL2 proxy pre-allocates memory for this data, and the bloated file exceeded the pre-allocated capacity. The older FL proxy code handled the situation gracefully, but the FL2 code unwrapped the resulting error instead of handling it, triggering a cascade of failures. The result was a panic: thread fl2_worker_thread panicked: called Result::unwrap() on an Err value.
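To make the failure mode concrete, here is a minimal sketch of how a size check combined with unwrap() turns a bad input file into a crash. This is not Cloudflare’s code: the names load_features, make_rows, and LoadError are hypothetical, and the limit of 200 is an assumption that simply echoes the figure mentioned above.

```rust
use std::panic;

/// Assumed capacity for illustration only.
const MAX_FEATURES: usize = 200;

#[derive(Debug, PartialEq)]
enum LoadError {
    TooManyFeatures { found: usize, limit: usize },
}

/// Hypothetical loader: copies feature names into a buffer that is
/// pre-allocated for MAX_FEATURES entries, and refuses anything larger.
fn load_features(rows: &[String]) -> Result<Vec<String>, LoadError> {
    if rows.len() > MAX_FEATURES {
        return Err(LoadError::TooManyFeatures {
            found: rows.len(),
            limit: MAX_FEATURES,
        });
    }
    let mut buf = Vec::with_capacity(MAX_FEATURES);
    buf.extend_from_slice(rows);
    Ok(buf)
}

/// Builds a features file with `unique` features, each repeated `copies` times,
/// to mimic the duplicate rows described above.
fn make_rows(unique: usize, copies: usize) -> Vec<String> {
    (0..copies)
        .flat_map(|_| (0..unique).map(|i| format!("feature_{i}")))
        .collect()
}

fn main() {
    // A normal file fits comfortably, so unwrap() is harmless here.
    let normal = make_rows(60, 1);
    let loaded = load_features(&normal).unwrap();
    println!("normal file: {} features loaded", loaded.len());

    // Duplicate rows push the file past the limit. The check correctly
    // returns Err, but unwrap() converts that Err into a panic.
    let bloated = make_rows(60, 4);
    // catch_unwind is used only so this demo can report the panic and exit cleanly.
    let outcome = panic::catch_unwind(|| load_features(&bloated).unwrap());
    println!("bloated file caused a panic: {}", outcome.is_err());
}
```

The point of the sketch is that the limit check itself works as intended. The damage comes from the caller treating the error as something that can never happen.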

And this is the part most people miss: the root cause wasn’t a failure of Rust’s memory safety features. It was basic error handling and input validation being overlooked. The oversized file wasn’t flagged until it ran into the pre-allocated limit, and at that point the code called unwrap() on the resulting error instead of handling it. This is a classic example of assuming inputs will always stay within expected limits. As we’ve highlighted before (https://hackaday.com/2024/02/29/the-white-house-memory-safety-appeal-is-a-security-red-herring/), input validation and error handling remain the leading causes of critical vulnerabilities, regardless of the programming language. Even Rust, celebrated for its memory safety, can’t protect against human oversight.
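As a rough illustration of the alternative, the sketch below validates an incoming features file and keeps the last known-good configuration when validation fails. Everything here is hypothetical (FeatureStore, apply_update, validate, ConfigError, and the limit of 200), and whether dropping duplicate rows would be the right policy for a real system is an assumption made purely for the example.

```rust
use std::collections::HashSet;

/// Assumed capacity for illustration only.
const MAX_FEATURES: usize = 200;

#[derive(Debug, PartialEq)]
enum ConfigError {
    TooManyFeatures { found: usize, limit: usize },
}

/// Hypothetical validation step: drop duplicate rows (keeping first
/// occurrences in order), then enforce the size limit.
fn validate(rows: &[String]) -> Result<Vec<String>, ConfigError> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for row in rows {
        if seen.insert(row.as_str()) {
            unique.push(row.clone());
        }
    }
    if unique.len() > MAX_FEATURES {
        return Err(ConfigError::TooManyFeatures {
            found: unique.len(),
            limit: MAX_FEATURES,
        });
    }
    Ok(unique)
}

/// Holds the feature set currently in use.
struct FeatureStore {
    active: Vec<String>,
}

impl FeatureStore {
    /// Replaces the active features only if the update validates.
    /// On error, `active` is left untouched, so the caller can keep serving.
    fn apply_update(&mut self, rows: &[String]) -> Result<usize, ConfigError> {
        let validated = validate(rows)?;
        self.active = validated;
        Ok(self.active.len())
    }
}

fn main() {
    let mut store = FeatureStore {
        active: vec!["feature_0".to_string()],
    };

    // 250 distinct features: too many even after de-duplication.
    let oversized: Vec<String> = (0..250).map(|i| format!("feature_{i}")).collect();

    // The error is handled explicitly instead of being unwrapped.
    match store.apply_update(&oversized) {
        Ok(n) => println!("loaded {n} features"),
        Err(e) => eprintln!(
            "rejected update ({e:?}); keeping {} known-good feature(s)",
            store.active.len()
        ),
    }
}
```

The design choice is to fail the update rather than the process: a rejected file is logged and the previous configuration stays in place, which degrades freshness instead of availability.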

This incident serves as a stark reminder: no language or tool can replace rigorous testing, input validation, and error handling. Here’s a thought-provoking question: are we too quick to trust new technologies without addressing age-old programming fundamentals? Cloudflare’s experience suggests we might be. Hopefully, they’ve reverted to the reliable FL proxy and are reevaluating the need to rewrite code that wasn’t broken in the first place.

What’s your take? Is this a wake-up call for the industry, or an isolated incident? Share your thoughts in the comments—let’s spark a conversation about where we’re going wrong and how we can do better.


Author: Velia Krajcik
