Cloudflare

There were some issues with Cloudflare. Many sites were down for a few hours, not just Gaf.
But it seems they have sorted things out.


Internet infrastructure company Cloudflare was hit by an outage on Tuesday, knocking several major websites offline for global users.

Many sites came back online within a few hours. In an update to its status page around 9:57 a.m. ET, Cloudflare said it had implemented a fix to resolve the issues, though it noted some users may still have trouble accessing its online dashboard.

"We are continuing to monitor for errors to ensure all services are back to normal," the company added.
A Cloudflare spokesperson said the company observed a "spike in unusual traffic" to one of its services around 6:20 a.m. ET, causing some traffic passing through its network to experience errors.
"We do not yet know the cause of the spike in unusual traffic," the spokesperson added. "We are all hands on deck to make sure all traffic is served without errors."

A "spike in unusual traffic" sounds a lot like a DDoS attack. The question is: who did it?
 
Man, imagine storing your files on S3, running some workloads in AKS, and using Cloudflare as your CDN. Would have been a crazy month.
 
Base a lot of Internet infrastructure on a handful of companies, while highly qualified technicians are hard to find because they've already been replaced by AI - what could go wrong?
 
It was allegedly a maintenance issue and not a disruption.

Still, it serves as an example of what could occur during a worldwide outage. Dependence on a few big providers creates single points of failure (SPOFs).

We need our own Gafflare.
I was thinking about suggesting a GAF DDoS shield for redundancy or fallback. I've already seen some custom solutions in the wild. Could be pretty expensive, though.
 
Got any gaf
 

I've been warning people about the threat of cyber attacks increasing exponentially over the coming years as we are seeing daily attacks at work (British nuclear defence).

The attacks come mostly from Asia, but since the AI and LLM boom the cybersec team has been expanded 3x and is fully booked daily. One of the lead engineers told us it is getting harder and harder to stop them, and that Cloudflare is critical. I invested heavily in the stock 3 years ago.
 

When Cloudflare experienced a massive outage on Tuesday, many people, including the company's engineers, initially suspected a sophisticated DDoS attack. The company later explained that a flawed update to its server infrastructure caused a single file to malfunction. Several major outages in recent years have resulted from similar single points of failure.
However, the company eventually discovered that a permission change in a database system, made under a mistaken assumption about its behavior, had doubled the size of a file critical to Cloudflare's bot manager. This manager, which directs automated traffic through the company's systems, updates continuously in response to ever-evolving threats, but it also enforces certain file size limits to minimize memory consumption and ensure smooth performance.

When the bot manager updated with the inflated file, which exceeded those limits, the result was an error. The glitches were initially intermittent due to the time needed for the faulty file to update throughout the entire system. Cloudflare resolved the issue by reverting to an earlier version of the file at 11:30 and had restored all operations by noon.
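The failure mode described above can be sketched in a few lines. This is a hypothetical illustration, not Cloudflare's actual code: a loader that enforces a hard size limit on a continuously updated feature file, but falls back to the last known-good version when the limit is tripped instead of erroring out.

```python
# Hypothetical sketch of the failure mode: a feature file that is
# refreshed continuously, with a hard size cap to bound memory use.
# The names and the limit value are assumptions for illustration.

MAX_FEATURES = 200  # assumed hard limit

def load_features(new_features, last_good):
    """Accept the new feature list only if it fits the limit;
    otherwise keep serving the previous known-good list
    (failing back rather than failing the whole request path)."""
    if len(new_features) > MAX_FEATURES:
        # A duplicated/doubled file trips the limit here.
        return last_good
    return new_features

normal = ["feature_%d" % i for i in range(150)]  # fits the limit
doubled = normal * 2                             # 300 entries, over the cap
assert load_features(normal, []) == normal
assert load_features(doubled, normal) == normal  # falls back, not down
```

The point of the sketch is the design choice: a size check that rejects the bad file and keeps the old one degrades gracefully, whereas a check that simply raises an error turns one oversized file into a network-wide outage.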

That's it. A simple update, and part of the internet went out...
 
So the sensible thing going forward would be to require more than one input, and to roll out updates to smaller or safer parts of the system first, rather than letting one corrupt file take down the whole thing the moment it hits a live environment.
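The "roll out to smaller parts first" idea is essentially a canary deployment. A minimal sketch, with hypothetical names and stage fractions, might look like this: push the new config to a tiny slice of the fleet, check health, and only widen the rollout if error rates stay normal.

```python
# Hypothetical canary-rollout sketch (names and stages are assumptions).
# health_check(config, fraction) should return True if the slice of the
# fleet running the new config shows normal error rates.

def staged_rollout(new_config, health_check, stages=(0.01, 0.10, 0.50, 1.0)):
    """Deploy new_config to progressively larger fractions of the fleet.
    On the first unhealthy stage, stop and report a rollback."""
    deployed = 0.0
    for fraction in stages:
        if not health_check(new_config, fraction):
            return deployed, "rolled back"   # earlier slices revert too
        deployed = fraction
    return deployed, "fully deployed"

# A config whose feature list is over the (assumed) limit never gets
# past the 1% canary, so 99% of the fleet never sees it:
check = lambda cfg, frac: cfg["features"] <= 200
assert staged_rollout({"features": 300}, check) == (0.0, "rolled back")
assert staged_rollout({"features": 150}, check) == (1.0, "fully deployed")
```

With a staged rollout like this, a file that errors on first use would have broken a 1% canary, not the whole network at once.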
 