User topics

Article topics

I got 502 problems, and Cloudflare sure is one: Outage interrupts your El Reg-reading pleasure for almost half an hour

Cloudflare, the outfit noted for the slogan "helping build a better internet", had another wobble today as "network performance issues" rendered websites around the globe inaccessible. The US tech biz updated its status page at 1352 UTC to indicate that it was aware of issues, but things began tottering quite a bit earlier. …

COMMENTS

Post your comment

House rules Send corrections

Add to 'My topics'

Tuesday 2nd July 2019 15:48 GMT Spacedinvader

Is El Reg

pondering if maybe they depend on its services just a little too much?

25 0 Reply
1. Tuesday 2nd July 2019 17:03 GMT Martin Summers
  
  Re: Is El Reg
  
  Well it seems these issues have only been happening since El Reg moved to using them. I think it's just so popular they keep knocking Cloudflare over. </end brown nosing>
  
  24 0 Reply
  1. Tuesday 2nd July 2019 17:40 GMT diodesign
    
    'knocking Cloudflare over'
    
    Aw shucks. Don't let the other websites know, they'll only get envious.
    
    C.
    
    13 0 Reply
    1. Tuesday 2nd July 2019 20:05 GMT Youngone
      
      Re: 'knocking Cloudflare over'
      
      Wait, there are other websites?
      
      30 0 Reply
      1. Wednesday 3rd July 2019 07:56 GMT Flywheel
        
        Re: 'knocking Cloudflare over'
        
        Nah, that's Fake News, surely ?!
        
        3 0 Reply
2. Tuesday 2nd July 2019 17:37 GMT diodesign
  
  Re: Is El Reg
  
  Given that we've faced multi-gigabit DDoS waves in the past for annoying black hats, Cloudflare's CDN is particularly useful in staying online at the moment.
  
  We are planning to expand our infrastructure tho to improve connectivity (and then IPv6 etc etc)
  
  C.
  
  32 0 Reply
  1. Wednesday 3rd July 2019 08:41 GMT Nick Ryan
    
    Re: Is El Reg
    
    Given that we've faced multi-gigabit DDoS waves in the past for annoying black hats, Cloudflare's CDN is particularly useful in staying online at the moment.
    
    ...and Microsoft and Apple and IBM and Sun and Google and Adobe.. and [n]. :)
    
    Unless it turned into a sales advertorial for cloudfare, a write up of the scale and what it takes to keep el reg online, it would be quite an interesting write-up for us commentards to read. Without wanting to encourage more attacks of course...
    
    3 0 Reply
Tuesday 2nd July 2019 15:52 GMT ArrZarr

I will most certainly be using Cloudflare's update including the phrase "...caused primary and secondary systems to fall over." as a reason to include the term "Falling over" as a technical term for TITSUP* situations. If it's good enough for their CEO, it's good enough for me.

*Total Inability To Send Users Pages

23 0 Reply
1. Tuesday 2nd July 2019 16:14 GMT Anonymous Coward
  
  Gin or Vodka?
  
  When Cloud Faire engineers go on a bender at work; is their favorite spirit Gin or Vodka? Obviously day drinking is a requirement for working at Cloud Flare... But all their techs should be reminded that drinking Gin or Whisky is preferable to Vodak; as then management can tell customers their Brainiacs were drunk, not Stupid!
  
  0 11 Reply
2. Tuesday 2nd July 2019 19:58 GMT jake
  
  "falling over" or "fell over" ...
  
  ... has long been used for a system crash, ABEND or other TITSUP[0] event. See here. I don't know how far back the term goes, but it was in common use when I was hacking the pre-BSD kernel at Berkeley.
  
  [0] Today It Totally Stopped User Processes
  
  10 0 Reply
  1. Wednesday 3rd July 2019 07:51 GMT ArrZarr
    
    Re: "falling over" or "fell over" ...
    
    It has, however, always been a fairly informal way of indicating a TITSUP*. Hopefully we can now treat it as a technical term.
    
    *Toppling IT Servers Uncovering Problems
    
    2 0 Reply
    1. Wednesday 3rd July 2019 08:44 GMT jake
      
      Re: "falling over" or "fell over" ...
      
      You may call it "informal", but I've been hearing it at Board level meetings and seeing it written in failure reports for several decades now. At the very least, it's in the common vernacular.
      
      When you think about it, it is one of the few technical terms that you don't have to translate into single syllable words before the C* suite understands it. Handy.
      
      1 0 Reply
      1. Wednesday 3rd July 2019 14:30 GMT ArrZarr
        
        Re: "falling over" or "fell over" ...
        
        Screw the C-Suite, I'm talking about client comms here - the people doing the work at the client will also need to translate "Greatly increased CPU load leading to cascading server failures" into "Fell over", but the externally facing paper trail is the formal bit.
        
        0 0 Reply
        
        Wednesday 3rd July 2019 17:00 GMT jake
        
        Re: "falling over" or "fell over" ...
        
        I've been using it with clients for decades, too.
        
        It's a known thing everywhere I'm aware of.
        
        Try it (in your example "The computer was overloaded and fell over"). Report back.
        
        1 0 Reply
Tuesday 2nd July 2019 16:00 GMT Falmari

Ah so that explains why access to El Reg was crap about an hour ago.

6 0 Reply
Tuesday 2nd July 2019 16:09 GMT Pen-y-gors

502

Got a 502 on ElReg a bit earlier - assumed it was just Putin or the Chinese DDOSing again.

2 2 Reply
Tuesday 2nd July 2019 16:20 GMT LeahroyNake

All eggs one basket?

Now it seems that you can put your eggs in lots of baskets and if one of them goes bye bye than so does whatever is depending on it / ElReg.

12 0 Reply
1. Tuesday 2nd July 2019 17:27 GMT Vometia Munro
  
  Re: All eggs one basket?
  
  Hmm, saw "eggs" and managed to read the rest of the sentence as being about lots of breakfasts and digesting.
  
  2 0 Reply
2. Thursday 4th July 2019 05:21 GMT VikiAi
  
  Re: All eggs one basket?
  
  Distributing your eggs in many baskets, then putting all your baskets in one cart.
  
  2 0 Reply
Tuesday 2nd July 2019 16:31 GMT Chris G

Old man shouts at cloud

That was me, I was also shouting at a pc and a tablet!

14 0 Reply
Tuesday 2nd July 2019 16:49 GMT DJV

"bad software deploy"

Aha, the old BOFH pad of random excuses!

"Internal teams are meeting as I write performing a full post-mortem to understand how this occurred and how we prevent this from ever occurring again."

...until the next time, that is!

12 0 Reply
1. Tuesday 2nd July 2019 18:51 GMT John 104
  
  Re: "bad software deploy"
  
  It's called testing. It requires a test lab. And not the dev's laptop. I've worked in IT for around 20 years and I've worked at 1 company that actually had a copy their production environment to test on. We never had a deployment failure. Not once. Everyone else just mangles together something in a half baked effort and then management screams bloody murder when a deployment goes sideways. This is, of course, after being told that spending on a proper lab would be ideal...
  
  7 1 Reply
  1. Tuesday 2nd July 2019 19:07 GMT IGotOut
    
    Re: "bad software deploy"
    
    I take it you understand the size of Cloudflare? That will be one hell of a test lab. And even then something as a slight different NIC can take it down. All the best laid plans etc
    
    12 0 Reply
    1. Tuesday 2nd July 2019 23:56 GMT Yet Another Anonymous coward
      
      Re: "bad software deploy"
      
      Couldn't they have tested on a spare internet?
      
      17 0 Reply
    2. Wednesday 3rd July 2019 08:34 GMT yoganmahew
      
      Re: "bad software deploy"
      
      Yeah, the whole 100% test coverage is spin from some consultant based on some academic with a toy network. It's complete spoof.
      
      1 0 Reply
  2. Tuesday 2nd July 2019 19:38 GMT Claptrap314
    
    Re: "bad software deploy"
    
    Yeah, I've worked at G. Studied some of the FB papers. When you do this stuff at scale, even when you do it right, human error happens. That includes when you try to figure out which human error can happen and what to do about them.
    
    I've also done microprocessor validation at AMD & IBM, so even if all of your code and processes are perfect, it is in fact possible (although HIGHLY unlikely) that the processor executing the code will itself have a different idea.
    
    So if you were for a period of time at a place that had good processes, and was small enough that no fails happened anyway, that's wonderful. But don't expect that experience to scale, because it does not.
    
    10 0 Reply
  3. Wednesday 3rd July 2019 06:53 GMT Potemkine!
    
    Shit happens
    
    Problem is with such a huge system you can't have a testing environment the same size than the production one, there's no InternetTest network.
    
    Of course, having a test lab is a very good thing and avoids a lot of problems, but there may still be real-life conditions that can't be emulated, it cannot be an absolute guarantee against failure.
    
    IT is so complex I even wonder why it doesn't fail more often ^^
    
    6 0 Reply
Tuesday 2nd July 2019 16:49 GMT chivo243

Scared

We were doing some switch upgrades today, and EL Reg didn't come up afterwards, then BAM 502! I was never so glad to see that message, we were done and packed up already!

11 0 Reply
1. Tuesday 2nd July 2019 19:39 GMT Claptrap314
  
  Re: Scared
  
  My first connect test is ping 1.1.1.1. Let's not insist that TCP is working right away...
  
  1 0 Reply
Tuesday 2nd July 2019 17:15 GMT Ken Moorhouse

fingering a colossal spike in CPU usage

Is it Patch Tuesday already?

3 0 Reply
Tuesday 2nd July 2019 17:48 GMT Anonymous South African Coward

Today's excuse:

Dynamic Programming Interrupt

Sounds feasible. Okay, let's do it.

2 0 Reply
Tuesday 2nd July 2019 18:27 GMT Blockchain commentard

Strange - when I couldn't get on el Reg, it said something about Cloudflare. Went onto another UK Cloudflare based site and it worked fine. Looks like they were probably just taking el Reg down.

1 0 Reply
Tuesday 2nd July 2019 18:30 GMT The Original Steve

Independence

Am I the only person who is a little uncomfortable about Cloudflare? Not just it's dominance in the market it plays in, but also that El Reg uses it.

I have nothing against them, and actually think they are a great company who have done some incredible innovation. I have no issue with them per se. But it just doesn't fit right to me that the mighty El Reg - who operate using open source (https://www.theregister.co.uk/about/company/website/) - have such a dependency on a commercial 3rd party.

Where does it end? The ethos of El Reg comes across to me as being fiercely independent which I like (they have cynicism for all IT vendors equally), but being so dependent on a sole provider just doesn't seem right. I'd like to think that they have half their servers in one colo, and their others in a different one, with different telco's (inc backhauls) supplying connectivity.

I know that they'll likely be dependent on lots of commercial 3rd parties (from hosting to water supplier) but the (valid) DDoS comment aside it's an optional choice to place your tin behind Cloudflare, not a technical necessity. Proudly declaring your technology stack which is all open source on your website just doesn't seem to fit with funneling every inbound packet over single for-profit 3rd party. Might as well use Microsoft/Oracle/IBM (urgh - I feel dirty even writing that) if you're going to give up any semblance of ownership and independence by slinging everything to a commercial 3rd party.

(I know that Cloudflare are also big users and contributors of OSS - it's not that I think it's proprietary - it just doesn't seem to fit with the independent nature of El Reg. I have a huge amount of respect for both organisations and wish them all the very best)

3 4 Reply
1. Tuesday 2nd July 2019 18:37 GMT Anonymous Coward
  
  Re: Independence
  
  Independent, not wealthy. And there is no opensource fiber.
  
  4 0 Reply
  1. This post has been deleted by its author
2. Tuesday 2nd July 2019 18:48 GMT Anonymous Coward
  
  Re: Independence
  
  Sometimes the IT that El Reg bites, bites back I guess.
  
  9 0 Reply
  1. Tuesday 2nd July 2019 23:53 GMT Anonymous Coward
    
    Re: Independence
    
    >Sometimes the IT that El Reg bites, bites back I guess.
    
    Nearly but perhaps..
    
    IT, biting the hand that feeds El Reg
    
    5 0 Reply
3. Tuesday 2nd July 2019 18:49 GMT Marco Fontani
  
  Re: Independence
  
  it just doesn't fit right to me that the mighty El Reg - who operate using open source [...] have such a dependency on a commercial 3rd party.
  
  We also have another hard dependency on a commercial third party in the form of the providers of the servers we use; same goes for the commercial third party OS installed in the load balancer, the firewall, etc. as well as other bits and pieces which there's either no free software or open source version available for, or for which it's infeasible to use one. I don't think it's avoidable much. Where should one stop? Organically in-house grown free BIOS-laden servers?
  
  DDoS comment aside it's an optional choice to place your tin behind Cloudflare, not a technical necessity
  
  Having a sorta kinda CDN in front of the infrastructure provides other technical tangible benefits. Substitute Cloudflare with Akamai or Fastly and it'd be kinda the same, modulo feature set. Should we hand-roll our own CDN? I strongly prefer not to, and I do like the fact I don't have to as there's a commercial service available which can do it for us. The only other alternative would be to not have one at all, and that'd be worse for us, even worse than having to hand-manage a home-rolled one.
  
  Unfortunately, as all things - sometimes things go TITSUP and there's not a lot we can do about it.
  
  At other times, some of our previous ISP's network went TITSUP - and there wasn't a lot we could do about it, either. We can control some things; just not all of them; or, if we can - it's probably too time consuming to control it down to the tiny bits.
  
  What we can and do control is what's running on our servers, and that's a fairly healthy mix of mostly free and open source software, with some commercial stuff peppered in-between.
  
  Just my 2c :)
  
  18 0 Reply
4. Tuesday 2nd July 2019 19:11 GMT IGotOut
  
  Re: Independence
  
  I've used Cloudflare on my little website and even with that I can see the advantages.
  
  Faster response times, less load on the server, huge savings on bandwidth, Geoblocking and on and on.
  
  They said what the fault was and they are trying to work out what happened.
  
  4 0 Reply
  1. Wednesday 3rd July 2019 06:56 GMT llaryllama
    
    Re: Independence
    
    Once you get into page rules and other features it's extremely powerful for the price. Most of the pages on our site are static so I set up page rules to cache them along with all the images, fonts etc. used by dynamic and static parts of the site. You can block or challenge visitors with lots of parameters to fine tune. Oh and you get brownie points with Google search rankings for having a fast site as well.
    
    1 0 Reply
Tuesday 2nd July 2019 20:00 GMT Ken Moorhouse

Next time...

...the residents of Guadalajara would appreciate a bit more warning.

https://www.bbc.co.uk/news/world-latin-america-48821306

1 0 Reply
Tuesday 2nd July 2019 20:10 GMT JohnFen

Single points of failure are bad

While I'm unaware of Cloudflare acting in an objectionable manner, the widespread use of Cloudflare has long caused me a great deal of nervousness, and this sort of thing is one of the reasons why.

I think it's a mistake for so much of the internet to be so centralized. It's a huge part of why the internet has become so brittle.

3 0 Reply
Tuesday 2nd July 2019 20:10 GMT Doctor Syntax

"the internet is a brittle thing"

When it was first designed it was supposed to be resilient, proof against chunks of infrastructure being taken out in a nuclear attack. "Routes round damage" was a popular term.

Somehow we forgot.

4 3 Reply
1. Tuesday 2nd July 2019 20:36 GMT Totally not a Cylon
  
  The internet is fine, it's just all the pages which are broken.
  
  A quick shufti at the page source is revealing.
  
  Back in the day we would hand code html to get the page and all images into a few KB to ensure fast loading on 300baud modems. It also made sites very resilient.
  
  Now it's all js script and dynamic pages with bits from ten's or hundred's of different sites; it only takes one of these to be Titsup to kill the original site. All because of 'metrics' and 'tracking'.
  
  13 0 Reply
  1. Wednesday 3rd July 2019 10:24 GMT defiler
    
    300 baud? HTML? You have a valid point without having to exaggerate. 9600bps was pretty mainstream when web pages first appeared, with 14400 available.
    
    You're right, though, in that I spent a lot of time hand-coding HTML, and squeezing every last byte it of images. Now bandwidth is plentiful so nobody cares. It's a choice between paying a human to optimise stuff Vs paying for bandwidth. Humans are expensive.
    
    Edit - you're definitely right about the tracking/metrics though!
    
    5 0 Reply
    1. Wednesday 3rd July 2019 15:18 GMT Blane Bramble
      
      Of course your (V32) 9600bps modem operated at 2400 baud.
      
      2 0 Reply
2. Tuesday 2nd July 2019 21:13 GMT jake
  
  "When it was first designed it was supposed to be resilient, proof against chunks of infrastructure being taken out in a nuclear attack."
  
  Oft repeated, but simply not true. The networks that were designed to survive nuclear attack included the "Minimum Essential Emergency Communications Network", or MEECN, and the prior "Survivable Low Frequency Communications System" or SLFCS, Besides, if you use an ounce of common sense, it only stands to reason ... no military would design a command and control system that inherently wasn't securable, and the Internet was not then, and still isn't, securable.
  
  In The Beginning, the first two nodes of what became TehIntraWebTubes were at SRI and UCLA, conceived, designed, implemented and run by students and professors. With no Pentagon oversight, input or anything else "intellectual". Money, yes. Oversight, no.
  
  Boiling it down to basics, the (D)ARPANET was just a research network designed to research networking. The "survives nukes" myth came about much later ... The only reason it was built to be resilient is because the existing hardware was really, really flaky.
  
  13 0 Reply
3. Wednesday 3rd July 2019 01:32 GMT doublelayer
  
  Even if we could magically decentralize CloudFlare and make people write nice HTML or at least store their own scripts, the internet wouldn't be a lot less fragile. The reason for that is that there are very few places that process all our traffic. There's only one line leading to your house that actually works, but that's a short length that isn't the main issue. The issue is that there's only one line that connects your ISP's local unit to whatever center they have for sending it out of local, and only a few lines (or maybe just one) connecting large areas to other large areas. What happens when cables stop working? Large parts of the internet lose connectivity. Routing around that kind of damage requires a web of lines, but a lot of the world operates on chains of lines instead. It's hopeless; the internet can't really route around damage. We just put our systems in lots of parts so we can weather most small disconnects and otherwise we're hoping nothing really bad happens.
  
  5 0 Reply
Tuesday 2nd July 2019 20:43 GMT Pirate Dave

You might know...

it fell over for me right as I clicked the link to go to the Comments section for the Deep Nudes story. At first I thought the company webfilter was gonna squeal about so many semi-naughty words on a page, then realized the 502 message was coming from Cloudflare. Phew, that was close...

Not that I read El Reg for fun at work. It's "Industry News", not leisure reading. Yeah...

8 0 Reply
1. Wednesday 3rd July 2019 11:54 GMT Anonymous Coward
  
  Re: You might know...
  
  Reading el Reg at work is staying abreast of industry trends, especially with articles like the Deepnudes one
  
  3 0 Reply
  1. Wednesday 3rd July 2019 12:24 GMT Anonymous Coward
    
    Re: You might know...
    
    abreast
    
    I see what you did there.
    
    4 0 Reply
    1. Wednesday 3rd July 2019 13:31 GMT Anonymous Coward
      
      Re: You might know...
      
      Shamelessly stolen from BOFH.
      
      1 0 Reply
Wednesday 3rd July 2019 03:26 GMT Jove

Cloudflare are not the only ones that have problems ...

... it appears that BT's services took a hit, at least for part of the country, earlier today.

1 0 Reply
1. Wednesday 3rd July 2019 10:00 GMT TimR
  
  Re: Cloudflare are not the only ones that have problems ...
  
  And irony of ironies, down detector was down for a while - in UK, at least...
  
  2 0 Reply
  1. Wednesday 3rd July 2019 10:09 GMT Anonymous Cowtard
    
    Re: Cloudflare are not the only ones that have problems ...
    
    https://downforeveryoneorjustme.com/downdetector.com
    
    2 1 Reply
Wednesday 3rd July 2019 07:00 GMT Potemkine!

Bookmarked

The cause of the problem is...

1 0 Reply
Wednesday 3rd July 2019 07:37 GMT fastmack

don't update software on live systems daft or what raymond petrie aberdeen scotland

1 1 Reply
1. Wednesday 3rd July 2019 08:38 GMT Alister
  
  don't update software on live systems
  
  Right, they should have updated the backup internet first
  
  1 1 Reply
Wednesday 3rd July 2019 08:39 GMT Aladdin Sane

I was dangerously close to having to do some actual work. Please don't let it happen again.

7 0 Reply
1. Wednesday 3rd July 2019 09:54 GMT phuzz
  
  Pretty much all our customers use Cloudflare now, so after answering the panicked phone calls ("nope, it's cloudflare, nothing we can do, sorry"), it was pretty much beer-o-clock until the sods got it back running again.
  
  2 0 Reply
Wednesday 3rd July 2019 20:39 GMT fredesmite

Remember - Cloud computing

Is nothing more than some provider let you run your crap on their machines that other people are using at the same time.

2 0 Reply

POST COMMENT House rules

Not a member of The Register? Create a new account here.

Topics

Special Features

Vendor Voice

Resources

COMMENTS

Is El Reg

Re: Is El Reg

'knocking Cloudflare over'

Re: 'knocking Cloudflare over'

Re: 'knocking Cloudflare over'

Re: Is El Reg

Re: Is El Reg

Gin or Vodka?

"falling over" or "fell over" ...

Re: "falling over" or "fell over" ...

Re: "falling over" or "fell over" ...

Re: "falling over" or "fell over" ...

Re: "falling over" or "fell over" ...

502

All eggs one basket?

Re: All eggs one basket?

Re: All eggs one basket?

Old man shouts at cloud

"bad software deploy"

Re: "bad software deploy"

Re: "bad software deploy"

Re: "bad software deploy"

Re: "bad software deploy"

Re: "bad software deploy"

Shit happens

Scared

Re: Scared

fingering a colossal spike in CPU usage

Independence

Re: Independence

Re: Independence

Re: Independence

Re: Independence

Re: Independence

Re: Independence

Next time...

Single points of failure are bad

You might know...

Re: You might know...

Re: You might know...

Re: You might know...

Cloudflare are not the only ones that have problems ...

Re: Cloudflare are not the only ones that have problems ...

Re: Cloudflare are not the only ones that have problems ...

Bookmarked

Remember - Cloud computing

POST COMMENT House rules

Enter your comment

Add an icon

Other stories you might like

911 goes MIA across multiple US states, cause unclear

Sacramento airport goes no-fly after AT&T internet cable snipped

US-EAST-1 region is not the cloudy crock it's made out to be, claims AWS EC2 boss

Cyberattack hits Omni Hotels systems, taking out bookings, payments, door locks

Datacenter outages are on the decline, but when they hit, they hit hard

Cloudflare says it has automated empathy to avoid fixing flaky hardware too often

Tech trade union confirms cyberattack behind IT, email outage

McDonald's ordering system suffers McFlurry of tech troubles

Cloudflare wants to put a firewall in front of your LLM

LinkedIn's turn to fall over: Outage hits thinkfluencer hub

World-plus-dog booted out of Facebook, Instagram, Threads

AT&T's apology for Thursday's outage should stretch to a cup of coffee

About Us

Our Websites

Your Privacy