BGP super-blunder: How Verizon today sparked a 'cascading catastrophic failure' that knackered Cloudflare, Amazon, etc

Verizon sent a big chunk of the internet down a black hole this morning – and caused outages at Cloudflare, Facebook, Amazon, and others – after it wrongly accepted a network misconfiguration from a small ISP in Pennsylvania, USA. For nearly three hours, web traffic that was supposed to go to some of the biggest names online …

  1. This post has been deleted by its author

  2. Anonymous Coward

    We could always just...

    tag all free peer and customer transit routes with "no-export". That would reduce the damage small peers could do.

    1. Tom Samplonius

      Re: We could always just...

      "tag all free peer and customer transit routes with "no-export". That would reduce the damage small peers could do."

      That is not necessarily effective. A lot of networks strip all communities, including NO_EXPORT. It is pretty easy to configure a router to do that, which makes it pretty easy to do it accidentally.

      1. Jellied Eel Silver badge

        Re: We could always just...

        It also wouldn't necessarily work. So no-export means the route wouldn't be exported to any of the peer AS's (e)BGP neighbours, so networks beyond Verizon's AS would be unreachable. Or large parts of Verizon given regional ASs. And like Tom says, communities rely on the upstream to act on community tags. They're generally used downstream to give customers some choices wrt traffic engineering.

        The best approach is to enforce the use of route registries, and implement the correct filters to limit the damage.
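
The two objections above can be sketched in a few lines. This is only an illustration, assuming a simplified route model: the `Route` class and both helper functions are hypothetical names, and the only real-world detail is the NO_EXPORT community value defined in RFC 1997.

```python
from dataclasses import dataclass, field

# Well-known NO_EXPORT community from RFC 1997 (65535:65281).
NO_EXPORT = 0xFFFFFF01


@dataclass(frozen=True)
class Route:
    """Hypothetical, heavily simplified BGP route."""
    prefix: str
    communities: frozenset = field(default_factory=frozenset)


def strip_communities(route):
    """Model a network that removes every community on import."""
    return Route(route.prefix, frozenset())


def may_export_to_ebgp(route):
    """A router honouring NO_EXPORT refuses to pass the route to eBGP peers."""
    return NO_EXPORT not in route.communities


tagged = Route("192.0.2.0/24", frozenset({NO_EXPORT}))

# The tag works only while it survives: once stripped, the route can leak.
honoured = may_export_to_ebgp(tagged)
leaked = may_export_to_ebgp(strip_communities(tagged))
```

Here `honoured` is `False` and `leaked` is `True`, which is the point made in the replies: the tag protects nobody once any network along the path drops it.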

    2. MyffyW Silver badge

      Re: We could always just...

      Maybe we could have an idiocy bit, building on the successful work of RFC 3514?

      1. dbtx

        +1

        I suppose hardware designers could assign both bits to the same position in Hanlon's Register.

  3. Jellied Eel Silver badge

    RIR should be your friend

    Disclosure: Ex-Verizon

    So I think best practice for signing up a BGP customer starts with the routes they want to advertise. Then making sure there's a route object in the relevant registry showing the routes that will be advertised by the customer AS and an origin showing the transit provider(s). In AS702-land, that used to be enforced, but sometimes required a bit of handholding and help to get the correct routing registry data filled out. Then when the session's configured at the provider end, there should be a max prefix set on the BGP peering config, and a filter to only accept the routes assigned to the customer.

    In theory, that should stop these scenarios with either the prefix limit kicking in, or the route filters rejecting non-customer routes. In practice, it can be a bit of a ballache, especially in ARIN land where swamp routes often don't have route objects defined, or maintainers have long gone AWOL. RIPE users tend to be better behaved, and there's a bunch of software that will build filters based on the RIR data... But I sure as hell wouldn't trust a 'BGP optimiser' to do it for me, especially in a network as complex as Verizon's.
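
A rough sketch of the two safeguards described above, a max-prefix limit and a per-customer prefix filter. The function name, the exact-match policy and the example prefixes are assumptions for illustration, not any vendor's implementation; a real filter would be generated from registry data.

```python
import ipaddress


def filter_customer_routes(announced, registered, max_prefixes):
    """Apply a max-prefix limit, then an exact-match prefix filter."""
    if len(announced) > max_prefixes:
        # Real routers typically shut the session down at this point.
        raise RuntimeError("max-prefix limit exceeded")
    allowed = {ipaddress.ip_network(p) for p in registered}
    return [p for p in announced if ipaddress.ip_network(p) in allowed]


# Hypothetical customer with a single registered route object.
registered = ["198.51.100.0/24"]

# Normal case: the unregistered prefix is silently rejected.
accepted = filter_customer_routes(
    ["198.51.100.0/24", "203.0.113.0/24"], registered, max_prefixes=10
)

# Leak case: far more prefixes than the customer should ever send.
leak = [f"10.{i}.0.0/16" for i in range(50)]
try:
    filter_customer_routes(leak, registered, max_prefixes=10)
    session_dropped = False
except RuntimeError:
    session_dropped = True
```

Either check on its own would contain a leak of this shape: the limit trips on volume, and the filter rejects anything not registered to the customer.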

  4. Blockchain commentard Silver badge
    Coat

    I remember, years ago, the internet was designed to stop people taking a big chunk out and stuffing it up for everyone else. Ah, halcyon days.

    Mine's the one with your AS route in it !!!!!

    1. Alan Brown Silver badge

      "I remember, years ago, the internet was designed to stop people taking a big chunk out and stuffing it up for everyone else."

      Years ago, the first octet of an IPv4 address was supposed to represent the destination network and the second, the location inside that network.

      Things grew, numberspace got crowded and that tidy setup got obliterated. IPv6 is _big_ and _sparse_ precisely to allow the tidy setup to be maintained without panicking and stuffing numbers into every available gap.

  5. Yet Another Anonymous coward Silver badge

    I own the internet

    and so does my wife ....

  6. cabac

    You effing.....

    .......w*kers, stop pratting about and implement RPKI ROA validation. Follow the lead of the likes of AT&T and numerous other networks who understand how important the DFZ is as a shared worldwide resource and stop taking it for granted. There is no excuse for inaction on this, pull your fingers out your arse Verizon (and many others!)
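
For anyone unsure what ROA validation actually checks, here is a minimal sketch of the origin-validation outcome described in RFC 6811. The `ROA` tuple, the `validate` function and the example data (documentation prefixes and AS numbers) are hypothetical; real validators work from signed RPKI data rather than a hard-coded list.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    """Hypothetical Route Origin Authorisation record."""
    prefix: str
    max_length: int
    origin_as: int


def validate(prefix, origin_as, roas):
    """Return 'valid', 'invalid' or 'not-found' for an announcement."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue  # this ROA does not cover the announced prefix
        covered = True
        if roa.origin_as == origin_as and net.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: reject-worthy.
    return "invalid" if covered else "not-found"


roas = [ROA("192.0.2.0/24", 24, 64500)]

legit = validate("192.0.2.0/24", 64500, roas)
more_specific = validate("192.0.2.0/25", 64500, roas)
wrong_origin = validate("192.0.2.0/24", 64501, roas)
unknown = validate("198.51.100.0/24", 64500, roas)
```

With these inputs the results are `valid`, `invalid`, `invalid` and `not-found` respectively. A network that drops `invalid` routes would refuse both the more-specific and the wrong-origin announcement.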

    1. Anonymous Coward

      Re: You effing.....

      > .... w*kers

      Um. Wokers? Wakers? Wikers?

      1. Michael Wojcik Silver badge

        Re: You effing.....

        Wokers? Wakers? Wikers?

        Depends whether it's sh-wildcard or regular-expression syntax.
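
The difference is easy to demonstrate. A small sketch using Python's standard `fnmatch` and `re` modules, with a made-up word list:

```python
import fnmatch
import re

pattern = "w*kers"
words = ["wokers", "wakers", "wikers", "kers", "wwkers"]

# Shell wildcard: 'w', then any run of characters, then 'kers'.
glob_hits = [w for w in words if fnmatch.fnmatchcase(w, pattern)]

# Regular expression: zero or more 'w' characters, then 'kers'.
regex_hits = [w for w in words if re.fullmatch(pattern, w)]
```

As a wildcard the pattern matches everything in the list except `kers`; as a regular expression it matches only `kers` and `wwkers`.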

  7. Anonymous Coward

    whaaaaat, no...

    It can't be an American company that did this, must have been China, who else would do this kind of thing. Must have been China Telecom, who else, not Verizon.

    1. vtcodger Silver badge

      Re: whaaaaat, no...

      I believe that Iran, not China, is the current enemy of the week and has been for at least 72 hours.

      1. Rich 11 Silver badge

        Re: whaaaaat, no...

        No, no, no. Iran has always been the current enemy. China has always been the current friend.

        (I could make a joke about Trump and memory holes, but it would be insulting to goldfish and the poor little orange swimmy buggers don't deserve that.)

        1. Kiwi Silver badge
          Trollface

          Re: whaaaaat, no...

          and the poor little orange swimmy buggers don't deserve that

          VS the orange slimy bugger?...

        2. DCFusor Silver badge

          Re: whaaaaat, no...

          We've always been at war with EastAsia.

          1. MrReynolds2U
            Pint

            Re: whaaaaat, no...

            nice reference... have a pint on me :)

        3. Anne-Lise Pasch

          Re: whaaaaat, no...

          The bit that scares me is that I actually think Donald Trump can actively edit his memory to believe what he says, hence Iran is his enemy *and always has been*.

      2. phuzz Silver badge
        Trollface

        Re: whaaaaat, no...

        "I believe that Iran [...] is the current enemy of the week"

        The phrase you're looking for is: Iran is America's 'Great Satan'.

        1. Kiwi Silver badge
          Trollface

          Re: whaaaaat, no...

          The phrase you're looking for is: Iran is America's 'Great Satan'.

          Pretty sure Trump is US's 'great satan'. Especially if you consider 'satan' means 'accuser'.

          Perhaps in the run up to the next election he can change his catch-phrase a little and make it more accurate...

          "Make America 'Great Satan' again!"

    2. MyffyW Silver badge

      Re: whaaaaat, no...

      Everybody knows it was Emmanuel Goldstein. Or maybe Snowball....

    3. Anonymous Coward

      Re: whaaaaat, no...

      > It can't be an American company that did this

      I'm sure they have at least one Huawei router somewhere within Verizon that can be blamed

    4. RegGuy1 Silver badge
      Coat

      Re: whaaaaat, no...

      Huawei are you listening?

      You don't need your kit in the US. Just change the routing table and get all the traffic to come to you.

      Spying at its best -- all controlled from one's laptop. :-)

  8. HildyJ Silver badge
    Devil

    How long before they blame . . .

    "Sources say this was the result of a cyberattack by:"

    1) Iran

    2) Russia

    3) China

    4) Huawei

    Place your bets now. Bonus for predicting when it's announced.

    1. John G Imrie

      Re: How long before they blame . . .

      You missed out

      5) All of the above

    2. This post has been deleted by its author

  9. IGotOut Silver badge

    So....

    If the traffic goes down and happens to be routed to a Chinese ISP, it's state level hacking.

    If a US company reroutes a stack of traffic taking down some of the largest online businesses around, it's a cock up.

    Gotcha.

    1. John Brown (no body) Silver badge

      Re: So....

      Yeah, but not to worry. Verizon say it only affected a few of their own FiOS customers so the rest of the world needn't worry. </sarc>

    2. Psmo Silver badge
      Headmaster

      Re: So....

      It's an irregular verb:

      I made an unavoidable error

      You cocked up

      He/She/It should be extradited

      We are just human dammit

      You are criminals that should be locked up

      They are responsible for state-level hacking

      </apologies_to_yes_minister>

  10. Doctor Syntax Silver badge

    https://xkcd.com/908/

    OK, you all know which it is.

  11. Anonymous Coward

    an "accident"... sure it was...

    Nothing to do with a ruskie plot to run all that traffic through their own dodgy, dangerous Huawei snooping gear, right back to the red communists.

    /s

    1. Anonymous Coward

      No, it was an attempt at making American steel a wanted product: buy our steel, we have your net!

  12. Will Godfrey Silver badge
    FAIL

    Response

    Verizon "take our customers' safety very seriously"

    and "Lessons will be learned"

    Also "Measures have been taken to ensure this cannot happen again"

    Yeah. Right. Oh, and I have some magic beans and a flying pig you might be interested in.

    1. Sir Runcible Spoon Silver badge

      Re: Response

      Unless someone creates a 'project' and associated billing codes/budget to go with it, nothing gets done to fix BAU issues in large companies like this.

      Don't ask me how I know, I couldn't possibly comment.

      1. Alan Brown Silver badge

        Re: Response

        "Unless someone creates a 'project' and associated billing codes/budget to go with it, nothing gets done to fix BAU issues in large companies like this."

        Or in other words, people should start billing Verizon for hijacking their routes - and then they _might_ start paying attention.

    2. thosrtanner
      Coat

      Re: Response

      To be fair, the magic beans worked just fine for young Jack

      1. Anonymous Coward

        Re: Response

        While it may have worked OK for Jack, Golden Geese remain a critically endangered species if they aren't already extinct...

      2. Anonymous Coward

        Re: Response

        > To be fair, the magic beans worked just fine for young Jack

        Jack is a moronically stupid (beans for a cow?), home-invading thief who does not think twice about stealing from a gentle giant; a persecuted minority, carefully minding his own business far away in the clouds, not harming anyone. And then, when the giant has the temerity to try and recover his own property, Jack saves his own skin by killing the giant and then engages in victim blaming by concocting a fairy story in which he is somehow the hero.

        This week on the Jeremy Kyle show, we put Jack face-to-face with the Giant's wife to get her point of view.

        ;-)

        1. Will Godfrey Silver badge
          Thumb Up

          Re: Response

          Excellent!

    3. fidodogbreath Silver badge

      Re: Response

      Verizon "take our customers' safety very seriously"

      They say that about customer privacy, too, with similar results.

  13. Mage Silver badge
    Flame

    Only when not if

    With auto updates and potato-like monocultures of OSes, how long before a couple of Friday evening patch releases take the entire Internet down for a week?

    The problem is outsourcing and OVER RELIANCE on the Cloud.

    Accidents will happen, more likely than a Cyber Pearl Harbour!

    A BGP patch bug is suggested as one of the two reasons for the failure of the Internet in 'No Silver Lining' by Ray McCarthy.

    “If an in-house system fails, only one bank, or one retailer or one supplier is affected,” insisted Louise. “If everything is outsourced to the Cloud, even if it’s a hundred times more reliable it’s an apocalyptically bad event because you lose everything at once. There are too few cloud providers, who are too similar and too big.”

    Once all retail POS, Wholesale re-ordering, services/Mobile billing etc is outsourced to the "Cloud" events like this will be more severe.

    1. Archtech Silver badge

      Re: Only when not if

      I thought it was Facebook that was a PoS...

  14. Kiwi Silver badge
    Pint

    A debt of gratitude is owed...

    "...caused outages at Cloudflare, Facebook, Amazon, and others..."

    Someone is owed one hell of a lot of beers! Now if you could've added Google and Bing, maybe that other bunch of Yahoos...

    Very well done.

    Oh - and those who clean up the messes as quickly and quietly as possible? Yeah we owe you big time as well (though next time this lot goes down maybe you can put the boot in a few times before you help them up? Perhaps 'accidentally' trip while trying to get them on their feet and drop a knee to the nutsack or something?)

  15. Anonymous Coward

    It was the Chinese Government together with the Norks and Iranians trying to Silence the cloudflare customer "TheRegister" from spreading the truth about internet.

    1. tip pc Silver badge

      That is fake news

      1. Nick Ryan Silver badge

        That is fake news

        No, this is fake news.

        1. Loyal Commenter Silver badge

          Not just fake news, but M&S fake news...

    2. Psmo Silver badge
      Holmes

      <koan>

      if the internet is down, is a troll still a troll ?

      </koan>

      1. Nick Ryan Silver badge

        if the internet is down, is a troll still a troll ?

        It depends if the bridges are still up or not.

  16. Anonymous South African Coward Silver badge

    Ah, so my initial thought of misconfigured BGP was both wrong and correct.

    Wrong for thinking it was the [REDACTED] or [REDACTED] trying to slip in some [REDACTED, REDACTED and REDACTED] gear without anybody noticing.

    Correct for misconfigured BGP.

  17. Anonymous Coward

    Facebook

    NOOOOOOOOOOOOOOOO

    a billion voices cried out

    1. Psmo Silver badge
      Mushroom

      Re: Facebook

      and complained on Twitter.

      And nothing of value was lost.

      1. Fred Flintstone Gold badge
        Thumb Up

        Re: Facebook

        And nothing of value was lost.

        Need. more. upvotes..

        :)

    2. hmv Silver badge

      Re: Facebook

      No, they tried to cry out but Facebook was down so nobody heard their screams.

  18. Anonymous Coward
    Facepalm

    Corporate truism...

    That the set of people who know exactly how things are supposed to work, and can keep them working is:

    1. Entirely unrelated to the size of the corporation or the amount of money that corporation has invested in the latest technology and/or processes.

    2. Small.

    3. Undervalued.

    You will notice that there is never any key-man risk in HR or marketing.

  19. ElReg!comments!Pierre

    Oh, that would be why one of our customers had trouble accessing their IBM Cloud VMs and kept bugging us !

    1. A.P. Veening Silver badge

      Now you know where to send the invoice (and forward your customer's).

  20. This post has been deleted by its author

  21. steviebuk Silver badge

    So...

    ...would it be possible (as my network is mega basic) for a 'bad actor' to break into a local, small ISP to cause this. Then take the data dump from the small ISP? To make it all look like an internal fuck up instead of a state sponsored attack?

    Just curious. I have no doubt this is an internal, American cockup, but would it also be possible to do the above? Be a sort of sleight of hand/misdirection type of hack?

    1. Anonymous Coward

      Re: So...

      Surely the ISP's employees would be suspicious if someone mangled the routing and nobody would be responsible. Wasn't even in that day. Haven't seen the router in years, guv, honest!

      It's not as obviously malicious as redirecting Amazon's Cloud DNS and then grabbing the credentials from anyone trying to open their Cloud Crypto Wallet (TheRegister reported).

      The global phone system is not in better shape, either. Anyone with access to the network can redirect calls to almost any number worldwide through their own equipment.

      At least, BGP changes are monitored and logged by several institutions and traffic redirections can be investigated after the fact.

  22. Anonymous Coward
    Trollface

    Obviously...

    I blame Brexit.

    1. Anonymous Coward

      Re: Obviously...

      Well, granted many people are finding themselves badly rooted.

Biting the hand that feeds IT © 1998–2020