Microsoft's Azure West Europe region blew away in freak summer storm

A freak summer storm in the Netherlands is being blamed for causing network issues in Microsoft's Azure West Europe region last week, according to a preliminary post-incident review by the company. The weather event, named Storm Poly, has been described as the strongest summer storm in the country's records. It hit the …

  1. Mike 137 Silver badge

    Wot, no redundancy?

    "Azure's West Europe region is outfitted with four independent fiber paths for traffic flows between datacenters. With one severed, a quarter of the network bandwidth between two campuses of West Europe datacenters was unavailable"

    The most basic principle of the internet is (or at least used to be) the ability to reroute traffic in the face of link failure. It seems they're not using internetworking, just point-to-point links, so the result for the user is actually darned fragile.

    1. Anonymous Coward

      Re: Wot, no redundancy?

      You're connecting data centres into an availability zone so latency is key - a point-to-point link is your best option.

      In terms of performance, Azure reportedly provisioned 1.6Pbps between DCs in an availability zone in their 2017 designs (https://azure.microsoft.com/en-us/blog/how-microsoft-builds-its-fast-and-reliable-global-network/). Capacity issues meant they were upgrading this, assuming this was the design used for this region.

      Can you elaborate on how you think this design might have been improved by using routes via the Internet?

      1. john.jones.name
        Mushroom

        Fibre underground is not affected by storms they strung it on poles...

        I would bet they cheaped out and used an aerial fibre run

        Microsoft will have known this and didn't care, thinking the other links would be fine, until bang, they didn't have the capacity to balance a failure

        1.6Pbps is the sum of all links; it's practically nothing if you have a lot of DCs with 100Gbps links

        maybe, just maybe, they should have located "west" in more countries and not just the cheapest bandwidth-wise...

        data and network sovereignty, have you heard of it?

        1. plunet

          Re: Fibre underground is not affected by storms they strung it on poles...

          It would not necessarily need to be an aerial fibre run for bad weather and associated power failures to still cause impacts to fibre below ground.

          1. PRR Silver badge

            Re: Fibre underground is not affected by storms they strung it on poles...

            > Would not necessarily need to be an aerial fibre run for bad weather and associated power failures to still cause impacts to fibre below ground.

            A decade ago, Hurricane Irene flooded lower Vermont. The internet in the next three states northeast went down hard for a couple of days.

            In the aftermath they claimed a splice manhole flooded. And there was a redundant connection but that was never actually connected.

            (Interesting we can run three states on about a quarter of MS Holland's data needs.)

            I didn't know fiber could not stand water but I wasn't there. Maybe there was a repeater/hub in the hole. Maybe muck came in and infiltrated the splice connectors. Maybe the whole story was moose-poop.

            I can see why repairs had to wait for flood to subside. Look at online news pictures of this week's rain-dump in NY and VT. Roads washed out, trees and logs rushing downstream. The VT Governor couldn't drive to work cuz all his roads are flooded (he walked through the woods to reach emergency management HQ). Whether underground or overhead, there's gonna be a lot of drainage and road-fill before wire-work happens.

            And underground service is affected by backhoes/JCBs. Where I worked, the city had two power feeds, but one was cut during slum removal. We worked for several years on very low voltage until they could get their act together and run a second cable. (Somebody who didn't live/work in the city decided not to run a temporary repair but wait for the Long-Term Plan to ripen.)

          2. Avalanche

            Re: Fibre underground is not affected by storms they strung it on poles...

            At least near Amsterdam, a lot of trees were uprooted by the storm; this can easily rip out fiber lines and other utility lines close to a tree.

  2. Anonymous Coward
    Alien

    Freak summer storm in the Netherlands

    Who would ever have thought there would be a random excursion in atmospheric phenomena and that a fiber optic connection could be severed by this?

    > According to the preliminary post-incident review, Azure's West Europe region is outfitted with four independent fiber paths for traffic flows between datacenters. With one severed, a quarter of the network bandwidth between two campuses of West Europe datacenters was unavailable.

    Then the fibre network should have been designed to carry the entire traffic with one link down. Given this carried Azure's entire West Europe traffic, they should have designed at least sixteen independent nodes in a spanning tree configuration that would auto-reconfig in the event of one or more nodes going down. Usability to be maintained down to five nodes.

    > There was apparently a capacity upgrade project already in progress to address this when the incident occurred, Microsoft states.

    Retrospective weasel words.

  3. Doctor Syntax Silver badge

    So the fiber went fubar.

    1. ecofeco Silver badge

      Azuridly.

  4. ecofeco Silver badge

    So that's what happened!

    Was wondering what was going on. Between this and their other cock-up last week, it all makes sense now.

    Our global network was seeing some... latency issues. To put it mildly.

  5. Anonymous Coward

    Never noticed

    We are "hosted" in Western Europe and didn't notice anything (and neither did the customers) so that's good.

    1. plunet

      Re: Never noticed

      But as the article says, traffic to/from the front door of the region wasn't impacted. Only traffic being synced between DCs within the region for availability purposes was impacted. Perhaps you don't have any availability services, or you failed to notice that background syncs of data were failing.

  6. Vince

    Maybe they should just buy capacity in 'the cloud'. As we all know, using the cloud means capacity is never an issue, and you can have more of anything you need anytime....

    1. breakfast

      Only now do they tell us the cloud can be affected if there are too many clouds. Fortunately that's the only way things can go wrong. Except if things get unplugged or switched off and on wrong or there's a problem with a policy update or someone else's cloud goes down and somehow takes a critical piece of your infrastructure with it.

      Honestly sometimes it feels like the cloud is just somebody else's computer.

  7. Anonymous Coward

    Got lost in the rename. s/azure/entra/
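
Several comments above argue about whether four fiber paths with one severed should have been enough. The arithmetic behind that argument can be sketched roughly as follows. This is only an illustration, not Microsoft's actual capacity model: it assumes every path has equal capacity and that traffic spreads evenly over the surviving paths, and the function names and the utilization figures in the usage lines are hypothetical.

```python
def surviving_fraction(total_paths: int, failed_paths: int) -> float:
    """Fraction of aggregate bandwidth left after some paths fail.

    Assumption (not from the incident review): all paths have equal capacity.
    """
    if total_paths <= 0 or not 0 <= failed_paths <= total_paths:
        raise ValueError("need total_paths > 0 and 0 <= failed_paths <= total_paths")
    return (total_paths - failed_paths) / total_paths


def is_congested(utilization: float, total_paths: int, failed_paths: int) -> bool:
    """True if pre-failure utilization (0.0 to 1.0 of aggregate capacity)
    no longer fits on the paths that remain."""
    return utilization > surviving_fraction(total_paths, failed_paths)


# Four paths, one severed: three quarters of the bandwidth remains.
remaining = surviving_fraction(4, 1)

# Hypothetical loads: 70% utilization still fits, 80% does not.
fits_at_70 = not is_congested(0.70, 4, 1)
fits_at_80 = not is_congested(0.80, 4, 1)
```

Under these assumptions, losing one of four paths only causes congestion if the links were already running above 75% of their combined capacity, which is consistent with the review's mention of a capacity upgrade already being in progress.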
