What a pain in the Azzz-ure: Microsoft Azure, SharePoint, etc knocked offline by DNS blunder

For at least the past hour or two, Microsoft's Azure cloud has been up and down globally due to a DNS configuration mishap. The platform-wide outage has knackered all sorts of Redmond-hosted systems around the world, from Azure SQL databases and App Services to multi-factor authentication, Microsoft 365 and Teams, Dynamics, …

  1. Mattknz1
    Mushroom

    Woohooo!

    "Team building" day with the BBQ and the Xbox it is.

    1. TXITMAN

      Re: Woohooo!

      XBOX on-line is down too.

  2. Michael Hoffmann Silver badge
    Trollface

    This will be fun...

    Heading to the train in 10 minutes. I always ride with a manager from a company that went full-on Azure ("we want the stability of Microsoft" he always says, "we have no truck with some company that started out flogging books" he always says).

    The morning coffee will taste extra delicious this morning!

    1. N2

      Re: This will be fun...

      "We want the stability of Microsoft"

      That made me smile.

  3. Blockchain commentard

    DNS - don't (k)no(w) stuff - like how reliable it needs to be!!!

  4. Moeluk

    Thanks for scaring the crap out of me on a Thursday evening, Microsoft!

  5. DaleWV

    And they want us to put business their way. Over three years with AWS and just one outage. Less than 6 months with Azure and.....

    How can they seriously expect anybody who cares about uptime to recommend them?

    1. StaudN

      Hmmm, spot the Amazon employee. (They're currently experiencing issues too https://downdetector.com/status/aws-amazon-web-services )

      1. MatthewSt

        Maybe they host their cloud on Azure...

      2. Claptrap314 Silver badge

        Hmmm, spot the u$ shill... Down detector shows 1/10th of the number of complaints, with a rise & peak trailing those for Azure. Can you say splash damage?

        Seriously, I don't consider either of these comments to have any value.

    2. Anonymous Coward

      Ah AWS.

      Isn't that the service that makes it so easy to get up and going but costs you at least two legs and one arm to get all that lovely data out?

      Azure isn't much better with its record amount of TITSUP in the past year.

      Yet PHBs the world over are still falling for this cloudy 'snake oil'... {shakes head in disbelief}

      Suddenly on premises starts to look a whole lot better. At least there, you are in control (local JCB Operators permitting naturally)

    3. TeeCee Gold badge
      Meh

      Hmm, wasn't it the Amazon one where they were eventually forced to turn the whole thing off in a region (Western USA????) to stop their self-replicating fuckup taking over teh hole wurld....?

      Of course, that was after their poor bloody customers had spent the thick end of two days living with it limping toward its final death....

      Cockups are a given. It's how you deal with one that's key.

  6. Claptrap314 Silver badge

    So, their status continues to be complete BS. We had customer complaints validated by our L1 & referred to our team by 1931 UTC. 1943 UTC indeed.

  7. Mr Sceptical
    Facepalm

    So when,

    Will this feature in an episode of Who, Me?

    1. Olivier2553

      Re: So when,

      That would require MS to own up to their blunder...

  8. John70
    Joke

    Cloud Service - OaaS

    Are they trying out a new cloud service? Offline as a Service

  9. Hans 1
    Joke

    Note that we are unsure if this really was a DNS issue, you know, because a Window Cleaner and Surface Expert will always claim a network outage is a DNS problem. Why? Because IE tells him so!

    1. phuzz Silver badge
      Gimp

      To be fair, problems in Windows networks always seem to turn out to be caused by DNS errors.

  10. Zippy´s Sausage Factory
    Devil

    The real surprise to me here is that SharePoint is still a thing and people still use it. I thought it was about as popular as Lotus Notes these days...

    1. johnnyblaze

      A lot of companies have SharePoint, because MS told them it was going to be the next big thing. It wasn't, and companies paid loads for it. It's a pig to develop for too, isn't user friendly and is way too complex for what it actually does. There are better alternatives for a lot less money.

      1. MJB7

        Re: A lot of companies have Sharepoint

        The interesting thing is that all of your explanation fits Notes too (except you need to replace "MS" with "Lotus and then IBM").

  11. steviebuk Silver badge

    So that's why mail

    Was titting about earlier and yet their "health" status page said nothing was wrong.

  12. Erik4872

    Ouch

    All these PaaS and SaaS services, including the ones Microsoft uses to run Azure, run on real machines _somewhere_ and are subject to real-world on-prem style failures. Something foundational like DNS is horrible to lose because you basically have no way to get to anything to even start fixing the problem. Developers are used to all their abstractions and basically don't have any idea what to do when their call to a hostname fails. Azure AD would be another one...imagine not being able to even log in to systems to start troubleshooting without using some sort of emergency break-glass kind of access.

    I'm not interested in 100 hour weeks, but if that weren't a requirement I'd love to work for one of the cloud providers. The systems they have in place to keep that massive tower of abstraction running must be amazing. But yeah, if you lose DNS your best bet is to get it back immediately.
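As a rough illustration of the "call to a hostname fails" case above, here is a minimal Python sketch, assuming the application keeps its own record of last-known-good addresses. The function name `resolve_with_fallback`, the `last_known` dict and the injectable `resolver` parameter are all hypothetical, and falling back to stale addresses is only one possible policy, not how Azure or any particular service handles it.

```python
import socket


def resolve_with_fallback(hostname, port, last_known, resolver=socket.getaddrinfo):
    """Resolve hostname; if DNS fails, fall back to a locally cached address list.

    last_known is a hypothetical dict mapping hostname -> list of IP strings
    recorded the last time resolution worked.
    Returns (addresses, source) where source is "dns" or "cache".
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        addresses = sorted({info[4][0] for info in infos})
        if addresses:
            return addresses, "dns"
    except socket.gaierror:
        # Resolution failed (for example, the zone is unreachable).
        pass

    cached = last_known.get(hostname, [])
    if cached:
        return list(cached), "cache"
    raise LookupError(f"cannot resolve {hostname} and no cached address is available")
```

The cached addresses may be stale, so this only buys time to reach a system and start troubleshooting; it is not a substitute for getting DNS back.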

  13. David Austin

    If in Doubt, blame DNS

    I find that a good general troubleshooting tip: for such a simple thing, it has so many varied and exciting ways to bugger up your network.

    It's OK; I'm sure going to IPv6 where the recommendation is to always use DNS instead of Direct IP calls will make this mess a whole lot easier...

    1. N2

      Re: If in Doubt, blame DNS

      Oh shit, you found me out, that was my excuse too!

      And of course IPv6 will solve all your woes...

  14. The Oncoming Scorn Silver badge
    Pirate

    and anything could happen in the next half hour, as they say.

    Thank you for putting the bongos back in my head...

    https://www.youtube.com/watch?v=E06cNv55jTs

  15. Ken Moorhouse Silver badge

    the availability of Azure DNS remained at 100% throughout the incident

    Isn't that like saying the Underground is running a full service, but all of the stations are closed?

  16. Nick Kew

    Outlook.com?

    Could that account for my inability to send email to an NHS address on April 29th? Or indeed the message that took about 24 hours to reach me from a friend on May 2nd/3rd, with most of that time spent on outlook.com servers?

    The bounce from the NHS mail contained full diagnostic information. It was in a mail loop at outlook.com:

    Received: from AM6PR10CA0087.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:209:8c::28) by AM5SPR01MB03.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:206:1b::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1835.13; Mon, 29 Apr 2019 14:44:48 +0000

    Received: from HE1EUR02FT049.eop-EUR02.prod.protection.outlook.com (2a01:111:f400:7e05::207) by AM6PR10CA0087.outlook.office365.com (2603:10a6:209:8c::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1835.12 via Frontend Transport; Mon, 29 Apr 2019 14:44:48 +0000

    Received: from EUR04-HE1-obe.outbound.protection.outlook.com (104.47.13.51) by HE1EUR02FT049.mail.protection.outlook.com (10.152.11.8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1835.13 via Frontend Transport; Mon, 29 Apr 2019 14:44:48 +0000

    Received: from DB6PR1001CA0034.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:4:55::20) by VE1PR10MB2879.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:803:10f::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1835.15; Mon, 29 Apr 2019 14:44:46 +0000

    Received: from AM5EUR02FT030.eop-EUR02.prod.protection.outlook.com (2a01:111:f400:7e1e::209) by DB6PR1001CA0034.outlook.office365.com (2603:10a6:4:55::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1835.12 via Frontend Transport; Mon, 29 Apr 2019 14:44:46 +0000

    Received: from EUR03-VE1-obe.outbound.protection.outlook.com (104.47.9.54) by AM5EUR02FT030.mail.protection.outlook.com (10.152.8.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1835.13 via Frontend Transport; Mon, 29 Apr 2019 14:44:46 +0000

    Received: from AM6PR10CA0049.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:209:80::26) by HE1PR10MB1548.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:7:5d::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1835.13; Mon, 29 Apr 2019 14:44:44 +0000

    Received: from VE1EUR02FT014.eop-EUR02.prod.protection.outlook.com (2a01:111:f400:7e06::208) by AM6PR10CA0049.outlook.office365.com (2603:10a6:209:80::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1835.12 via Frontend Transport; Mon, 29 Apr 2019 14:44:44 +0000
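For anyone wanting to eyeball a trace like the one above, here is a small Python sketch that pulls the from/by pair out of each Received line and counts how often the same sending domain recurs. The regex, the function names and the choice of comparing the last three DNS labels are assumptions for illustration only; as I understand it, the loop detection described in RFC 5321 simply counts Received headers against a hop limit.

```python
import re
from collections import Counter

# Matches "Received: from <sender> (<address>) by <receiver>" as it appears in the trace.
HOP_RE = re.compile(r"Received:\s+from\s+(\S+)\s+\([^)]*\)\s+by\s+(\S+)")

# Two hops copied from the bounce above (the trailing TLS details are omitted).
SAMPLE_TRACE = """\
Received: from EUR04-HE1-obe.outbound.protection.outlook.com (104.47.13.51) by HE1EUR02FT049.mail.protection.outlook.com (10.152.11.8) with Microsoft SMTP Server
Received: from EUR03-VE1-obe.outbound.protection.outlook.com (104.47.9.54) by AM5EUR02FT030.mail.protection.outlook.com (10.152.8.180) with Microsoft SMTP Server
"""


def parse_hops(header_text):
    """Return a list of (sender, receiver) hostnames, lowercased, in header order."""
    return [(m.group(1).lower(), m.group(2).lower()) for m in HOP_RE.finditer(header_text)]


def repeated_sender_domains(hops, depth=3):
    """Return {domain: count} for sending domains seen more than once.

    The domain is taken as the last `depth` labels of the sender hostname,
    which is a crude heuristic rather than a proper public-suffix lookup.
    """
    counts = Counter(".".join(sender.split(".")[-depth:]) for sender, _ in hops)
    return {domain: n for domain, n in counts.items() if n > 1}
```

Calling `repeated_sender_domains(parse_hops(SAMPLE_TRACE))` flags the outbound relay domain that shows up on more than one hop, which is the pattern visible in the full trace.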

    1. Ken Moorhouse Silver badge

      Re: Outlook.com?

      They were testing their AI algorithms and thought Kew meant Queue, and therefore repeatedly added the message to the end of its processing queue.

      1. Nick Kew
        Facepalm

        Re: Outlook.com?

        Pfft.

        So does it get confused whether to route your mail to somewhere on our canals or to Wuthering Heights?
