Microsoft admits Azure Resource Manager failed after code change

Any insomniacs, workaholics or those pulling an all-nighter related to a past deadline project may have noted a four-and-a-half hour failure of Azure Resource Manager in Europe this morning following a recent code change. Should this have happened later in the day, admins would have pulled their hair in frustration over the …

  1. DJV Silver badge

    ARM, as Azure customers will know, is a deployment and management service...

    Damn, and there was me thinking it was a CPU invented by Acorn back in the 1980s!

    1. NoneSuch Silver badge
      Trollface

      Re: ARM, as Azure customers will know, is a deployment and management service...

      "This issue was the result of an interaction between a recent code change in West Europe which introduced a subtle performance regression, and a specific internal Azure workload in West Europe which exercised this performance regression in a manner which resulted in significant resource saturation."

      Translation - One of our engineers tried to download the P*rnhub back catalog using a Powershell script.

      1. Fred Daggy Silver badge
        Alien

        Re: ARM, as Azure customers will know, is a deployment and management service...

        I think Sir Humphrey, having shuffled off this mortal coil and been reborn, is writing preliminary outage reports. It's the type of jargon-filled writing he loved, the kind that takes a lot of words to say nothing.

        1. cookieMonster Silver badge

          Re: ARM, as Azure customers will know, is a deployment and management service...

          Or maybe they are using ChatGPT??

      2. David 132 Silver badge
        Coat

        Re: ARM, as Azure customers will know, is a deployment and management service...

        >One of our engineers tried to download the P*rnhub back catalog using a Powershell script.

        Well, that would certainly explain the cock up.

  2. Mike 137 Silver badge

    Second time around?

    First time (the leap year one) a code bug hit the entire world wide service. Now this. Seems like testing needs to be beefed up.

    1. original_rwg
      Thumb Up

      Re: Second time around?

      "Testing needs to be beefed up."

      That made me laugh out loud.

    2. Steve Davies 3 Silver badge

      Re: Test

      Is apparently a four-letter word that is banned in the Microsoft Universe.

      1. Anonymous Coward
        Anonymous Coward

        Re: Test

        It seems to be a dirty word in many places.

        Jira, for example, doesn't support the idea that issues need testing before they can be closed.

        It compiled, ship it!

      2. Excused Boots Bronze badge

        Re: Test

        Now, although I do absolutely get what you are saying, ever so slightly in MS's defence (defense?) here: these systems are now so large and complex that it is arguable whether it is even possible to test this sort of thing prior to rolling it out.

        Maybe that actually is the test: push out the updates, is it all working? If yes, then breathe a sigh of relief and rinse and repeat for the next time - but if it all suddenly breaks catastrophically, then it's a case of 'oh bugger, right, let's roll it all back and try again later!'

        One day, MS will push out an update which has an unexpected domino effect and brings the entire global M365 system crashing down. It won't (probably) be tomorrow, or next month, or next year - but one day it is inevitable. Now let's assume that MS have actually mitigated against this and are in a position to roll back any changes and restore order. But there is a time lag for this: the update is pushed out, it takes a while for the effects to become obvious, what initially looks like a local issue which can be dealt with grows to encompass whole regions and zones, and eventually someone senior at MS with enough clout basically makes the call: 'oh shit, roll it all back, now!'

        And of course, this all takes time, so in the meantime, what is the cost of potential lost business to companies?

        1. Anonymous Coward
          Anonymous Coward

          Re: Test

          "Now, although I do absolutely get what you are saying, ever so slightly in MS's defence (defense?) here: these systems are now so large and complex that it is arguable whether it is even possible to test this sort of thing prior to rolling it out."

          Maybe, but even AWS manages better uptimes than Azure, as does GCP (which appears to be the most reliable of the three). Which is no surprise, considering that Azure's software stack appears to be built completely on sand, using toilet roll cores and chewing gum. And it's not as if Microsoft's on-premises software has any better track record in terms of reliability, so it's unlikely that "too complex to fix" is the issue here (especially when considering that MS regularly fails at fixing problems in its offline software, too).

          It's completely mind-boggling to think that any business would voluntarily base its critical infrastructure on Azure, especially considering the increasingly hefty ransoms Microsoft wants to be paid for the privilege of not caring about uptimes or security of its tenants. But yet here we are.

  3. Pascal Monett Silver badge

    "Gotta love the cloud"

    Well it's not like we have much of a choice, now is it ?

    Every supplier is pushing or dragging us there, and kicking and screaming about it doesn't seem to work much. Eh, Adobe users ?

  4. MatthewSt

    Sic?

    "Between 02.41 UTC and 07.10 UTC on 23 Mar (sic) 2023"

    Is "Mar" no longer the three letter abbreviation for March?

    1. Zippy´s Sausage Factory

      Re: Sic?

      Or a subtle joke about the dual meaning of the word "mar" (meaning "to inflict damage"), maybe?

      1. Anonymous Coward
        Anonymous Coward

        Re: "mar" (meaning "to inflict damage"),

        That brings new meaning to 'Mar a Largo', the home of 'The Former Guy'... The damage that he has inflicted on the world is incalculable.

        1. Anonymous Coward
          Anonymous Coward

          Re: "mar" (meaning "to inflict damage"),

          Mar-a-Lago.

        2. Anonymous Coward
          Anonymous Coward

          Re: "mar" (meaning "to inflict damage"),

          No more than his successor.

  5. Ken Moorhouse Silver badge

    ARM

    There was an impact to the ulnar nerve which caused it.

    1. ecofeco Silver badge

      Re: ARM

      Golf clap.

      Well played.

  6. ecofeco Silver badge

    Well this would explain my problems yesterday

    Was trying to join some users to the system and kept getting weird error messages.

    I'll bet this was the cause.

    Stay classy, Microsoft. /s
