Microsoft's Azure goes TITSUP PLANET-WIDE AGAIN in cloud FAIL

Microsoft suffered a major outage on its cloud service Azure overnight. It went titsup just before 1am UK time, and services are only now slowly returning to normal. Microsoft confessed that multiple regions were affected by the "service interruption". Customers around the world using Azure storage, virtual machines, a …

  1. Steve Davies 3 Silver badge

    Strangely Poetic

    Oh, where have you been, my blue-eyed son?

    And where have you been my darling young one?

    I've stumbled on the side of twelve misty mountains

    I've walked and I've crawled on six crooked highways

    I've stepped in the middle of seven sad forests

    I've been out in front of a dozen dead oceans

    I've been ten thousand miles in the mouth of a graveyard

    And it's a hard, it's a hard, it's a hard, and it's a hard

    It's a hard rain's a-gonna fall.

    Kudos, Mr Dylan

    {MS Blue... get it}

    1. Bob Vistakin
      Facepalm

      Re: Strangely Poetic

      Odd .. it's not a leap year so I wonder what trivial little feature a watch you get free with cornflakes beats it on this time?

      1. bdam
        Unhappy

        Re: Strangely Poetic

        When the watch stops it's your fault for not winding it up. When Azure stops it's your fault for relying on Microsoft.

    2. chivo243 Silver badge
      Pint

      Re: Strangely Poetic

      I prefer Leon Russell's version, but have an upvote anyway!

  2. Anonymous Coward
    Anonymous Coward

    And this is the reason we keep telling the bean counters NOT to use cloud-based services, be it MS, Google, Apple or any other! They all go TITSUP.

    1. TheWeddingPhotographer

      And instead you reccomend

      Being devil's advocate here... What else do you recommend?

      For example, say you are setting up a reasonably sized website that needs to be scalable to suit the business need, not break the bank, and yet can grow and contract resources to suit the demand?

      1. Anonymous Coward
        Anonymous Coward

        Re: And instead you reccomend

        Fair comment, so I'll elaborate a bit. Why would we want to have our email hosted, host VMs and maybe AD, and use, say, Office365 when there (for us) isn't much of a saving? If we had done that, today we would have mainly been sitting around with our thumbs up our ass.

        1. TheWeddingPhotographer

          Re: And instead you reccomend

          I see that, but my question was more pointy...

          One day last year, we had more traffic to a site than we did over the previous month. Cloud servers let us add more servers behind a loadbalancer, autoscale etc.

          If we paid to have that level of server capacity 24/7, 365 days a year, we would make a loss. By adopting a hybrid approach, when it gets busy we can scale up and sell (instead of crash) and then scale back down again as traffic drops.

          Back in the day when a "dedicated server" was all you could get, you had a choice - pay for lots of redundancy, or come to a grinding halt when the server gets too busy. At least now there is a sensible middle ground.

          Not all applications are the same, and for corporate email, which is fairly steady, Exchange Server on a fixed-size box makes sense. It was the global "all cloud is bad" point I was challenging.

          1. Infernoz Bronze badge

            Re: And instead you reccomend

            That's fine provided you still have your own servers and use the cloud as a complementary scalable resource, not all the resource, because all-in can be all-fail!

      2. Anonymous Coward
        Facepalm

        Re: And instead you reccomend

        … spelling lessons. RECOMMEND

      3. P. Lee

        Re: And instead you recommend

        The proper way would be load-balancing.

        Public-facing web systems are a little unique - most systems do not have very variable usage rates - but let's go with that.

        Assuming you are using virtualisation (and you trust it!) you spread the load thinly across lots of servers, then a spike in demand for one application hits lots of servers a little.

        One problem comes when you haven't got free software and you have to pay for all those instances. Another problem comes when you try to buy two huge stonking servers which have expensive engineering and little excess capacity. Go cheap and simple on the servers and get a couple of decent load-balancers. Hardware is relatively cheap; it's proprietary licensing which normally kills this model. I'd rather invest in expensive *nix admins and extra hardware than software licenses.

        What you really want to do is to cloud burst. You run your normal maximum capacity in-house and use the cloud only when you get an unexpected spike.

        Apart from that, the standard disclaimer applies - you will pay for what you use one way or another. I'd keep mission-critical things in-house where I control the risk/reward trade-offs.
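
        The cloud-burst idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the requests-per-second unit and the per-instance capacity figure are all hypothetical and not taken from any provider's API.

```python
import math


def split_load(demand_rps: float, in_house_capacity_rps: float) -> tuple[float, float]:
    """Split demand into (in_house, cloud) shares.

    In-house servers absorb everything up to their capacity;
    only the overflow is sent to rented cloud capacity.
    """
    if demand_rps < 0 or in_house_capacity_rps < 0:
        raise ValueError("demand and capacity must be non-negative")
    in_house = min(demand_rps, in_house_capacity_rps)
    cloud = demand_rps - in_house
    return in_house, cloud


def cloud_instances_needed(
    demand_rps: float,
    in_house_capacity_rps: float,
    per_instance_rps: float,  # hypothetical throughput of one cloud instance
) -> int:
    """Number of cloud instances required to carry the overflow."""
    if per_instance_rps <= 0:
        raise ValueError("per-instance capacity must be positive")
    _, overflow = split_load(demand_rps, in_house_capacity_rps)
    return math.ceil(overflow / per_instance_rps)


if __name__ == "__main__":
    # Normal day: everything stays in-house, no cloud spend.
    print(split_load(300, 1000))
    # Unexpected spike: the overflow bursts to the cloud.
    print(split_load(1500, 1000), cloud_instances_needed(1500, 1000, 200))
```

        On a normal day the cloud share is zero, so you pay nothing extra; during a spike only the overflow is rented, which is the trade-off being argued for above.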

    2. Anonymous Coward
      Anonymous Coward

      Cloud or not it's the same everywhere

      So outages happen only in the cloud? Tell that to the thousands of Fasthosts customers who were down for 6/7 hours this week, or to my customer whose on-premise server went south at the weekend and is still not back to 100%.

    3. Just Enough
      WTF?

      All services can go TITSUP, whether they be in the cloud or on your own little server in the cupboard. The suggestion that cloud services are alone in this, and therefore form a reason for not using them, is completely bizarre.

    4. The Godfather

      You mean in-house stuff never fails?

      1. Destroy All Monsters Silver badge

        In-house stuff fails too, and then you don't get the skills to get it back up again, or worse. You know "database corrupted, where is the backup ZOMG it says 'Monday' on that tape, wait the amount of dust on this shelf says that..... NOOOOOO!"

        Clouds mean everyone gets burnt at the same time and the Internet must work and you also better have a local backup just in case.

        But in all cases, it's just choose your poison.

    5. Greg Fawcett

      Depends on the cloud. We've used Appengine for more and more of our business over five years, and it's been more reliable than our own servers for the last three.

  3. Anonymous Coward
    Anonymous Coward

    I worry that there's something wrong at a fundamental level with Azure.

    Lots of cloud and hosting providers have problems. Datacentres go down, kit breaks, cascading problems make medium sized issues harder and longer to fix than they should be.

    With good providers these things are rare, and they don't happen quite the same way twice. With bad (or just cheap) providers they do, but some clients find the cost/benefit maths worth it.

    But Azure has now twice had worldwide outages. That's not something I've ever seen at any other provider ever. It cuts off the very idea of building distributed applications on Azure at the knees. Why do that when the most common outages are global in scale?

    1. TheWeddingPhotographer

      Agree

      You are right, it really doesn't make sense. It ought to be resilient in its nature. It feels like they have spread the chocolate very thinly on the cake.

      1. This post has been deleted by its author

      2. Anonymous Coward
        Anonymous Coward

        Re: Agree

        I'm not even sure it's a 'thin' thing. I remember an AWS outage that affected multiple availability zones for some clients because when one went down there wasn't enough elastic beanstalk capacity to make the failover do what the clients had programmed it to. That's a 'thin' thing and one hopes they've fixed it (though with the race to cut margins ever finer I'd think capacity planning remains a fraught issue for IaaS providers, and I wouldn't be surprised to see these things pop up again). But the nature of the fault doesn't impact the basic logic of building a redundant architecture (though it may impact the logic of using AWS for the failover).

        With Azure it seems to be different. I remember the first outage came up because an expired security certificate put the entire thing offline. That confirms for absolutely sure that at the time there were single points of failure (and stupid ones) for Azure globally with no real separation of systems. Quite how anyone let that happen is beyond me. If the new fault proves to have a single global root cause too then why should anyone trust that there aren't lots of others?

    2. yossarianuk

      something wrong at a fundamental level with Azure.

      You know who created it ?

      1. Destroy All Monsters Silver badge

        Re: something wrong at a fundamental level with Azure.

        The Devil?

      2. channel extended

        Re: something wrong at a fundamental level with Azure.

        Bob?

    3. Anonymous Coward
      Anonymous Coward

      "I worry that there's something wrong at a fundamental level with Azure."

      There is. It runs on Windows.

      1. Phil_Evans

        And another, it's owned by Microsoft. The company with enough Cnutish (for it is he) bullshit hubris to throw the stuff together 'as a service'. In case there's any confusion, 'service' can be quantified as follows:

        1) Mobile telecoms "fuck 'em! Oh and get them to pay, too"

        2) Banking "fu...etc"

        3) Software...ad nauseam, etc.

        'Service' means 'that which you will accept for payment', not 'value for money'

        Bring back OS/2

      2. Anonymous Coward
        Anonymous Coward

        "There is. It runs on Windows."

        No, Azure runs on a version of Hyper-V Server. Which is one of the best scaling and most secure hypervisors on the market - and does not include or use Windows! It is also a free download from MS and is fully featured with no limitations.

        1. dannypoo

          So you are trying to suggest that all the software running the management of the cloud (which was the thing that went titsup) runs on something other than Windows?

          Of course there is a remote possibility that every single hypervisor on every single server across their global estate failed at the same time. However I'd be more inclined to blame the software infrastructure (which, being Microsoft, will almost certainly run on Windows).

  4. PghMike

    planet wide Azure outages

    I figure either they have some serious network routing problems affecting their private inter-city links, or they've designed something idiotically. Or both.

    It is surprising, however. After all, large intercity routers must misbehave all the time, and the currently deployed routing protocols must have some way of avoiding bad routers (even Byzantinely bad routers that claim to be working). So, it is somewhat of a mystery how MSFT manages to get these gigantic failures.

    1. Pascal Monett Silver badge

      It's MSFT. That's what they do. Outages. They've been sharpening their skills on the desktop for decades, now they're taking it to the next level.

      Soon there WILL be a Ctrl-Alt-Del for the Internet.

      As soon as Azure Active Directory has succeeded in replacing all other DNS solutions.

      Your distinctiveness will be mercilessly erased and replaced by our worldview...

  5. Slacker@work

    What outage?

    30k users located globally with someone in each time zone, full Exchange Online, Office 365, SharePoint Online, Yammer environment - not a single issue raised.

    Not noticed a bloody thing...

    1. yowl00

      Re: What outage?

      According to the Azure Status page it was just HDInsight that had a multiple region incident, so unless you are using that... Bit of a NOTW headline this.

  6. Elmer Phud

    It's the weather

    I'm sure we were told there would be a break in the cloud overnight.

  7. breakfast
    Mushroom

    Status: Working perfectly

    The important thing is that the Azure status page informs us that "Everything is running great" and as far as I can tell that message hasn't changed at any point.

    It comes across as a touch insincere, even just a tiny bit like they are taking the piss, when that is followed by a long list of problems, failures and outages.

    Also I don't know how everyone else is getting on, but there still seem to be a lot of problems with it here.

  8. batfastad

    I wouldn't be surprised if they are actually using their own software to power their cloud. Would MS be that insane? A gigantic AD/Group Policy/DNS/Exchange infrastructure? What could go wrong!

    They should probably think about setting up isolated availability zones.

  9. Ramon Zarat

    Not even able to hit the 99.999% "Five Nines" annual uptime, a lowly standard of the 90s.... LMAO!
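
    For a sense of scale, here is a minimal sketch of the "five nines" arithmetic. It assumes a 365-day year and takes the 10 hours of downtime that one commenter reports further down the thread as the example figure; the function names are purely illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Downtime budget per year, in minutes, for a given availability target."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR * 60


def achieved_availability_pct(downtime_hours: float) -> float:
    """Annual availability implied by a given number of hours of downtime."""
    return (1 - downtime_hours / HOURS_PER_YEAR) * 100


if __name__ == "__main__":
    # Five nines leaves a budget of roughly five minutes a year.
    print(f"{allowed_downtime_minutes(99.999):.2f} minutes allowed per year")
    # A single 10-hour outage, with no other downtime all year.
    print(f"{achieved_availability_pct(10):.3f}% availability")
```

    Five nines leaves a budget of about 5.3 minutes a year, so a single 10-hour outage already caps the year at roughly 99.89%, which is not even three nines.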

  10. SVV

    Seen the "Microsoft Cloud" adverts on the telly yet?

    They are playing them fairly frequently, and seem to show lots of grinning bearded hip young folk looking at screens containing code scrolling upwards at speeds far too fast to read.....

    Rather than IT managers receiving angry phone calls from senior executives informing them that if all their systems go down for hours again, losing their company a fortune, they'll be sacked immediately.

    Microsoft Cloud : it never rains, it's poor

  11. Anonymous Coward
    Anonymous Coward

    Or maybe they were patching themselves??

    :)

    http://www.theregister.co.uk/2014/11/18/youll_most_definitely_believe_what_microsoft_did_today/

  12. knott1701

    Our test environment VMs with many versions of the software we resell, not a twitch.

    Our Geo-replicated load balanced auto resizing production VMs? 10 hours of down time.

    Still wasn't long enough to work out how the Amazon AWS pricing model works.

  13. K

    LOL

    http://www.zdnet.com/2014-the-year-the-cloud-killed-the-datacenter-7000035676/

  14. Barnie

    When Novell was funny

    Things haven't really changed; this sums it up quite well:

    Flying Boy Novell

  15. Levente Szileszky

    I'm saying this for years...

    ...that anyone willing to put his critical services on MS-run cloud services must be an adrenaline-junkie.

    I think there's something fundamentally wrong in MS' system design that they keep suffering from 'PLANET-WIDE' outages, year in, year out, for years now. Doesn't matter what the bug is or how it cascades into any bigger clusterfuck, it should NEVER be able to bring their whole cloud to its knees. I've never seen anything like that happening to our Google- or Rackspace-based services (and as far as I recall AWS also never had anything beyond region-level disruption).

    1. kingwahwah

      Re: I'm saying this for years...

      Agreed. Fundamental design problem with Azure...they have to use MS products (...designed years ago). They also outsource the running of it I believe.

      Knock me if you will but we are moving to Google Business Apps. They've built from the ground up and avoided proprietary. www.google.com is reliable and powerful. They looked at Cisco/HP and thought why do we need all that functionality, licence complications and cost, so they built their own switching firmware based on white box hardware (*cough* shipped by the pallet direct from China).

      I switched to Gmail from Outlook a year ago. I look at Outlook in horror now when I'm passing a colleague. I admit Google's UI designs can be odd and I worry someone with a beard will make more silly... sorry, cool... decisions, but overall I can always move on. Not risking much at £33pa per user. Switch on, switch off, as Mr Miyagi says.

  16. Mikel

    Cloud down

    Fog.

    1. Destroy All Monsters Silver badge

      Re: Cloud down

      More like "Bog"

    2. breakfast

      Re: Cloud down

      A truly azure sky shows no trace of any cloud.

      Apparently the same is true of a truly Azure datacentre.

      1. chivo243 Silver badge
        Coat

        Re: Cloud down

        so, no clouds on a sunny day? Pray for rain it is!

        1. breakfast

          Re: Cloud down

          Just remembered the perfect phrase for an Azure Outage that I saw from El Reg last time this happened:

          Blue Sky Of Death.

  17. Anonymous Coward
    Anonymous Coward

    Why would any business want to use Azure anyway?

    I've seen some Microsoft ads on the telly (where they mispronounce Azure) and bang on about it being something to do with some Xbox game.

  18. vee Hybrid

    VMware Cloud = Superior

    Oh dear, not again. Bean counters go cheap.

    Pay more, go VMware and survive.

  19. tempemeaty

    It seems Azure didn't pass out from holding its breath.

    It was smooshed by an update. A performance update to be more precise. Well at least it's back up again.

    1. Pascal Monett Silver badge
      Coat

      Up again, and with increased performance then ?

      So it'll crash in six weeks instead of three months now ?

  20. PaulusTheGrey

    Whether the weather is to blame for the storm or not.....

    Any meteorologist will tell you that clouds evaporate.

  21. Alistair
    Coat

    if the cloud is down.

    Isn't it kinda foggy around here?

  22. Stuart Castle Silver badge
    Thumb Up

    Loving the sponsored link at the bottom..

    " Featured Webinar: Active Directory integration and extending on-premises identities to the cloud", at the bottom of an article about a lot of people losing Active Directory access due to their cloud provider going tits up.. Well done El Reg!
