Supply chain blamed amid claims of Azure capacity issues

Microsoft's Azure cloud is having difficulty providing enough capacity to meet demand, according to some customers, with certain regions said to be refusing new subscriptions for services. Azure comprises over 200 datacenters globally spread across 60 regions, but reports suggest that over two dozen of these are operating with …

  1. Pascal

    The 80 weeks lead time for some network switch semiconductors is no joke.

    Some lower-end edge switches that we use (and that you could bargain down to $1,500-2,000 per switch pre-covid), we've had to buy refurbished / used for twice the price because we can't even get a 6-month delivery commitment.

    1. Anonymous Coward

      We're being quoted up to a 9-month lead time on networking kit at the moment from tier-1 vendors.

      1. Anonymous Coward

        All those businesses hosted on Azure while operating near capacity... All it will take is one overworked MS engineer to make a single PowerShell oopsie and it will be down for a week. No contact with clients, no payroll, no email, no backups, nothing.

        I sense if that did happen Azure would have a lot of abandoned space in short order.

        I've never had Microsoft be reliable when I needed it to be so.

        1. localzuk Silver badge

          If you are operating your business critical applications in a single region of Azure, without redundancy, that is a design flaw in your setup, not a flaw in Azure.

          1. Anonymous Coward

            "If you are operating your business critical applications in a single region of Azure"

            FTFY

            "If you are operating your business critical applications in any region of Azure, you're a fucking idiot"

            1. Dave Null

              I do hope you're not an IT professional, as that is the dumbest comment I've read in a while. I challenge you to operate at 99.99% availability.

              1. James O'Shea

                Hmm. Most of my servers are at 99.999% availability over the last year. The exceptions would be WinServers; they have to go down every so often because MS insists on a reboot to install certain patches. Non-MS servers stay up all year. One non-MS server has been out of service 9 minutes in the last 18 months. I'm fairly sure that even that one is over five nines availability.

                Why would I want to _decrease_ my availability to please some unknown yahoo on Ye InterTubes?
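The availability figures traded in this sub-thread can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes 18 months is roughly 1.5 x 365 days and counts every minute in the period as scheduled uptime, and the function names are made up for this example. Under those assumptions, 9 minutes of downtime over 18 months works out to about 99.9989%, a little short of five nines, whose budget over the same period is just under 8 minutes.

```python
MINUTES_PER_DAY = 24 * 60


def availability(downtime_minutes, period_days):
    """Fraction of the period the service was up (1.0 means no downtime)."""
    total_minutes = period_days * MINUTES_PER_DAY
    return 1.0 - downtime_minutes / total_minutes


def downtime_budget_minutes(target, period_days):
    """Minutes of downtime allowed over the period for a target such as 0.99999."""
    return (1.0 - target) * period_days * MINUTES_PER_DAY


if __name__ == "__main__":
    eighteen_months = 1.5 * 365  # assumption: 18 months approximated in days

    # The "9 minutes in 18 months" server from the comment above.
    print(f"availability: {availability(9, eighteen_months):.6%}")

    # How much downtime each "nines" level would allow over the same period.
    for label, target in [("four nines", 0.9999),
                          ("five nines", 0.99999),
                          ("six nines", 0.999999)]:
        budget = downtime_budget_minutes(target, eighteen_months)
        print(f"{label}: {budget:.2f} minutes allowed")
```

Whether planned maintenance counts against the budget is a contractual choice, which is one reason quoted "nines" are hard to compare between posters.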

              2. Anonymous Coward

                Well the servers / service I looked after didn't have any outages on prem for over 7 years.

                After migrating them to Azure (not my call), they didn't last 18 months until a major outage.

              3. Anonymous Coward

                It would appear you don't work in the telco sector where 5 9's is normal...

              4. Anonymous Coward

                I operated ISP and telco servers at 99.9999%

                So I laugh at your 99.99%

                (I also designed it to use 1/100th of the hardware Microsoft consultants recommended, and taught clueless Cisco engineers who had no clue how to do networking configs correctly on Cisco gear).

                So I find your comment the dumbest fucking thing I've read this year, and there are a lot of fuckwits trying for it!

            2. localzuk Silver badge

              Really? Many small businesses can't front the capital to buy enough kit to do it all themselves, they often don't have the staffing capability either. Even just having a small number of servers, the switch gear, routing gear, a room to host it (along with cooling) (or colo), critical warranty support, hardware tech specialists (either in house, or on contract) etc... and you're talking tens if not hundreds of thousands of pounds depending on what you need.

              Vs using Azure/AWS/$OtherBigCloudService, getting a supplier to build it out for you and basically leaving it running with MS dealing with all the infrastructure maintenance seamlessly?

              1. Anonymous Coward

                Small businesses shouldn't be needing to buy all that crap, they don't need it.

                You a sales wonk? selling shit they don't need?

                And once they have pissed all that money up the wall on cloud shit, they have no control, no understanding and are now tied to having more money sucked out for fuck all.

    2. Mayday
      Facepalm

      Same here

      I made a proposal (pre sales) for a customer which involved improving redundancy by putting in some dark fibre, adding a couple of interfaces and changing their OSPF topology. It would have worked nicely. It then got handed off to an engineer for implementation.

      Somewhere along the line, the engineers said to the customer, “hmm, your routers might shit themselves with the extra load, you’ll need some new ASRs.” Customer said ok. After a few months, I was brought back to sort out why things weren’t done yet. This is when I, and the bosses, found out that people had “promised” the customer new hardware.

      Turns out ASRs were end of sale, and going end of software support next year (be buggered if I’m putting something internet facing out there without software support) and the replacement hardware, which curiously enough is called Catalyst 8500, has a 9 month lead time. Customer had a drop dead date of 30th June.

      You can imagine the conversations that ensued after this one

    3. 43300 Silver badge

      I've had some Dell switches on order since late January. Several delivery estimates so far - currently October.

      Had to make do with some Mikrotik ones on a temporary basis as that's what we could get hold of at short notice.

  2. elsergiovolador Silver badge

    Selling point

    The cloud's main selling point is that you don't have to pay for and maintain idle infrastructure (well, to an extent - there is no such thing as a free lunch, so the cost of idle machines is split amongst thousands of customers). Now that there is no idle infrastructure, it becomes a sort of on prem that you don't have access to.

    1. Jellied Eel Silver badge

      Re: Selling point

      I think the Cloud's main selling point is obscuring reality. Clouds are just hosted servers, and you become entirely at the mercy of the cloud provider. That becomes apparent whenever there are outages, or incidents like this and the cloud's shortfalls are exposed.

      Kind of a problem for clients that have drunk the Cloud-Aid, based pretty much their entire IT strategy around the concept of 'elastic computing', and the elastic snaps. Before retiring, I'd been doing quite a lot of work to bring cloud clients back in-house as they'd been burned by outages before.

      There isn't really much that 'cloud' providers do that can't be done more reliably in-house. I've also been curious if insurers are changing their views (and premiums) as flaws in clouds become more apparent.

  3. mark l 2 Silver badge

    This is the problem with tying yourself in with one cloud provider: if you want to deploy more or grow instances and they are running at capacity, it's not trivial to switch to another provider once you are locked in.

    1. Rob F

      It really depends on how flexibly you can implement your solution. There are some architectures I have designed and deployed that could agnostically move anywhere. Making networking decisions that allow easy transition of frontend and backend services is one thing, as well as keeping resources as stateless as possible.

      1. elsergiovolador Silver badge

        If you spend time on implementing a cloud-agnostic environment, it's probably worth spending a little bit extra and just using dedicated servers for an order of magnitude lower price.

        Then you can rent out boxes anywhere and many providers have them available instantly.

        Shame that clients are unlikely to pass those savings on to their workers.

      2. Gavin Park Weir
        Mushroom

        I don't think a multi cloud architecture will necessarily help in this scenario. If one cloud reaches capacity, then all the users with a multi cloud architecture will try to re-balance workloads to the next cloud, overwhelming it, and so on. Automation could make this even worse, similar to Black Friday.

        Users really need to keep some on prem capacity to run critical workloads; if this is sized correctly then you get high utilisation and a lower cost than cloud. At crunch time you can choose to keep running your critical apps.

        Of course you can ignore all of the above: the current round of security services, RAS, CDN, Teams, Contact Centre, telco core management, network management tools etc etc etc are all being run in the main clouds (Azure, AWS and Google), so when the clouds evaporate (groan) all of the services will also stop functioning. IT apocalypse
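The re-balancing cascade described in this comment can be pictured with a toy model. The sketch below is not a model of any real provider: the capacities, loads and the fixed spill order are all invented for illustration, and `cascade` is a hypothetical helper name. It only shows how excess demand on one cloud, pushed to the next, can fill that one too and leave some load with nowhere to go.

```python
def cascade(capacity, load):
    """Spill each provider's excess load onto the next provider in the list.

    capacity and load are parallel lists in arbitrary units.
    Returns (load actually placed on each provider, load left unplaced).
    """
    placed = list(load)
    overflow = 0
    for i in range(len(placed)):
        placed[i] += overflow                       # take on the spill from the previous provider
        overflow = max(0, placed[i] - capacity[i])  # whatever does not fit spills onward
        placed[i] -= overflow
    return placed, overflow


if __name__ == "__main__":
    # Invented numbers: three providers with equal capacity, the first one oversubscribed.
    capacity = [100, 100, 100]
    load = [130, 90, 95]
    placed, unplaced = cascade(capacity, load)
    print("placed per provider:", placed)
    print("unplaced load:", unplaced)
```

In this made-up case the second and third providers had headroom on their own, yet both end up full and some demand still cannot be placed, which is the point the comment makes about everyone failing over at once.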

  4. MatthewSt
    Coat

    Serverless

    They really are taking this "serverless" trend a bit too literally...

  5. razorfishsl

    This probably accounts for ALL the problems related to 365 accounts suddenly going "over quota" and not being able to download any more emails, even when only 50% full.

    And why, all of a sudden, "new" folders appear and are counted towards the overall storage, like Microsoft metadata files & folders.

    They must have figured that it is impossible for users to calculate their storage usage exactly, and are stripping it back by bulking it out.

    Or staff suddenly losing 100 GB of storage in their OneDrive and being cut down to 50, putting the account in the red, and MS taking a month to reinstate the storage with no compensation.

    1. Anonymous Coward

      you forgot office359 (it's never been 365) accounts having mail randomly arrive in some mail boxes, 1hr to 2 days after arriving in other mailboxes.

  6. CapeCarl

    VM Hoarding...

    Should I take a couple petrol cans down to the nearest mega-bit-barn and load up with some extra VMs? (just in case)

  7. Anonymous Coward

    No surprise, seeing as, per IDC, Microsoft overtook Amazon in total cloud revenue last year.

  8. Anonymous Coward

    The problem with the cloud, is that you eventually run out of other people's computers....

    1. kat_bg

      Heh? I think they are running out of their own computers... Or cannot build new servers fast enough.

      1. Mike 137 Silver badge

        An inevitable outcome of concentration

        "Cloud" has effectively taken a load that was distributed among a large number of small data centres and server rooms and concentrated it in a few large ones. The result is inevitably a continuous race to expand to meet the demand. I'm not suggesting that cloud services are a bad idea (they can solve many problems), just that they have this inherent weakness (at least until the market saturates).

        1. Roland6 Silver badge

          Re: An inevitable outcome of concentration

          And with software providers such as Microsoft going all cloud and effectively ending all on-prem software, it would seem (created) demand for cloud is going to continue to outstrip supply for another few years...

  9. Surrey Veteran

    It makes sense now. Because we use HPC a lot, the Azure reps were insisting last week on us using "Spot" instances, which is uncharted territory as it gives you 30 seconds to stop the job gracefully before "eviction". Funnier still was when I asked how it works with their own HPC product and they couldn't elaborate.

    What will be interesting: will those who committed to the 3-year discount be able to burn those hours?

    In general it is incredible, and it is not just our industry: look at the airlines selling tickets and then cancelling because of lack of staff. Capitalism destroying itself, big corporations and the cult of middle management.
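For the Spot eviction mentioned in this comment, one common way to cope with a 30-second warning is to check for an eviction notice between units of work and save a checkpoint as soon as one appears. The sketch below is an assumption-laden illustration, not Microsoft's recommended implementation: it assumes the notice arrives as a Scheduled Events style document containing an `Events` list whose entries carry an `EventType` of `Preempt` (field names to be verified against the current Azure documentation), and `run_job`, `poll_events` and `save_checkpoint` are hypothetical names. The polling function is injected so the example runs without any network access.

```python
def preempt_events(doc):
    """Return the events in a Scheduled Events style document that signal eviction.

    Assumption: evictions appear as entries with EventType == "Preempt".
    """
    return [event for event in doc.get("Events", [])
            if event.get("EventType") == "Preempt"]


def run_job(steps, poll_events, save_checkpoint):
    """Run work steps in order, checking for an eviction notice before each one.

    steps           -- list of zero-argument callables, each a small unit of work
    poll_events     -- callable returning the latest events document (a dict)
    save_checkpoint -- callable receiving the number of completed steps
    Returns (completed_steps, finished_normally).
    """
    completed = 0
    for step in steps:
        if preempt_events(poll_events()):
            # Eviction is coming: persist progress and stop cleanly.
            save_checkpoint(completed)
            return completed, False
        step()
        completed += 1
    return completed, True


if __name__ == "__main__":
    # Simulated poll results: two quiet polls, then an eviction notice.
    responses = iter([
        {"Events": []},
        {"Events": []},
        {"Events": [{"EventType": "Preempt"}]},
    ])
    saved = []
    result = run_job([lambda: None] * 5, lambda: next(responses), saved.append)
    print(result, saved)
```

Each step has to be short enough, and the checkpoint fast enough, to fit inside the notice window; for long-running HPC kernels that usually means checkpointing periodically anyway rather than relying on the warning alone.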

  10. midcapwarrior

    No good deed...

    "Microsoft is struggling to balance growing customer demand with the support it is giving to the Ukrainian government, and that it cannot easily expand capacity because of constraints on the supply of IT kit."

  11. Anonymous Coward

    The Cloud...

    Other people's computers you have no control over.

  12. Kevin McMurtrie Silver badge

    Abuse contact

    Microsoft could find some systems to reallocate if they reactivated their abuse contact. Unfortunately, their public IP addresses might have limited value at this point.

  13. Pirate Dave Silver badge
    Pirate

    Yep

    That figures. I'm finally trying to figure out the Azure VM thing after avoiding it for a few years. Signed up for the "Free" trial of $200 credit today with no problem, but it took a good 10 minutes to find a region and server size that would work under the Free trial. Finally settled on like a B1S with 1 CPU and 1 GB of memory for $9/month out of their West region. That's enough to run Rocky Linux, but, eh, I probably won't be compiling a kernel from scratch anytime soon.

    The higher tiers of server sizes got expensive, quick. The step above mine is like $31/mo, then it steps up to the $50's/month before shooting into the stratosphere. I guess for folks wanting to run "business" stuff maybe it makes sense. But for an aging tech nerd who's considering running a single Apache website, it seems kind of expensive. I doubt I'll keep the VM after I run out of the "free" trial.

  14. Flywheel
    WTF?

    Microsoft is struggling to balance growing customer demand

    ... with the support it is giving to the Ukrainian government

    So if the war continues for another 10 years as some have predicted, how is that going to work out?!

  15. mwerneburg

    We're wrapping up a migration (mostly lift and shift, some refactoring) onto Azure and have had the capacity problem come up sporadically since 2020. With a bit of investigation we determined that it's best to keep away from the most modern SKU for any VM, and of course to avoid the oldest. This leaves the latest-but-one as the target. Microsoft agreed with my understanding that the SKUs are tied to certain hardware purchases, so if you're on the latest version they might not have acquired enough hardware. If you're on older SKUs they may already be retiring the hardware. We've set our quotas on cores, we're reserving capacity, and we're running DR gear hot "just in case". Microsoft is putting far more of this on us than I would have expected and my confidence isn't high.

    Microsoft is playing the "we're the victims of our success" card on the support as well - they're too busy to engage on anything above level 1 incident support. To try to work out the strategy above I dealt with no fewer than five organizational units within Microsoft and none wants to take ownership.

    Microsoft is definitely putting its own business ambitions ahead of its customers'.
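The "latest-but-one" rule of thumb from this comment is easy to encode when provisioning scripts pick a VM size. The sketch below is only an illustration of that heuristic: it assumes size names end in a `_v<N>` generation suffix (as in names like `Standard_D4s_v4`), treats names without a suffix as generation 1, and uses the hypothetical helper names `sku_generation` and `latest_but_one`. It says nothing about which sizes actually have capacity in a given region.

```python
import re


def sku_generation(name):
    """Extract the generation number from a trailing "_v<N>" suffix.

    Assumption: names without such a suffix are treated as generation 1.
    """
    match = re.search(r"_v(\d+)$", name)
    return int(match.group(1)) if match else 1


def latest_but_one(skus):
    """Pick the second-newest generation, avoiding both the newest and the oldest."""
    ordered = sorted(skus, key=sku_generation)
    if len(ordered) < 3:
        # With fewer than three candidates, the second-newest is also the oldest.
        raise ValueError("need at least three generations to avoid newest and oldest")
    return ordered[-2]


if __name__ == "__main__":
    candidates = ["Standard_D4s_v5", "Standard_D4s_v3", "Standard_D4s_v4"]
    print(latest_but_one(candidates))
```

The choice would still need to be checked against regional quotas and any reserved capacity, as the comment notes.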

    1. Anonymous Coward

      You make a really good point about DR - lots of companies are relying on automated redeployments of infrastructure; however, that capacity is probably not going to be there day to day, never mind during a failure event in the paired region.
