Microsoft punches back at Delta Air Lines and its legal threats

Microsoft has labelled Delta Air Lines' accusations it's partly to blame for the outages caused by CrowdStrike’s buggy software "false" and "misleading" – and insulted the state of the carrier’s IT infrastructure. Delta, which has hired a law firm and threatened to sue Microsoft and CrowdStrike over the July 19 meltdown, …

  1. An_Old_Dog Silver badge
    Trollface

    "Upgraded"?

    Just wondering if MS' definition of "hasn't upgraded their IT infrastructure" is, "hasn't bought more MS products, and is still running green-screen apps via terminal emulation over SSH".

    If Delta was doing that, it would explain why their non-disaster-mode customer service was faster than their competitors'.

    (/me glances at computer room bookshelf, which contains a well-used copy of, "CICS on z/OS".)

    1. Anonymous Coward

      Re: "Upgraded"?

      The green-screen stuff is reached today through terminal emulators running on Windows. I'm guessing it was mostly irrelevant to Delta's clusterfuck of a DR/BCP process.

    2. abend0c4 Silver badge

      Re: "Upgraded"?

      a well-used copy of, "CICS on z/OS"

      Employment for life, should you want it!

    3. Doctor Syntax Silver badge

      Re: "Upgraded"?

      For some of us "hasn't upgraded their IT infrastructure" would mean "still running Windows".

      As to not accepting help from Microsoft and/or CrowdStrike: the two of them had provided instructions on how to fix things. No doubt everyone at Delta who knew how to do that was busily involved in doing it. Taking time off to show the cavalry around - once the cavalry had been security cleared - would just take them away from doing it. It's the classic Brooks's law thing: adding more hands makes things slower.

    4. Michael Wojcik Silver badge

      Re: "Upgraded"?

      CICS via SSH? More likely TN3270 (hopefully through TLS), or SNA LU2 (or LU3 for a 3270 printer), or one of SNA's predecessors such as BSC.

      But as someone remarked down below, even if Delta are using CICS, or IMS, or any other mainframe environment, or ssh to some character-mode UNIX or Linux app, or what have you, there's a good chance emulators were running on Windows. I suspect what Microsoft is alleging is more that Delta didn't have a good DR plan for their Windows systems, wherever those may have been and whatever they may have been doing. Which seems pretty likely to me, though I think calling that "upgrading" is an abuse of the term.

      And, yeah, still plenty of CICS in use. It's one of the things that puts money in my pocket, though I'm on the implementing side rather than the consuming.

      1. TheWeetabix Bronze badge

        Re: "Upgraded"?

        As I posted below, I happen to know that "system" contained a venerable 486.

  2. Potemkine! Silver badge

    Attack is the best form of defence isn't it?

    Anyway, lawyers are rubbing their hands.

    1. OhForF' Silver badge

      If CrowdStrike and/or Micros~1 are held liable, demonstrating that the damage could have been mitigated by Delta accepting their help should reduce what they have to pay. Is Nadella already trying to limit the damage in case Delta prevails in court?

      I'd love a precedent showing software providers are liable for outages caused by their automatic patching without proper previous testing.

      1. Headley_Grange Silver badge

        "I'd love a precedent..."

        I think a precedent making OS suppliers liable for consequential damages would be better for Microsoft than for commercial users. For a start Microsoft might refuse to supply you unless your environment (dev processes, HW, FW, SW, interoperability, tools, processes, training, etc) was certified by MS as compatible with their SW. Ditto upgrades. It would be a magical money tree for them.

        1. Secon

          True - but they'd also need to ask all those companies who have moved their OOS legacy IT in a 'lift and shift' model on to Azure to vacate the premises...

          I think the bigger picture here is that Microsoft can't afford for a precedent to be created in Court where a Cloud Provider (either a Hyperscaler like Microsoft or someone sitting on the Hyperscaler and selling services like CrowdStrike) can be held liable for damages to a customer arising from a loss of service, which occurs for whatever reason.

          If a court awards any level of damages to a customer because of losses arising from a Cloud outage, Microsoft lose the 'pseudo-protection' of their heavily caveated Terms of Service, which make the customer responsible for their internal IT not being able to support their business when Microsoft's Cloud goes down, AND for the p*** poor decision to use their shonky cloud platform in the first place.

          As soon as a Cloud outage has been formally recognised by a Court as being the Cloud provider's responsibility to compensate for, Microsoft will face millions of claims every time their platform goes down.

          Given the frequency of those outages of late, they can't afford a court making that decision - so they attack Delta hard 'pour encourager les autres'...

  3. ComputerSays_noAbsolutelyNo Silver badge

    Is this normal?

    "Since 2016, Delta has invested billions of dollars in IT capital expenditures, in addition to the billions spent annually in IT operating costs."

    I am a small, simple bod, but I would like to know from some good folks with management experience. Is this ratio between cap-ex and op-ex normal?

    Billions since 2016, which is 8 years, so possibly a cap-ex of less than a billion per year; along with billions per year in op-ex.

    To me, the op-ex seems rather high; but what do I know ...

    1. MatthewSt Silver badge

      Re: Is this normal?

      Also, isn't measuring things by cost a bit like measuring planes by weight, software by lines of code etc?

      Yes you've spent billions, but has it all been on expensive contractors that have ripped you off, or are mates with the CEO?

      1. ComputerSays_noAbsolutelyNo Silver badge

        Re: Is this normal?

        No CEO among my mates, but publicly stated cost is basically the only measure available to outsiders.

        Assuming (I know, assumptions ...) Delta isn't stupidly overspending on available stuff and services, the stated cost shouldn't be too bad an estimate.

        1. Tilda Rice

          Re: Is this normal?

          IT spend of 3% of turnover is the median for mid and large organizations. 3-5% is typical. It depends heavily on your type of organization though.

          Delta, from a quick search, turns over 15.8bn. So 0.5bn would cut it, but they say "billions per year" - so it probably isn't down to a lack of spending, as they are way over the median.
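The arithmetic in the comment above can be checked in a few lines. This is only a rough sketch that takes the commenter's figures as given (a turnover of 15.8bn and a 3-5% band); the function name `it_budget_range` is made up for illustration and none of the numbers are verified here.

```python
def it_budget_range(turnover_bn, low_pct=3.0, high_pct=5.0):
    """Return the (low, high) IT spend in billions implied by a turnover figure."""
    return (turnover_bn * low_pct / 100, turnover_bn * high_pct / 100)


# Figures quoted in the thread, not independently checked.
low, high = it_budget_range(15.8)
print(f"Typical IT spend: {low:.2f}bn to {high:.2f}bn")
```

On those assumptions the typical band is roughly 0.47bn to 0.79bn, which is where the "0.5bn would cut it" estimate comes from.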

      2. CrazyOldCatMan Silver badge

        Re: Is this normal?

        Yes you've spent billions, but has it all been on expensive contractors that have ripped you off

        When I last worked for a US company (20+ years ago!) capex was for buying tangible things like computers, network kit, leased lines etc etc.

        Opex was used for paying for the annoying squishy bits (staff) that used the stuff that capex bought.

        1. mmccul

          Re: Is this normal?

          Often, CapEx is purchased gear, OpEx can be used for leased gear and cloudy gear. Has to do with tax rules and depreciation from what I understand.

    2. yoganmahew

      Re: Is this normal?

      Airlines have a brutal load of IT-based legal requirements beyond the normal corporate, from maintenance to aircraft movement, to staff rostering. Each of these has bespoke or small supplier (often the commercial arms of other airlines) software on the latest next-big-thing mapping a history of IT back to the 1950s. Delta has previously been in trouble for running its loadsheet generation in a cupboard on the fifteenth floor of an office building.

      Why bespoke? Well, look at BA's SAP implementation to replace its parts system and the damage that caused. Off-the-shelf generic solutions either don't exist or are monstrous to implement.

      Why so slow to upgrade? Each of the systems is connected in fragile ways to the operation of the airline. Replacements for ancient windows servers have to work in pretty much exactly the same way and that's really quite expensive. Airlines go from broke to rich following the business cycle and back to broke again. IT re-engineering projects have very low priority and the landscape is full of sharky outsourcers long on promises, short on everything else. Meanwhile investment goes into NDC capabilities, offer-order, personalised offers, a vision that offers little to the consumer and even less to the airlines.

      Airlines are not Amazon Retailing, they're not Google search. Anyone selling you that is peddling snake-oil.

  4. Dan 55 Silver badge
    Meh

    "Delta, unlike its competitors, apparently has not modernized its IT infrastructure"

    - Boss, their licences show they haven't rolled out Windows 11 yet.

    - We've got 'em boys!

  5. Anonymous Coward

    Delta are incompetent

    Microsoft and Crowdstrike should be billing Delta for the Red team testing that proved their DR & BCP plans were shit.

    Whilst the Crowdstrike screw up wasn't in their control the speed and reliability of the recovery process 100% was.

    I work in the same industry and our core Win systems were back within hours and most of our end user kit back in a couple of days. If Delta weren't geared up to achieve the same - that's on them.

    1. Woodnag

      Re: Delta are incompetent

      The crux of the lawsuit is the "gross negligence" of CrowdStrike pushing out an **untested** patch which broke **every** system.

      Maybe Delta's IT org was crap, maybe they should have accepted help from MS or whoever to recover faster.... but that is irrelevant to the gross negligence issue.

      MS is introducing a straw man to change the conversation away from the, to say it yet again, gross negligence issue.

      If you deliberately puncture my car tyre, offering a quick repair service doesn't change the fact that you deliberately punctured my car tyre.

      1. Michael Wojcik Silver badge

        Re: Delta are incompetent

        That does not fit the definition of "gross negligence" in US (Federal) law.

        1. TheWeetabix Bronze badge

          Re: Delta are incompetent

          Not sure if you noticed, but a couple of the affected machines were NOT inside the world-spanning Kingdom of Murika. There's room for plenty of lawsuits there, where corporations don't make their own laws.

    2. Michael Wojcik Silver badge

      Re: Delta are incompetent

      I think there's blame to share all around on this one. Crowdstrike arguably have the largest share, and IMO Delta are in second place, more to blame than Microsoft, though Microsoft's emphasis on adding shiny features no one wants and shoveling crap into their OS rather than fixing deficiencies means they certainly aren't blameless.

      But of course our armchair analyses have zero weight with the court, should this make it to trial.

  6. sjb2016reg

    I would guess I've flown on non-budget airlines 40 times over the last 30 years. So not a frequent flyer, but I've lived in a few different countries, and now live permanently in the UK while most of my family live in the US. If the airline says "Delta has a long track record of investing in safe, reliable and elevated service for our customers and employees," I know they're telling porkies, at least about elevated service for customers. Every time I fly, the experience in economy class is worse than the time before. Except maybe the screens in the seats, which do generally give more choices than they used to (or there were none, back when I first started flying). With the exception of JetBlue, which was a refreshing change when I flew with them. But every other airline seems to provide RyanAir levels of service while charging Concorde prices.

    1. Anonymous Coward Silver badge
      Facepalm

      But when you were flying, you were higher than when you weren't flying. Hence, elevated.

    2. Anonymous Coward

      My experience is very different

      Delta ground staff often leave a lot to be desired, but in the air their planes are generally clean, and the flight attendants excellent. Also their prices are often competitive. Of course if you're flying at peak times, well, they've got you by the balls and the prices will be higher. That's just algorithms.

  7. Anonymous Coward

    This may seem harsh, but I believe everyone involved in this story should die.

    1. Anonymous Coward

      Firm but fair, I like it.

      Of course, wait long enough and they will. Or did you mean in short order?

      1. Michael Wojcik Silver badge

        Eh, with any luck AI will wipe us all out in the next few years.

  8. gnasher729 Silver badge

    The unforgivable reasons for the outage:

    1. CrowdStrike used software to read configuration files that would crash on configuration files it didn’t like. That is absolutely unacceptable, and was probably the case long in the past and still is today. Performing this operation at boot time obviously made it worse, but even without that it would have been an unforgivable problem.

    2. A crash at boot time is a permanent problem, not fixed by rebooting once or twice, but requiring costly manual intervention by a well-trained and trusted employee, which is why fixing it took so long and was so costly.

    Problem 1 would have been prevented by better development practices.

    Problem 2 would have been prevented by making every update save the previous state, and allowing the blue screen of death to reboot with a known working configuration.

    So the downtime should have been: 1. Computers being rebooted end up on a blue screen of death. 2. IT gets a few, then many calls. 3. IT figures out that some button needs to be pressed to reboot successfully. 4. IT tells the first user with a working brain and waits for a successful reboot. 5. From then on IT tells callers “press button X on the BSOD and tell all your colleagues”. 6. IT walks around and does that for all users that were too afraid to press a button they didn’t know. So maybe an hour until many computers are back up, and three hours for everyone. Except people on holiday at the time, or people panicking and cutting their power cables, or department heads sending everyone home.
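The two fixes described in this comment (a reader that rejects bad configuration files instead of crashing, and an updater that keeps the previous state to fall back on) can be sketched roughly as follows. This is an illustrative toy in user-space Python, not how CrowdStrike's sensor or Windows boot recovery actually work; the JSON format, the `rules` key, the file names and all function names are assumptions made up for the example.

```python
import json
import os
import shutil
import tempfile


def load_config(path):
    """Parse a config file; raise ValueError on anything malformed."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or "rules" not in data:
        raise ValueError("config is missing the 'rules' key")
    return data


def apply_update(current_path, new_path, backup_path):
    """Install a new config, saving the previous one as last-known-good."""
    if os.path.exists(current_path):
        shutil.copyfile(current_path, backup_path)
    shutil.copyfile(new_path, current_path)


def load_with_fallback(current_path, backup_path):
    """Try the current config; on failure, fall back to the saved copy."""
    try:
        return load_config(current_path), "current"
    except (OSError, ValueError):
        return load_config(backup_path), "last-known-good"


def demo():
    """Push a corrupt update over a good config and show the fallback."""
    with tempfile.TemporaryDirectory() as d:
        current = os.path.join(d, "config.json")
        backup = os.path.join(d, "config.last_good.json")
        update = os.path.join(d, "update.json")
        with open(current, "w", encoding="utf-8") as f:
            json.dump({"rules": ["a"]}, f)
        with open(update, "w", encoding="utf-8") as f:
            f.write("\x00\x00 not a valid config")
        apply_update(current, update, backup)
        return load_with_fallback(current, backup)
```

Calling `demo()` installs a deliberately corrupt update and returns the earlier configuration together with the label `"last-known-good"`, which is the "press button X and carry on" outcome the comment argues for.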

    1. TheWeetabix Bronze badge

      EXACTLY THIS.

  9. andrewj

    They should modernize everything - moving from Windows to Linux would be an excellent first step in that.

    1. Donn Bly

      Since Linux systems have also been borked by Falcon updates, that is more of a sideways step than a step forward.

      1. Michael Wojcik Silver badge

        Yes, but that one only applied to Linux systems running Falcon in kernel mode, rather than eBPF mode. eBPF has had its issues, but on the whole it's safer than running a lot of crap in kernel mode.

      2. TheWeetabix Bronze badge

        Yes, but who the hell would run something in kernel mode when they had another option?

  10. weirdbeardmt

    Did he check the junk mail

    I dunno… at a time when the whole world’s IT is on fire, if your email is even working, to get one purporting to be from the actual boss of Microsoft… phishing awareness training if it hasn’t gone straight to junk.

    Maybe next time go old school and… pick up the phone.

  11. Mark White

    Has no one heard of testing?

    I've worked in a few places where upgrades or patches were not allowed to be done on any prod system without at least a week or two of testing on non-impactful systems. Major security patches could be done quicker but were never applied to prod first.

    All changes had a defined rollback procedure. It might not have worked in this instance, with the machines not booting correctly, but I'm sure there are a lot of sysadmins updating their policies in case this happens again.
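The policy described above (never prod first, and a defined rollback for every change) might look something like this as a sketch. The ring names and the `apply`, `healthy` and `rollback` callables are hypothetical placeholders; the sketch also assumes hosts stay reachable after a bad update, which, as the comment notes, was not true in this incident.

```python
def staged_rollout(rings, apply, healthy, rollback):
    """Apply an update ring by ring; roll everything back on the first failure.

    rings    -- ordered list of ring names, least critical first
    apply    -- callable that installs the update on a ring
    healthy  -- callable returning True if the ring still works afterwards
    rollback -- callable that restores a ring to its previous state
    Returns (success, rings_touched).
    """
    touched = []
    for ring in rings:
        apply(ring)
        touched.append(ring)
        if not healthy(ring):
            # Undo in reverse order so the most recent change goes first.
            for done in reversed(touched):
                rollback(done)
            return False, touched
    return True, touched
```

With rings ordered as, say, `["test", "canary", "prod"]`, a failure in the canary ring stops the rollout before prod is ever touched.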

    1. JoeCool Silver badge

      Re: Has no one heard of testing?

      You mean that thing other people, who don't know how to code, do so that their management has cover when production goes boom ?

      "I don't always test, but when I do test, I test in production"

      SARCASM. Obvs.

  12. TheWeetabix Bronze badge

    Up until a few years ago...

    Delta, at one of their US hubs, still used a 486 running the most antique software you've ever seen. Their AAA system went down at one point, which stopped access to this antique, which was running (of all things) a flight planning and timing package that determined most or all of their daily routes. No access, no timing data, no flights. This would happen occasionally, for anywhere from 15-20 minutes to most of an hour. You could LITERALLY watch planes that had not yet received an updated flight plan go into delay.
