offering support to the airline "within hours" of the incident unfolding
And what's the cost of the disruption caused "within hours"?
CrowdStrike says it is "highly disappointed" and rejects the claims made by Delta and its lawyers that the vendor exhibited gross negligence in the events that led to the global IT outage a little over two weeks ago. That's according to a letter, seen by The Reg and sent to David Boies, partner at the law firm Delta hired to …
Yeah, this is not a good look for Crowdstrike. The perception I get from them is "we messed up, and that's your problem".
We don't know what "on-site help" they offered, or why Delta turned it down - maybe they wanted to send someone without the proper security clearance into areas they wouldn't legally be allowed to go? I can imagine that being a legitimate reason for turning down the "help" as not fit for purpose.
"The perception I get from them is "we messed up, and that's your problem"."
Hasn't that been Microsoft's motto since its creation? And amazingly they are still here. Crowdstrike probably doesn't have quite the same funding though...
The article reads more like a lawyers wankfest than an actual resolution to a real world problem.
"Should Delta pursue this path, Delta will have to explain to the public, its shareholders, and ultimately a jury why CrowdStrike took responsibility for its actions – swiftly, transparently, and constructively – while Delta did not."
Er, nope.
How about: "Should Delta pursue this path, CrowdStrike will have to explain to the jury why CrowdStrike took no responsibility for its update and clearly didn't test it on even a single hardware platform before release, as evidenced by the diversity of hardware platforms that failed."
It's also hard to see how this isn't gross misconduct, sending something out without testing... and apparently a repeat of what they did for a couple of Linux distros only weeks before.
It also reminds me of when a garage bodged a service on my car leaving it in a dangerous state and then offered me a free service in compensation.... and then were surprised when I didn't ever want to let them near my car again!!
If and when it comes to a trial, I don't think the jury will particularly care that Crowdstrike caused the problem. Crowdstrike has ample material here to muddy the waters. Other customers recovered much faster than Delta did — jurors are unlikely to pay much attention to Delta's attempts to explain why. Delta refused Crowdstrike's on-site assistance — jurors are unlikely to pay much attention to Delta's explanation there either.
It's considerably easier to create uncertainty in a jury's evaluation of the evidence than to restore clarity. And uncertainty works in Crowdstrike's favor. If a jury decides Crowdstrike and Delta are both at fault, then Delta very likely loses the case.
And it's not like the citizens in the jury pool are fans of airlines, generally speaking. That's one of the most-disliked industry sectors in the US.
I think Delta's hoping that either they'll get a settlement out of Crowdstrike, or Crowdstrike will close up shop and they'll get a default judgement, putting them near the front of the line of creditors. And to some extent this is a PR move, though Southwest has demonstrated that airlines can get away with an extreme degree of incompetence.
Because vendors coming in to help out in an emergency like this are usually very little help and often a hindrance, as they are unfamiliar with the company's operations and systems. I seriously doubt CrowdStrike's "help" was taken up by very many.
Why did Delta take so long to get this resolved? Let's see, a very distributed network covering most of the world.
Oh, I don't think they'll ever admit to liability. They will try to demonstrate that they did what the law required of them, which is to provide information and updates to mitigate the situation. If they can do this, they're probably home and dry, and I'm sure they'll have a few "expert" witnesses from other airlines who will agree that it worked. Delta is going to have a hell of a job to show that its problems, which were much worse than other companies', were deliberately caused by Crowdstrike and that no mitigation was offered.
Though, personally, I'd love to see the lawyers on both sides forced to fight to the death over the matter, I don't see this being settled in court.
They just don't get it. Back in the early 80s when I was building equipment for telcos the units had to be made as error proof as possible because it was estimated that the down time for just one unit cost four hundred pounds a minute. Waiting "a few hours" for someone to turn up and fix things just wasn't an option -- the equipment ideally never broke but when it did it was designed to be fixed quickly by non-specialist personnel and fixed while the rest of the system continued operating.
Also, given that the solution for servers where you had console access was to plug in a USB key, I can see many reasons why this was refused.
Many companies have very strict security round their data centres and taking a USB key is going to be verboten. Then plugging it in to multiple servers is a complete security & procedural no-no.
In the 80s and 90s downtime in the $TELCO universe was usually measured in seconds per subscriber per year. And that was including downtime caused by software updates...
Having something down for a few hours usually raised a ruckus that involved emergency services, fire departments, police, and was sure to make the front page of the local newspaper the next day.
But as you say, the systems were designed in such a way (massive redundancy: almost everything duplicated, with one active side and a standby one ready to take over in a matter of seconds).
"[...] Delta will have to explain to the public, its shareholders, and ultimately a jury why CrowdStrike took responsibility for its actions – swiftly, transparently, and constructively – while Delta did not."
Oh, oh, let me try! Ahem. "Dear public, Delta shareholders and members of the Jury. The reason why Delta did not 'take responsibility' for its actions is because it was not Delta's actions that caused the disaster. Delta _have not performed_ any actions that they need to take responsibility for. It was CrowdStrike's actions, and CrowdStrike's actions alone, which caused the damages to my client, and so it is CrowdStrike, and CrowdStrike alone, which must take responsibility for this disaster."
See? Easy!
See, it was Delta's fault because they installed the CrowdStrike software in the first place!
I remember listening to a presentation by an IT guy at one of the world's largest online travel reservation companies. Their allowance for downtime to do upgrades of the production systems was 15 minutes. *Per year*. And they knew exactly how much it would cost them per minute if the downtime exceeded that.
"[...] one of the world's largest online travel reservation companies. Their allowance for downtime to do upgrades of the production systems was 15 minutes. *Per year*."
As a much younger person, I worked at one of these in North Wiltshire... It was estimated that an hour's outage on the mainframe had a notional cost of around $6m. And that was in the early 90s.
Our QA department had teeth and *anything* other than emergency fix [1] *had* to be signed off at every level by very experienced QA/programmers.
[1] Those had to be checked in an emergency meeting by senior QA and the CEO. Who knew that their jobs were at risk if the emergency update took down the system. Yes, those were the days when CEOs actually had to be accountable for their actions.
Honestly, it's like none of you people have ever served on a jury, or attended a jury trial, or read a transcript.
This ain't television, and it ain't school debate. Facts and reason have very little to do with any of it. That's not how this works.
(Techies are showing a lot of unsupported optimism around this incident. My money is still on Crowdstrike getting away with it. Stock price just zoomed back up again today, and is still well up year-on-year. Market says "oh, well!".)
I would be extraordinarily surprised if the contract between Delta and Crowdstrike doesn't limit liability. And generally, in business-to-business contracts, limiting liability is enforceable. (And elsewhere it has been pointed out that Crowdstrike's default contract says damages are limited to the cost of services.)
I would be very surprised if the contract actually specifies what is required re testing etc in sufficient detail not to end up with it being dealt with via existing contractual terms. Delta might have negotiated higher penalty clauses, but I doubt they are remotely close to the claimed losses.
Shareholders have a much stronger claim, a big chunk of it being that they don't have a contract defining the penalty if Crowdstrike messes up...
I posted this before in another thread:
I had a family member who was a lawyer tell me that (in the US at least) "you can't sign away your right to sue for negligence". I was asking about the waivers like those added to contracts for risky activities. He explained that you can ALWAYS sue for negligence. I would think in this case with CrowdStrike, it could certainly be argued that this was negligence.
Keep in mind that Delta mentioned "Gross Negligence". Those are the key words that override any contract terms. Delta's lawyers know what they are doing!
I had a family member that was a lawyer tell me that (in the US at least) "you can't sign away your right to sue for negligence"
You can't here in the UK either. In fact, a lot of the US-style contracts would be utterly unenforceable here because you can't void your legal rights here like you appear to be able to in the US. Doesn't stop the big corporates from trying though!
You can file suit for anything, assuming you haven't been declared a vexatious litigant or something. Suing is easy.
Whether you have any plausible chance of a verdict in your favor is what's at question here, and contractual limitations on liability, while certainly not watertight, are a significant barrier.
I believe it doesn't help Delta that under US UCC and case law, software has almost always been determined to be neither a "product" nor "goods", which limits which liability laws apply to it, and to a large extent means Delta's case falls back on contract law (not in its favor here) and laws around things like fraud and misrepresentation. (AIUI; IANAL.)
Look, while I have no love for Delta, I would very much like to see a precedent against ISVs (even though I work for one) that started to unravel the blank check we've given them for selling terrible software and disclaiming any responsibility. And I'd like to see Crowdstrike specifically have its feet held to the fire for what was by any evaluation a despicably poor QA and release process. But I really do not think this suit is going to achieve that. Hiring Boies looks like a sabre-rattling move: if Delta thought they had a decent case, they wouldn't need a high-profile lawyer.
There's a problem on both sides here.
1, vendor releases software without testing it appropriately - that's their problem and they need to address that
2, customer installs software and deploys it to production without testing it appropriately - that's their problem and they need to address that
Of all the people on this site, how many of you have policies that allow for untested software to be deployed into production? And even further, across the entire estate?
I was somewhat incredulous that so many organisations were impacted by this.
Their end-user license will be clear that they do not warrant the software to be bug free... there's a reason they say that.
"customer installs software and deploys it to production without testing it appropriately - that's their problem and they need to address that"
Your downvotes are probably from people who work at companies that DON'T pre-stage updates before deployments
Or have dev, stage and prod environments
Or have documented change management processes, including roll back
Crowdstrike and Microsoft are responsible for this. But Delta has to look inwardly at its own change management processes
> There are too many server developers here. The software is not deployed on a server. It is deployed on an ordinary office PC used by an ordinary end user. There is no “dev, stage and production” environment.
Oh, it is obviously deployed on *at least some* servers in some organisations - as an example, witness EMIS being affected by the CrowdStrike problems, preventing UK GP Practices that make use of EMIS from accessing their registered patients' records, but not otherwise affecting those practices, i.e. it did NOT affect the Practices' own Windows machines.
BTW EMIS' GP Records System runs on (servers in) Amazon Cloud...
More likely they're from people who have read enough of how the Crowdstrike system works to realise that Crowdstrike S/W downloads the files itself, so that the OP's comment, and yours, are based on a false premise about staging systems and the like.
This was Crowdstrike's responsibility to create valid update files, Crowdstrike's responsibility for all forms of QA before release and Crowdstrike's responsibility to code their kernel-privileged S/W defensively because they were the only people in a position to do those things.
"2, customer installs software and deploys it to production without testing it appropriately - that's their problem and they need to address that"
That's the very reason I have no hair left, after 3 years running the system software deployment/testing section for a (now defunct) large DCS vendor whose systems were deployed on hazardous process plant. We learned (very early in the product's life cycle) that bug fixes and updated versions from the devs (in a.n.other country) could not be relied upon if we didn't want to risk things going bang, or spend all our time either on the road to sites which had crashed or in the Boss's office explaining why. We pretty much managed to stop bad stuff getting out. It has left me with an utter paranoia about updates, even now in retirement on what is now just two home W10 laptops, mine and SWMBO's. I still haven't solved the quandary of W10's end of life next year. Too many apps which aren't available on Linux (or Mac) and W11 looks to me like another W8.
"W11 looks to me like another W8."
You are being far too generous to W11 !!!
W8 was W7 with all the usability and logical functionality removed then a 'Fisher-Price'-esque design paradigm was lightly applied !!!
W11 takes W8 and progresses the 'Fisher-Price' design taking usability & functionality past Zero towards massively negative territory.
The usability is so bad now that 'AI' aka 'CoPilot' is required to allow queries to be used to find so called help in using W11.
Writing CoPilot for W11 is apparently 'easier' than fixing all the illogical design choices, random hiding of options in the GUI and missing 'help' that explains how to 'find' the actual real 'help' that explains how it all works !!!
:)
As I understand it, there was no way to stop this update from being installed. It was considered so important, that it bypassed settings that customers had in place to delay updates.
When a vendor pushes updates in the "background" that you have little ability to stop or delay, what are you supposed to do? In this case, choose a better vendor would have been an option, I guess?
Color me a conspiracy theorist, but how does this happen from a company that is supposed to be one of the leaders in enterprise security? And it happens on the Monday after the Sunday that Joe Biden i̶s̶ ̶f̶o̶r̶c̶e̶d̶ ̶o̶u̶t̶ steps down from the US Presidential campaign, effectively dominating the news cycle for the next few days. Remember, CrowdStrike has been in bed with the Democrat Party for years! They are the ones who failed to protect their email from being leaked in 2016!
Boring - how about:
In order to keep National Urban Bee Keeping Day out of the headlines in order to suppress US honey prices, the New Zealand’s Manuka Honey consortium, operating out of a Wework office in Roswell, contacted their customers in QAnon working out of the Woking Pizza Express and ordered them to unleash Crowdstrike so that honey imports would be disrupted and thus create Manuka honey scarcity in the US and increase demand and price still further.
Biden was forced to stand down to avoid a very sticky situation.
WRT 2, one problem here is that Crowdstrike's deployment system offered rules for deploying updates (so patch n-1, or A/B deployments, etc).
Crowdstrike deployed what was basically an AV definitions file rather than a software update, which bypassed all the update deployment rules customers could set, while still causing a bootloop.
So regardless of how the customer had configured the deployment platform to do staged rollouts etc, this went everywhere it could, immediately...
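To make the distinction concrete, here's a toy model of the kind of ring/staged-rollout policy customers thought they were configuring, and how a "content" channel that ignores the rings defeats it. All names and structure here are illustrative guesses, not Crowdstrike's actual design or API:

```python
# Illustrative only: staged rollout for versioned "sensor" updates,
# versus a "content" (definitions-style) push that bypasses the rings.
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    ring: int  # 0 = canary, 1 = early adopters, 2 = broad fleet

def eligible_hosts(hosts, update_kind, rollout_ring):
    """Sensor updates respect the configured ring; content updates do not."""
    if update_kind == "sensor":    # versioned agent update: staged by ring
        return [h for h in hosts if h.ring <= rollout_ring]
    if update_kind == "content":   # definitions file: pushed to every host
        return list(hosts)
    raise ValueError(update_kind)

fleet = [Host("canary-01", 0), Host("office-17", 2), Host("gate-kiosk", 2)]

# A staged sensor rollout reaches only ring 0...
staged = eligible_hosts(fleet, "sensor", rollout_ring=0)
# ...but a content push ignores the rings entirely:
everyone = eligible_hosts(fleet, "content", rollout_ring=0)
```

However carefully the customer tuned `rollout_ring`, the content channel in this sketch (as in the incident) takes the second branch and hits the whole fleet at once.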
Indeed, and I do wish that more people would think about this rather than just crashing in with a generic ‘it’s the customer’s fault for not testing before deployment’! They literally can’t in this case.
This was not something like a Windows update, where you certainly can defer updates and test them on a subset of your devices; this wasn't an update to the Falcon engine itself but rather a change to the 'definition file', if you like. Not something that you can possibly 'stage', well, probably not without an extreme amount of messing about *. Now, in an ideal world, we all know that mistakes can happen, so Crowdstrike should have arranged things so that their driver did some kind of 'sanity check' against the definition file and ignored it, falling back to the previous one, if the new file was obvious garbage! It didn't; it blindly followed the instructions, fell over on its arse, and, due to the nature of Windows, took the OS with it!
As I understand it, the lawsuit is on the grounds that customers were led to believe, or had good reason to believe, that things were tested before general deployment, which would appear to be somewhat ‘not entirely true’!
* and to be honest, even if you could do this, imagine if the definition files are updated multiple times a day? How can anyone possibly test before deployment under that? IIRC doesn’t Windows Defender check every six hours for updated malware definition files and download and use them? You want to be protected ASAP if a new zero-day is identified, no? How about you get wiped out by a new exploit which was mitigated against last week, but you are three weeks behind in your ‘testing’, because well ‘Joe is on holiday (vacation), there was that office move, Jim caught COVID last week and hasn’t been in, and the C-suite all demanded new laptops this month and to all be set up as the highest priority’?
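The 'sanity check' idea above is easy to sketch: validate a new definition file before trusting it, and keep running on the last known-good one if it's garbage. A minimal userland illustration in Python (the real fix would live in the kernel driver, and the file format and field names below are invented for the example):

```python
# Toy illustration of "validate before use, fall back on garbage input".
# Format and field names are made up; a real sensor would do this check
# in the driver before dereferencing anything from the file.
import json

def load_definitions(new_blob: bytes, last_known_good: dict) -> dict:
    """Parse and sanity-check a definitions update; on any failure,
    keep running with the previous known-good definitions."""
    try:
        defs = json.loads(new_blob)
        # Reject obviously malformed content instead of blindly using it:
        if not isinstance(defs, dict) or "signatures" not in defs:
            raise ValueError("missing signature table")
        if not all(isinstance(s, str) for s in defs["signatures"]):
            raise ValueError("corrupt signature entries")
        return defs
    except (ValueError, json.JSONDecodeError):
        return last_known_good  # degrade gracefully, don't crash the host

good = {"signatures": ["rule-a", "rule-b"]}

ok = load_definitions(b'{"signatures": ["rule-c"]}', good)
bad = load_definitions(b"\x00\x00\x00\x00", good)  # junk file: keep old defs
```

The point isn't the parsing details; it's that the failure path returns the previous definitions rather than propagating a crash from privileged code.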
You still do not understand how Crowdstrike works.
This is pushed by Crowdstrike, and users have pretty much no control over when or where it is applied, or even that it is happening.
The only option you have is to uninstall the agent BEFORE the update that you don't know has been released.
"Hello, our software just nuked your entire company through a botched update that we couldn't be arsed to test. Now we'd like to have our people put their greasy hands all over your machines, directly."
Am I the only person who would respond to this with "No, thanks, we've had enough of your 'help' already."?
Especially when the problem is an infrastructure one - we don't actually need the vendor to fix our crowdstrike installs. *We* have to go through all our end points in safe mode to fix your mess
If crowdstrike offered to help by punching bitlocker keys into corporate machines you'd have to supervise them the whole time anyway
There is quite a gulf between the common notion of "gross negligence" [1] and the definition in US law.
Specifically, in the US, "gross negligence" applies specifically to a disregard for the safety or property of others — not to their convenience, or their business operations, or their reputation, etc. And it requires demonstrating an extreme departure from "the ordinary standard of care", which, in the software industry, is basically "fuck you".
The standard for gross negligence in the US has four main aspects. It applies to behavior that affects the life or property of others; Delta can argue that their lost profits and other costs are a property harm, but that's a bit of a stretch. And it must be "willful", "wanton", and "reckless", all of which have their own specific definitions in US law. Willfulness looks like a particularly difficult obstacle for Delta to establish.
In short, what you think constitutes gross negligence doesn't matter in the slightest. What matters is what the law thinks is gross negligence, and I'm pretty sure the law won't agree with you here.
(Really, it's astonishing how many commentators on the Crowdstrike disaster just assume their feelings will be reflected by the courts. Have y'all not paid attention to the past half-century of the IT industry?)
[1] To the extent that there is a common notion of the term, that is. Certainly people like to throw it around, but ask a dozen people and you'd probably get six conflicting definitions and as many admissions that they have no idea what they mean.
I'm the first one to bash MS any day. But going after Microsoft with the reason that the faulty software affected only Windows machines seems like a bit of a stretch to me.
That's like suing Apple if I buy a shoddy iPhone charger on Amazon from the well-known HZRYGWUL brand store and the charger catches on fire. "After all, my Android phone wasn't affected."
I don't know how corporate liability law actually works, but MS in this instance seems to have left a loaded gun on the table, while Crowdstrike is the party that walked in, picked it up, and started shooting several million pairs of feet. Other Windows driver vendors haven't had such public snafus (or at least none that I can currently recall).
If by "a couple of years ago" you mean fourteen, yeah. And that didn't involve a Windows driver; McAfee deleted svchost.exe (the WinXP SP3 version) because it matched their signature for W32/wecorl.a. So actually that's not a good comparison at all, except that Kurtz was involved both times.
The McAfee screwup did end up sinking McAfee as an independent firm (it precipitated their acquisition by Intel, which, uh, was not a great move by Intel, perhaps). Of course then it was spun out again seven years later, then went public, then went private again...
Driver-caused BSoDs on Windows aren't really all that rare. Nvidia has been a culprit. So has Realtek. They're common enough that you can find any number of articles online about using Driver Verifier or other techniques to find the offending driver and roll it back.
That's not to say Crowdstrike's wasn't in a class of its own. They screwed up really badly, and they did so because their processes were terrible.
Their market cap is about $55 billion (and falling).
They had $6.6 billion in book assets in their most recent filings.
However, that includes the book value of a lot of contracts that have since been revalued to zero - or less.
One wonders what the actual realisable value of their assets might be.
So by how much would Delta’s loss have been reduced if they had accepted CrowdStrike’s generous offer? Was it one free pizza for every employee doing more than four hours overtime?
I suppose an airline has security rules, so would they have been legally allowed to accept any outside help?
CrowdStrike absolutely was responsible for the outage. However, Delta absolutely has responsibility for the slowness of their recovery - it took Delta over a week to do what every other airline managed in a few days. That shows some serious lack of planning and/or incompetence on Delta's part.
TBH I can see both sides here. Delta refused their "free" help.
But do you really want the vendor who royally screwed you to "fix" the problem? Perhaps not. I would imagine Delta (and many other companies) will be looking for a new vendor to supply these sorts of services.
CrowdStrike is probably going to be forced into bankruptcy by the litigation that's surely coming. And they might deserve it.