Life, interrupted: How CrowdStrike's patch failure is messing up the world

Today is one of those days that will go down in history as an unmitigated IT disaster, with CrowdStrike responsible for taking systems down all over the globe. We know airports, hospitals and the usual critical infrastructure suspects have been affected, but CrowdStrike is disrupting daily life in some unexpected ways, too. …

  1. chuckufarley Silver badge

    What's worse than...

    ...Being one of the most trusted companies in IT?

    Being a single point of failure.

    Luckily for me I don't have to fix any of it. Then again, this might be a good time to do some moonlighting over the weekend. I bet there will be plenty of places desperate for extra boots on the ground.

    1. cyberdemon Silver badge
      Happy

      Re: What's worse than...

      We might get that extra Bank Holiday after all

      1. alisonken1

        Re: What's worse than...

        Unfortunately, the bank holiday affects your plastic too - so unless you have cash, be prepared.

        (kinda sorta /s maybe???)

    2. John Brown (no body) Silver badge

      Re: What's worse than...

      "Being a single point of failure."

      And there are so many "single points of failure" in so many places. Because so many companies are infected with "growth at all cost" and so, instead of competing, buy up the competition, reducing the plurality of the ecosystem. In the specific case of Crowdstrike, there's also ESET and I think that's about it. It's not just the AV world and it's not just the IT sphere. In so many industries, if a big company can't or won't compete on quality, they'll do it via marketing and buy-outs. In IT in particular, we've all seen minnows swallowed by giants such as Google, MS, Apple etc only to disappear from the market and *maybe* their tech gets folded into one or more of the giants' products. Let's not rake over the Lync, Teams, Skype, Teams-for-Business history :-)

  2. b0llchit Silver badge
    FAIL

    Impact...

    So, how has CrowdStrike had a non-IT impact on your life today?

    Personally:

    • The total non-IT impact: 0 (zero)
    • The total IT-impact: 0 (zero)

    I've been watching this slow moving disaster and can only say that it is a perfect example of how not to design (any) infrastructure you want to rely on.

    But, I'm rather convinced that, after the shock has worn off, the beancounters all over the world will prevent any real improvement from happening.

    1. Pascal Monett Silver badge

      Re: Impact...

      Unfortunately, I am inclined to agree with you.

      After all, there are those pesky contract clauses, etc etc, and finding another AV supplier is going to be a major pain and probably require humongous amounts of work from IT personnel everywhere, so why not just listen to the siren call of "lessons have been learned" and stay the course ?

      After all, it works so well for Capita in the UK (and Fujitsu, and God knows how many more) . . .

    2. A. Coatsworth Silver badge
      Trollface

      Re: Impact...

      The total non-IT impact: 0 (zero)

      I'd amend that count to say that seeing this sh!tshow unfold (as a mere spectator) has turned this into one of the most interesting, and frankly amusing, Fridays in recent history.

      My thoughts are with the poor support troops that have to deal with the fixes, but tidbits like this fill my heart with warmth and childlike joy:

      >> Mercedes' F1 team, of whom CrowdStrike is a major sponsor, has reported that their systems have been disrupted

      1. MiguelC Silver badge

        Re: Impact...

        This morning I tried to buy something from FNAC.... their site wasn't working and their brick-and-mortar store wasn't open either, although all seems to have recovered by now.

    3. Vometia has insomnia. Again. Silver badge

      Re: Impact...

      Lucky you. I've run out of essential medication so will have to go without. Though given the way the NHS is run, there's a problem with it every couple of weeks without requiring a worldwide IT crash to blame it on.

      1. Fruit and Nutcase Silver badge
        Thumb Down

        Re: Impact...

        Guess your GP practice is using EMIS. According to Wikipedia...

        It claims that more than half of GP practices across the UK use EMIS Health software and holds number one or two market positions in its main markets.

        The other approved GP systems are SystmOne, Microtest Health and Vision. In England EMIS and SystmOne have a duopoly. The pair were paid £77 million for primary care software in 2018.

        And owned by an affiliate of America's UnitedHealth. Cue very healthy profits.

        1. Anonymous Coward

          Re: Impact...

          Don't worry, Starmerite outsourcing will work so much better than previous outsourcing.

          Same shit, different label.

          Who let Gordon Brown out anyway?

    4. Anonymous Coward

      Re: Impact...

      It isn't like we haven't seen this before, just not on this scale. Almost every AV vendor has at some point released a borked update. The one that I can recall that had the biggest impact was Sophos a few years ago that broke updaters for products such as Flash, Java, Adobe and ... Sophos! In many cases it needed individual attention to every endpoint to fix. The CEO had to issue a grovelling apology.

      Yet here we are again a few years later. Same issue, different vendor, much bigger impact.

      1. Anonymous Coward

        Re: Impact...

        Still, breaking an update is far less serious than borking a whole system.

        I want a government inquiry into how some third party American company has the ability to hobble NHS services, and UK airports.

        Whoever is responsible (and I don't mean CrowdStrike, I mean the people responsible for these UK services) needs to be jailed.

        1. Doctor Syntax Silver badge

          Re: Impact...

          "I want a government inquiry into how some third party American company has the ability to hobble NHS services, and UK airports."

          But we know that.

          1. They all depend on computers

          2. Windows has become the standard operating system because nobody ever got sacked for buying Windows in the same way that "nobody ever got fired for buying IBM".

          3. Windows has a virus problem

          4. Crowdstrike is one of very few AV products being bought by corporates (probably similar reasoning to 2).

          5. Windows, Crowdstrike and any other products which are operationally essential and have a virtual monopoly become a single point of failure

          Now where, in that chain, are you going to find any specific individuals you can finger as being culpable for buying industry standard products?

          Yes, it's a bad situation but what is needed from such an enquiry isn't scapegoating, it's a recommended policy to be acted upon (the second half of that is usually the sticking point) to escape from the monocultures.

    5. alisonken1

      Re: Impact...

      Work machines I manage related impact: zero

      Our business SaaS provider impact: 100

      Unfortunately, we just switched from on-premise to SaaS at the beginning of the year. Just in time for this. The only saving grace is it only affected the back office people (about 30), not the main group (teaching professors and students).

    6. John Smith 19 Gold badge
      Unhappy

      beancounters all over the world will prevent any real improvement from happening.

      Sad.

      But true.

    7. navarac Silver badge

      Re: Impact...

      It impacted me by 5 minutes as I had to walk into the kiosk to pay for petrol. The "pay at pump" function was down. At least it didn't take an hour to "fill" the battery with power as well beforehand.

    8. Anonymous Coward

      What did you want to work today?

      Some commentards' "I'm all right, Jack" attitude may indicate that they are deluded or that they have no interaction with "modern" society.

      "the beancounters all over the world will prevent any real improvement from happening."

      I also saw the coming Microsoft Collapsing Systems Experience. (though actually MS are not the only culprit, but they set the benchmark for others to match).

      I am no longer directly involved with the "IT" world. But I can't avoid the inevitable consequences of poor engineering and security, proprietary systems based on implementations rather than standards, whereas what really makes systems work well is interoperable systems working with open standards which can be used to produce cost effective robust resilient systems, without unnecessary lock-in.

      Unfortunately until individual beancounters and their bosses (hi, Boeing and far too many others) are individually motivated to individually face the consequences of their individual actions (corporations don't make decisions, people do) then the Outlook is poor.

      Repeat business with demonstrably incompetent outsourcers isn't a given. Often it's mostly a UK thing. Sadly it's not just in the IT sector.

      Right now, two important pieces of my personal life are being f***ed up by these ignoramuses: One has accepted things went wrong, one is trying to blame me:

      1) Ryanair: a friend flew to France, to a Ryanair-only region. Then Ryanair systems fell over and "bookings and checkins online" aren't working too well, according to Ryanair. Which is a shame, as is Ryanair's horrible monoculture dependence on Boeing. Hopefully this idiocy will end soon.

      2) An estate agent is having trouble receiving documents from me, relating to a property purchase. Elsewhere, the legal people and the finance provider have no problem. I don't know for sure but I do wonder if their IT department is ridiculously MS-dependent and the office people haven't a clue, given the corporate competence and the list of packages Crowdstrike has infested.

      CNBC have what look like some well-informed non-geeky articles.

      https://www.cnbc.com/2024/07/19/microsoft-crowdstrike-shares-fall-after-major-it-outage.html

      1. Doctor Syntax Silver badge

        Re: What did you want to work today?

        In regard to 1): wasn't Ryanair offering a manual check-in option for an extra fee? And assuming they couldn't take cards, the fee could be paid in cash, for an extra fee?

    9. Erik Beall

      Re: Impact...

      I've been stuck in an airport with my two kids for three solid days, still hoping our flight this evening and connection gets us back home after seeing family. There were hundreds of people on those thin camping mattresses all over the airport at noon on Saturday and Sunday (when we came back to the airport for our last two attempts to fly home), including one couple with what looked like a six month old using a pair of mattresses. Let me tell you, the impact most definitely was nonzero. We were lucky to have family in town. Hotel rooms and cars were booked solid. It's not been a fun extra vacation, although at least I brought my laptop so I could keep working on-call when needed.

  3. Alan Brown Silver badge

    WTF?

    "the elevator in their building wasn't working due to the CrowdStrike patchpocalypse. "

    People do know that Windows isn't certified for safety of life applications, don't they?

    1. gnasher729 Silver badge

      Re: WTF?

      An elevator that doesn’t work is safe. It’s pretty inconvenient but safe. I’d expect an elevator to go to the next floor and open its doors, whatever happens on the outside.

      1. Doctor Syntax Silver badge

        Re: WTF?

        I’d hope an elevator would go to the next floor and open its doors, whatever happens on the outside.

        FTFY

        1. The Oncoming Scorn Silver badge
          Thumb Up

          Re: WTF?

          As long as it doesn't try moving sideways.

        2. David 132 Silver badge
          Happy

          Re: WTF?

          The ones with AI (and Genuine People Personalities) saw this coming and are hiding in the basement.

          1. 502 bad gateway

            Re: WTF?

            Your local Sirius Cybernetics Corp representative will be in touch to discuss your needs.

            :)

            1. Martin J Hooper

              Re: WTF?

              I was hoping someone would make a HHGTG reference re lifts and wasn't disappointed!

          2. Doctor Syntax Silver badge

            Re: WTF?

            "hiding in the basement."

            The one with the sewer release valve in it or the one with the killer robot?

      2. Anonymous Coward

        Re: WTF?

        What do you call a broken escalator? Stairs.

        On a more serious note... why is the Windows startup so frail? Has Microsoft given any thought into auto-booting in safe mode whenever a BSOD happens repeatedly? Or in any manner, shape or form respond with something less than the full-vapors/fainting-couch approach?

        1. david 12 Silver badge

          Re: WTF?

          Has Microsoft given any thought into auto-booting in safe mode

          When the anti-virus is compromised or crashed, just proceed on and ignore it? Yes, I wish I lived in a nicer world.

          1. Doctor Syntax Silver badge

            Re: WTF?

            What part of "safe" did you overlook?

            1. david 12 Silver badge

              Re: WTF?

              MS explicitly provides a method of booting with a failed driver: CrowdStrike explicitly flagged their driver as required for start (the "Boot Start" flag).

              "Required for start" is used for things like your motherboard driver, and things like your anti-malware protection. Things where there is no possible "safe" mode without the driver.

          2. gnasher729 Silver badge

            Re: WTF?

            What you want is some set of files that are signed by the manufacturer of the OS and can only be replaced with different code signed by the manufacturer. With the ability of having a boot process that stops before touching anything else. So if you have a system crash, and then reboot crashes, and then it crashes again, you do a limited boot using files by the manufacturer only.

            And because of paranoia and because the manufacturer can make mistakes, you keep half a dozen versions around, and if there is another crash, you go to an earlier version. And try to download an update that explicitly states it fixes a crash in the latest version you tried.

            Should be possible to end up automatically with _something_ that boots up correctly.
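The fallback idea above could be sketched roughly like this. It is only an illustration: `BootState`, `choose_boot` and the crash limit of 3 are hypothetical names and values made up for the example, not anything Windows or any other OS actually provides.

```python
from dataclasses import dataclass
from typing import List

CRASH_LIMIT = 3  # assumed threshold: crash, reboot crash, crash again


@dataclass
class BootState:
    consecutive_crashes: int    # failed boots since the last good one
    vendor_versions: List[str]  # OS-vendor-signed boot sets, newest first


def choose_boot(state: BootState) -> str:
    """Pick what to boot next (hypothetical policy, not a real OS API)."""
    if not state.vendor_versions:
        raise ValueError("need at least one vendor-signed boot set")
    if state.consecutive_crashes < CRASH_LIMIT:
        return "full"  # normal boot, third-party drivers included
    # Each crash beyond the limit steps back one vendor-signed version,
    # stopping at the oldest one that is kept around.
    step = state.consecutive_crashes - CRASH_LIMIT
    index = min(step, len(state.vendor_versions) - 1)
    return "vendor-only:" + state.vendor_versions[index]
```

With half a dozen versions kept, the third crash in a row would select the newest vendor-only set, and every further crash would move one version older until the oldest is reached.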

        2. Anonymous Coward

          Re: WTF?

          > Has Microsoft given any thought into auto-booting in safe mode whenever a BSOD happens repeatedly?

          Is "crowdstrike" your safe word? ;-)

      3. Jonathan Richards 1

        Re: WTF?

        Have you considered the advantages that down might offer?

    2. Pascal Monett Silver badge

      Re: WTF?

      People might.

      Companies don't.

      1. Anonymous Coward

        Re: WTF?

        "People might.

        Companies don't."

        With the greatest possible respect, companies don't make decisions, individuals do. Fix that disconnect, and *many* issues will magically disappear.

        1. Doctor Syntax Silver badge

          Re: WTF?

          With the ultimately greatest respect imaginable, people often make decisions collectively, not individually. A collection of individuals is called a company.

          What's more, they may make them by reasoning to meet widely (i.e. more than just the company) accepted criteria. The reasoning might be impeccable. The criteria, however widely accepted may be wrong.

          1. Anonymous Coward

            Re: WTF?

            "A collection of individuals is called a company."

            Sometimes. Sometimes it's called a hockey team. Etc.

            The rot inevitably sets in when the individuals in the collection of individuals start to call themselves a limited company and want to be individually paid lots because they're individually carrying the risk of being in charge. But then they also want their individual liability to be limited.

            Unlimited pay for individuals surely needs to be balanced by unlimited liability for individuals.

    3. That Badger

      Re: WTF?

      The elevator display probably "needs" to show advertisements...

      1. This post has been deleted by its author

      2. Anonymous Coward

        Re: WTF?

        "The elevator display probably "needs" to show advertisements..."

        And the lift doesn't work if you're using an adblocker.

    4. martinusher Silver badge

      Re: WTF?

      But it probably provides the reel of advertisements that many display on their screens along with floor information. No reel, no adverts so got to disable the elevator.

      It's really the same with everything else. Life works fine but once the financial system gets gummed up then everything else is supposed to stop dead because, as we all know, the only essential component of life as we know it is finance.

      1. cheb

        Re: WTF?

        This makes me think of the waiting spaceship on Frogstar World B.

      2. Anonymous Coward

        Re: WTF?

        There are video adverts in lifts now? ARGGGGHHHH

    5. Mark 85

      Re: WTF?

      Exactly, I think a lot of people will be asking the same question or a variation of "Why does an elevator need a computer control?". Besides profit for someone of course? If the computer is just running ads inside the elevator, that's one thing, but control?????

      1. cyberdemon Silver badge

        Re: WTF?

        Oh yes. If you thought "Windows Server" was an oxymoron, check out "Windows Embedded".

        Used by many industrial control systems and even PLCs ("Programmable Logic Controllers", which used to use very basic operating systems, but the likes of Beckhoff have gone for Windows Embedded) across the globe

        1. Doctor Syntax Silver badge

          Re: WTF?

          If it's embedded then embed it very thoroughly. Nobody gets near it to install viruses so no AV, no AV updates.

        2. TimMaher Silver badge
          Windows

          Re: Embedded

          I remember ATMs using OS/2. Agghhh!!!!

      2. PRR Silver badge

        Re: WTF?

        > "Why does an elevator need a computer control?"

        They have always had lots of logic, to monitor safety switches and store and sequence calls.

        Where I worked had a 1969 vintage THREE floor elevator. Down the basement was a full relay rack sporting several dozen electromechanical relays. You could trace the activity by the clicks and clacks. It was damp down there, so every month someone had to power-cycle out of a stuck-loop, and every few years we had to wait for the repair guy to find and replace a bad relay. (Yes, by listening to the clicks.) It was of course wired so any inconsistent state stopped all action. Chopped toes can be very expensive.

      3. John Brown (no body) Silver badge

        Re: WTF?

        "If the computer is just running ads inside the elevator, that's one thing, but control?????"

        Queue management to minimise waiting time, especially in taller buildings. Even the 10-storey building I sometimes have to work in has a lobby with four lifts and even at the very basic level, will call the nearest lift to you based on various parameters. eg if A is going up to floor 3 only and B, C and D are all currently on 10, then A is probably closest in time and space. They will also "settle" on different floors when all are idle, depending on the time of day. eg in the morning, empty lifts will invariably return to ground, at lunchtime or end of day, they'll often wait around the middle, plus or minus a floor or two, with weighting given to the floors most often called for, ie those with more people on them. They also use this to tout green credentials by claiming the lifts use less energy by being more efficient in their routing, although I personally doubt that's significant if it even exists.
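A very rough sketch of the "call the nearest lift" decision described above, assuming the only inputs are each car's current floor and the one floor it is already committed to. `Lift`, `travel_cost` and `pick_lift` are made-up names for illustration; a real dispatcher weighs far more (door times, load, time of day) and may stop for a caller on the way.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lift:
    name: str
    floor: int                    # where the car is now
    target: Optional[int] = None  # floor it is already heading to, if any


def travel_cost(lift: Lift, call_floor: int) -> int:
    """Floors the car travels before reaching the caller (assumed metric)."""
    if lift.target is None:
        return abs(lift.floor - call_floor)
    # Simplification: finish the current trip first, then go to the caller.
    return abs(lift.floor - lift.target) + abs(lift.target - call_floor)


def pick_lift(lifts: List[Lift], call_floor: int) -> Lift:
    """Choose the car with the lowest estimated travel cost."""
    return min(lifts, key=lambda lift: travel_cost(lift, call_floor))
```

For the example in the comment, a caller on the ground floor with A at floor 1 heading to 3 and B, C and D parked on 10 gives A a cost of 5 floors against 10 for the others, so A gets the call.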

        1. Anonymous Coward

          Re: WTF?

          I work in the City, 22 Bishopsgate.

          I'm on a lower floor, building has 63.

          There are several banks of lifts, and once you scan in through the barriers (bluetooth from your mobile), request your floor from one of several tablet size screens, you get directed to a lift that will serve your (and some other) floors.

          It occurred to me that anywhere in the comms to get to my floor, CS could potentially bugger it up.

          Messaging Services? Possibly? Just the simple 'select your floor'? Just getting through security? The lift logic?

        2. Doctor Syntax Silver badge

          Re: WTF?

          "Queue management to minimise waiting time, especially in taller buildings."

          It still doesn't require a regular connection to the outside world. Even if it occasionally needs an external connection for servicing then just connect externally. And don't use a desktop operating system with all the tranklements that come with a desktop operating system.

      4. gnasher729 Silver badge

        Re: WTF?

        “Why does an elevator need a computer control?”

        A single elevator just receives a command telling it which floor to move to; it does that, opens the door, and doesn’t close it as long as the door sensors find an obstacle. Then it closes the door and waits for the next command. You could probably do all that in hardware, but a very cheap processor is probably cheaper.

        If you have half a dozen lifts, not all starting and ending at the same place, enough traffic that you want maximum efficiency, you would want a computer sending commands to all of them. Now that computer has little reason to accept input from the outside.
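The single-car behaviour described above is small enough to sketch in a few lines. This is only an illustration with a hypothetical `serve_call` function and a list of door-sensor readings standing in for real hardware; note that nothing in it needs any input from outside the building.

```python
from typing import Iterable, List


def serve_call(current_floor: int, target_floor: int,
               door_sensor_readings: Iterable[bool]) -> List[str]:
    """Return the actions one car takes for one call.

    door_sensor_readings yields True while something blocks the door.
    """
    actions = []
    if current_floor != target_floor:
        actions.append(f"move {current_floor}->{target_floor}")
    actions.append("open door")
    for obstructed in door_sensor_readings:
        if not obstructed:
            actions.append("close door")
            break
        actions.append("hold door")
    # If the readings run out without a clear one, the door stays open.
    return actions
```

The door only closes after an explicit "clear" reading, so a missing or stuck sensor leaves the car sitting with its doors open rather than closing on someone.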

    6. Mark 85

      Re: WTF?

      I do have to wonder how many C-Suite types will get on an elevator that will shoot them straight down to the sub-basement which is now filled with water because the sump pump computer controls are dead??? Sounds like a blessing for the BOFHs of the world.

    7. Rafael #872397
      Pint

      Re:"the elevator in their building wasn't working due to the CrowdStrike patchpocalypse. "

      Randall Munroe can explain.

      I canceled three meetings today because of CrowdStrike. None of them were computer-based, -mediated or even -related.

    8. Anonymous Coward

      Re: WTF?

      So... what about all the NHS problems we've had?

      Does that mean that the NHS tech people responsible for letting a third party company bork their systems are even more culpable than first thought?

    9. Anonymous Coward

      Re: WTF?

      ""the elevator in their building wasn't working due to the CrowdStrike patchpocalypse. "

      People do know that Windows isn't certified for safety of life applications, don't they?

      What people know, and what managers care about, are spelt differently.

      There are specific requirements in the EU and UK for passenger lift safety. Part of the LOLER stuff if I remember.

      Stuff typically needs to have regular independent inspection (in the case I know of it's every 6 months, non-firefighting lift updated in 2000 or so in a ten storey block of flats containing many at-risk people and no alternate lift).

      There is a UK/EU legal requirement for a functioning two-way comms system from the lift to outside, in case of emergency.

      In the lifts in question, it used to be provided by something called a "telephone" in the lift car, over POTS lines.

      In the interests of continuous product and service improvement, that "telephone" stopped working a few months ago. It may or may not be related to BT's disastrously planned and managed transition from copper to VoIP, which players involved should be well aware of.

      Who's responsible/accountable in this picture?

      The copper lines in the area still work for this kind of thing.

      The property management company know their (in) actions are criminal.

      What's a poverty-stricken lift techy supposed to do? Speak up and get sacked because others aren't doing their job right? Please don't suggest "report it to HSE".

      1. Doctor Syntax Silver badge

        Re: WTF?

        Presumably the independent inspection, assuming it's independent and an inspection, will pick up the phone problem in due course. If, as implied, you are that lift techy, report it in your own organisation with a paper trail to cover yourself when the inspection fails the lift. Alternatively report it to HSE and/or the fire service anonymously.

  4. John_Ericsson

    So where do we go from here?

    What is going to be the flavor of the day for next year's auditors when they prod our resilience policies? What are they going to be looking for?

    1. Jamie Jones Silver badge

      Accountability.

      Doctors need years of training and certification. So do engineers. You need to pass a driving test to drive, but anyone can program critical systems (and not to put all fault on the programmer - there are processes and procedures that are not being followed).

      It's long past time that IT products were treated with the seriousness they deserve.

      It has to be done by regulation. If you were getting a large bridge or building built, you wouldn't give the job to the cheapest bidder with no credentials.

      We need to stamp out cowboy programmers, cowboy designers, cowboy implementers, and as for "agile" - it should be illegal to use for anything other than computer games.

      If I was an evil state actor, I wouldn't be looking at attacking a company's defences, I'd be looking to infiltrate one of the many third party software providers they blindly trust to root their systems.

      Another thought... I bet some compromised systems are ones which need security clearance to access... Yet another complete ballsup.

      1. Anonymous Coward

        "It has to be done by regulation. If you were getting a large bridge or building built,"

        In the UK, it doesn't seem to matter how many ridiculous foulups get made, the same mistakes are repeated.

        " I'd be looking to infiltrate one of the many third party software providers they blindly trust to root their systems. "

        That's well known, isn't it often called a "supply chain" attack?

        There's a bit more to security and robustness than many people seem willing to acknowledge.

        1. Jamie Jones Silver badge
          Thumb Up

          " I'd be looking to infiltrate one of the many third party software providers they blindly trust to root their systems. "

          That's well known, isn't it often called a "supply chain" attack?

          Yeah, sorry, I wasn't clear. Whilst they exist already, I think something like this event shows they are much more effective than (I assumed) people thought, and as they so easily got through defence systems, then it would be a "good" idea for the baddies to focus more effort on them.

          Before this, I'd never thought that hospital operations and plane flights could be cancelled by some unvetted, unrelated third party foreign company's actions.

  5. Jim Willsher

    Still amazes me that MS allows a third-party driver to simply pull the system down and stop it getting back up, with no soft landing to allow remediation (aside from safe boot etc). Surely Windows should still start but with a huge warning that "service X couldn't start".

    Windows is too fragile (no shit, Sherlock) such that it can be crippled by a third party component - although MS are pretty capable of doing that themselves.

    1. Mike 137 Silver badge

      'Surely Windows should still start but with a huge warning that "service X couldn't start" '

      On the other hand, suppose it started but then triggered some stupid memory error (buffer overrun, use after free &c.). As AV connects deep down to, or near, the bare metal such an error could well cause random execution at the lowest level of code. I've always favoured remote proxy services for malware protection, not least as these can't compromise your systems -- they either work or they disappear (when they can be sidestepped temporarily if necessary) and they are also at best much more powerful at the job (e.g. by performing live testing of suspect stuff in disposable VMs).

      1. Jamie Jones Silver badge

        "If machine in boot loop, boot to a safe mode which will allow remote-config access (to predefined credentials only)" seems to be far less destructive, yet still a much better option.

    2. Richard 12 Silver badge
      Mushroom

      It's kernel-mode

      Any operating system that supports kernel-mode modules can be taken down irretrievably by those modules.

      By definition, a kernel-mode module can do anything the kernel can. And thus it can also stop the kernel doing anything the kernel might be designed to do.

      This is why Microsoft have a special commercial route for companies wishing to create such modules, and Torvalds rules with a fist of iron.

      Neither approach is foolproof. Everyone really relies on proper testing, which CrowdStrike clearly could not be arsed to bother with.

      1. Jonathan Richards 1

        Re: It's kernel-mode

        > Everyone really relies on proper testing, which CrowdStrike clearly could not be arsed to bother with.

        This. I am old, and naive, so I don't believe that CrowdStrike could possibly have released this channel file[1] without some testing, and yet, according to a Microsoft estimate, 8.5 million machines may have been affected. Sounds like a lot; isn't a huge proportion of the total count of Windows boxes, so I suppose the disproportionate effects are because too many big businesses are big enough to need/want the protections that Crowdstrike offers.

        So, what makes one or a few percent of machines vulnerable to the update, when others are not? Differences in the kernel versions being used?

        [1]I have never met a 'channel file'. From what I can infer, it's something that configures a kernel module that runs in Ring 0. Sounds risky.

        1. Doctor Syntax Silver badge

          Re: It's kernel-mode

          I'm old and cynical so am (a) inclined to ask for evidence of that "only" 8.5 million and (b) inclined to believe, on the evidence of the file crashing their own S/W, that they released both the file and, previously, the S/W that uses it without adequate testing.

      2. jbruner

        Re: It's kernel-mode

        Yep, that's it exactly. Apple for their part deprecated kernel space drivers in 2019 and will stop signing them at the end of 2024 (IIRC). Folks can give Apple all the crap they want about these kinds of practices and bemoan "planned obsolescence" but they've managed to keep things moving forward without being weighed down by past baggage. In 2019, 32-bit also went away with macOS Catalina, which is admittedly sad for all the great 32 bit Steam games but that's what VMs are for and it also frees them from the 2038 date problem coming down the road for 32 bit systems. MS either can't help but be the "nice guy" and allows all manner of ancient cruft to remain or it's that they are bowing to clamor of customers who want infinite backward compatibility which is pretty untenable. I mean for goodness sake: How are drive letters still a thing?!

        https://support.apple.com/guide/deployment/system-and-kernel-extensions-in-macos-depa5fb8376f/web

        "macOS 10.15 or later enables developers to extend the capabilities of macOS by installing and managing system extensions that run in user space rather than at the kernel level. By running in user space, system extensions increase the stability and security of macOS. Even though kexts inherently have full access to the entire operating system, extensions running in user space are granted only the privileges necessary to perform their specified function."

        ...

        "Important: Kexts are no longer recommended for macOS. Kexts risk the integrity and reliability of the operating system. Users should prefer solutions that don’t require extending the kernel and use system extensions instead."

  6. Andy Non Silver badge

    Chaos at the Docs today

    My GP had problems issuing prescriptions today, finally got a "sort of" prescription to take to a local pharmacy, but they were unable to confirm its validity due to systems down. They ended up phoning docs to check. Turns out my normal online pharmacy has now also double issued it too, which I understand from the local pharmacist has left them technically breaking the law. All fun and games.

  7. gnasher729 Silver badge

    Staggered releases?

    I worked at a place writing software that could be quite essential for some countries. So we did releases timed at Monday, 7am in one country two hours ahead of us. The idea was that early birds and auto-updaters would get the changes at 9am their time, two hours later they had plenty of time to find serious problems and talk to support, and at 9am we had all the information and could start fixing things. Worldwide release was on the next day.

    I learned that once in a while you can call Apple and tell them you need an urgent review for something that is on the AppStore broken, and it can get reviewed and on the store within an hour of submitting a new version.
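
    For what it's worth, the staggered schedule described above can be sketched roughly as follows. This is only an illustration: the wave names, shares, soak times and the `failure_rate_for` callback are all made up here, not anything from a real release system.

```python
from dataclasses import dataclass


@dataclass
class Wave:
    name: str        # hypothetical label for a group of customers
    share: float     # fraction of the install base in this wave
    soak_hours: int  # how long to watch for problems before moving on


# Hypothetical schedule: one early country first, everyone else a day later.
WAVES = [
    Wave("early-country", 0.05, 24),
    Wave("worldwide", 0.95, 0),
]


def run_rollout(waves, failure_rate_for, max_failure_rate=0.01):
    """Release wave by wave; stop as soon as a wave looks unhealthy.

    failure_rate_for is a caller-supplied function (an assumption here)
    that returns the observed failure rate for a wave after its soak time.
    Returns the list of wave names that were actually released.
    """
    released = []
    for wave in waves:
        released.append(wave.name)
        rate = failure_rate_for(wave)
        if rate > max_failure_rate:
            # Halt: later waves never receive the bad release.
            break
    return released
```

    The point of the design is simply that a bad release stops at the first wave, so the blast radius is the early group rather than the whole world.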

    1. Cheshire Cat

      Re: Staggered releases?

      The problem is that AV pattern updates are very very time-dependent, which is why they have 15min or lower update cycles.

      This issue was a pre-existing bug in the code which was tickled by a subsequent pattern update. While the code updates can be staggered, the pattern updates have to be quick, so cannot be delayed by a month, or even a day.

      This is why extensive pre-release automated testing should have been performed, and clearly was not (at least, not sufficiently)

      1. Doctor Syntax Silver badge

        Re: Staggered releases?

        It turns out that not only is time critical, not crashing the customers' computers is also critical. Who knew?

  8. naive

    If God would have been an IT guy

    We would have been extinct from plague, or any other pandemic preceding it, due to similar genes, offering a great attack surface for the plague microbes.

    Diversity in IT landscape costs a bit more, but like human genes, a single attack vector won't wipe it completely out in a short period of time.

    Maybe Resilience will become a factor in IT security as well, until now the scope of IT security is very much focused on "keeping bad actors out".

    1. Anonymous Coward
      Anonymous Coward

      Re: If God would have been an IT guy

      > until now the scope of IT security is very much focused on "keeping bad actors out".

      Well they haven't been doing a good job of it, Steven Segal films are still available on multiple streaming platforms! :-)

  9. Anonymous Coward
    Anonymous Coward

    SECURE NO BOOT

    Sorry boss. My car had a BSOD today.

  10. Anonymous Coward
    Anonymous Coward

    What the experts are saying

    “Such bad, many downtime.”

    -Doge, IT Security Spiritual Analyst

    “With the security cameras off-line, I was able to loot an entire pie… with sausage”

    -Pizza Rat, Survivalist NYC

    “Today’s forecast: Grumpy admins with coffee storms into the late evening.”

    -Grumpy Cat, IT Short Stock Analyst

  11. KittenHuffer Silver badge

    "In other words, it's going to be an interesting Friday."

    May you live in interesting times! - Chinese curse

  12. JRS
    Mushroom

    Major crisis averted - just

    Local Wetherspoons was cash only this morning.

    Was severely concerned I'd miss my weekly breakfast.

    Fortunately one of the team had enough for coffees, then an acquaintance lent us some cash.

    Normality resumed - apart from a small debt to someone we don't see very often.

    1. Anonymous Coward
      Anonymous Coward

      Re: Major crisis averted - just

      And you weren't carrying cash WHY?

      This cashless nonsense needs to die. ALWAYS carry enough cash to get through a normal day if your plastic stops working. Because IT WILL STOP WORKING.

      Don't bother trying to argue that either. Because it DID stop working.

      1. ITMA Silver badge
        Devil

        Re: Major crisis averted - just

        Or worse - those that rely on their phones for everything including making payments.

        The number of times I, along with others, have been stuck behind someone faffing about with their phone to try and make a contactless payment.

        1. teebie

          Re: Major crisis averted - just

          The phrase to use in that situation is "move it along, future boy"

          I don't understand why you would consolidate the two things that can get you home (money for transport, or a phone for cries for help) into one easily stealable item.

    2. Martin Howe

      Re: Major crisis averted - just

      Same in Cambridge; but it didn't matter, as I was NOT prepared to walk 100 yards to the bar in sweltering heat in an old cinema with no aircon during a heatwave to pay with card or cash unless I absolutely had to :) I just ordered and paid with the app from my seat 8 yards from the breeze coming in the front door :) Lucky PayPal wasn't down. Took a while because their Wi-Fi was patchy; but at the Regal, it always is - no CrowdStrike needed :)

      1. Anonymous Coward
        Anonymous Coward

        Re: Major crisis averted - just

        A movie theater without air conditioning? In 2024?

        They'd have closed by the 1950s here - it was common by the 1930s, people went to the movies during the depression to cool off.

  13. My other car WAS an IAV Stryker

    One way to avoid customers returning merchandise

    Wife tried to return something to big-box retailer JC Penney that her dad had bought (wrong size). Instead of opening [1] they put up a sign: "Due to technical issues this JCPenney location will be closed until further notice." I can only assume they got struck by CrowdStrike; too many retailers gave up their bespoke (or Unix terminal-based) systems of the '80s and early '90s for Windows POS [2] machines, and this is what you get.

    1. She arrived at 10:30, assuming they opened at 10:00 like they have for years, but nope -- 11:00. They must be hurting for workers, customers, or both. They didn't post the sign until it got close to 11:00, naturally, after she had waited all that time.

    2. Point of sale, but the other phrase "piece of s___" works too.

  14. Curious

    What's the bet..

    What's the likelihood that Crowdstrike will announce on Monday that in future they will use AI to prevent this from ever happening again?

    And the stock price will go 'oh, that's better.'

    1. RM Myers
      Unhappy

      Re: What's the bet..

      What's the likelihood that Crowdstrike had already started using AI to help with their patch process?

  15. Howard Sway Silver badge

    Today is one of those days that will go down in history as an unmitigated IT disaster

    Yet it was caused by a problem that has been known about and recurred for decades : letting world+dog just install and overwrite crucial files in C:\Windows\System32. This could have been prevented by developing a proper transactional file system for crucial OS files and drivers, that enabled easy rollbacks to a last known good state, and would have saved the world economy billions by making malware and dodgy software updates a simple problem to be recovered from. This is why Microsoft shares a great deal of the blame for what happened today.

    1. Ace2 Silver badge

      Re: Today is one of those days that will go down in history as an unmitigated IT disaster

      When you install a broken driver in ESXi, and then reboot, it pink-screens. Then when you reset it, it boots again, with the bad driver quarantined!

      The first time it did that to me when I was developing one, it was like Christmas. No boot loops! No safe mode!
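
      As a rough illustration of that quarantine-on-failed-boot idea (not ESXi's actual mechanism, which I haven't seen the inside of), the decision could look something like this. The driver names, the failure counter and the `max_failures` threshold are all hypothetical.

```python
def next_boot_plan(drivers, failed_boots, suspect, max_failures=1):
    """Decide which drivers to load on the next boot.

    drivers       -- names of drivers normally loaded (hypothetical)
    failed_boots  -- consecutive boots that crashed before completing
    suspect       -- the most recently installed or updated driver, or None
    Returns (drivers_to_load, quarantined).
    """
    if suspect is not None and failed_boots >= max_failures:
        # Skip the newest driver rather than looping on the same crash.
        to_load = [d for d in drivers if d != suspect]
        return to_load, [suspect]
    return list(drivers), []
```

      This assumes the loader keeps a crash counter and a record of the last change somewhere that survives a reboot, which is the part that actually takes engineering.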

  16. Anonymous Coward
    Anonymous Coward

    One of the 43. It's not going well.

    11,000 endpoints down. Thank Cthulhu that I am not on any of the teams that have to deal with this.

  17. frankyunderwood123

    Microsoft to blame, surely?

    How are Microsoft seemingly getting off scot-free here?

    The BSOD issue is _obviously_ shonky coding - seriously it is.

    To take down an entire OS due to a 3rd party service failing is cowboy coding circus clown territory.

    "It's fine, CrowdStrike _never_ fail, we don't need no graceful failure checks, screw error handling"

    Perhaps I'm failing to understand the issue here, or perhaps the world has gone mad.

    If you cannot code defensively to ensure third party services don't take down your product when they fail, you have no place as a software engineer or a software company.

    1. frankyunderwood123

      Re: Microsoft to blame, surely?

      And a thumbs-down.

      Seriously?

      Having watched all of the media showing BSOD all over the world and you thumbs-down a post saying Microsoft are to blame?

      Of course they are, an entire Operating system tanked because a souped up Anti-Virus service released some bad code.

      1. munnoch Bronze badge

        Re: Microsoft to blame, surely?

        Yeah, there’s a lot of Microsoft “professionals” on here seemingly taking offence at the saner minds poking them and pointing out that perhaps if MS just fucked off and died the world would be a better place.

    2. Doctor Syntax Silver badge

      Re: Microsoft to blame, surely?

      "If you cannot code defensively to ensure third party services don't take down your product when they fail, you have no place as a software engineer or a software company."

      Likewise if you can't code defensively to ensure your product isn't taken down by your own badly formatted data file.
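
      To make "code defensively against your own data file" concrete, here is a minimal sketch. The file layout (a 4-byte signature, a record count, then fixed-size records) is invented purely for illustration and has nothing to do with CrowdStrike's real format; the idea is just that a malformed file is rejected and the last good rule set stays in use.

```python
import struct

MAGIC = b"CHNL"                 # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sI")  # signature, record count
RECORD = struct.Struct("<II")   # hypothetical (rule_id, action) pair


def parse_rules(blob: bytes):
    """Return a list of (rule_id, action) tuples, or raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad signature")
    expected = HEADER.size + count * RECORD.size
    if len(blob) != expected:
        raise ValueError("record count does not match file length")
    return [RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
            for i in range(count)]


def load_rules(new_blob: bytes, last_good):
    """Use the new rules if they parse; otherwise keep the last good set."""
    try:
        return parse_rules(new_blob)
    except ValueError:
        return last_good
```

      Every length is checked before anything is read, so a truncated or garbage file costs you one update rather than the machine.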

      1. alisonken1

        Re: Microsoft to blame, surely?

        Actually, it was not just "a data file" - it was a kernel driver (filename ends in .sys).

        So yeah, it was embedded in the kernel.

        No, I don't know why a virus definition file would need to be embedded in the kernel.

        1. frankyunderwood123

          Re: Microsoft to blame, surely?

          > No, I don't know why a virus definition file would need to be embedded in the kernel.

          Well, exactly, Microsoft allow this kind of insanity.

          So many people are coming to the defence of microsoft with this excuse.

          "But yeah, it failed because it's a kernel driver"

          As if it is somehow a good idea to allow a third party to add and update a driver in the frikkin kernel on your operating system without you actually bothering to add systems which check it's not going to screw everything up. I don't care how damn complicated or expensive or time consuming it may be to have end-to-end tests checking this shit, even if it takes 24 hours to run them.

          It's better than a global outage that will result in billions of lost revenue.

          This is ABSOLUTELY the fault of Microsoft, because it's a design flaw in the OS, or rather, it's a design decision that is bat shit crazy.

          "Oh, sure, yeah, we trust loads of third party companies to shove kernel driver updates into automatic windows updates, without us checking. What could possibly go wrong?"

          "Uh, end to end tests for any updated kernel drivers? No, we can't do that, we have thousands of vendors pushing code. What could possibly go wrong?"

          I'm not saying other OS's are immune to this - of course they aren't - but you sure hope there's a level of sanity that prevents a third party from updating your OS at a kernel level without you having any oversight of it.

          FFS!

        2. Doctor Syntax Silver badge

          Re: Microsoft to blame, surely?

          Every report I've read says it was a data file to be read by an executable. I doubt the .sys suffix means anything very much. My Windows days are long gone but memory says config.sys was a data file.

    3. gnasher729 Silver badge

      Re: Microsoft to blame, surely?

      “ To take down an entire OS due to a 3rd party service failing is cowboy coding circus clown territory.”

      This isn’t any service. This is a service that is supposed to detect attacks against your machine and in the worst possible case shut the machine down.

      Think of “big red emergency button” whose purpose it is to shut down machinery in the worst possible case. You can’t then say “the machinery should have kept running even if the big red button was pressed”.

      Windows behaved exactly as it should.

      1. Anonymous Coward
        Anonymous Coward

        Re: Microsoft to blame, surely?

        No, it didn't. That wasn't a *controlled* shutdown. It didn't automatically put the machine into a recovery console on failed reboot, with a command prompt waiting for a password and commands, or a menu like the old 'safe mode / safe mode with networking / last known good'. Techies couldn't just tell a user a password to enter and then commands to delete a file, as apparently BitLocker was involved in many cases; again, if the recovery console can't ask for the BitLocker password then it's next to useless. See the comment about ESXi. And the one about single-user OS.

        And "Couldn't that processing power instead be put into use to set the machine into some secure network listening mode so that remote technicians can do what's needed?" is spot on. For that matter, as these are business PCs, surely this is the use case that IME (despite the hate for it) was actually invented for, so why wasn't it provisioned on business machines?

        There's no perfect solution to these issues, but properly written OSs can do a lot better.

        AC due to the ill-informed hate expected from people who don't know what they're talking about and just regurgitate whatever the gutter press says.

  18. Steve Graham

    OK, so the OS has failed to boot. But note that it has the residual cognitive capacity to paint up the BSOD with some text. Couldn't that processing power instead be put into use to set the machine into some secure network listening mode so that remote technicians can do what's needed?

    Answer: Windows was written as a single-user, standalone system, and still carries the consequences.

    1. munnoch Bronze badge

      You need an executive layer built into the hardware that lets you take over from the OS, like an admin console.

      # send break

      ok # boot -s

      And if that failed SunOS actually had a mini-version of itself hidden in the boot block that you could spark up (see what I did there?) and fix the main install. I once did that from the other side of the world when my home server shit itself (leaving the install CD in the drive helped a lot...). But hey that was an architecture designed by engineers for engineers, that'd never be acceptable nowadays.

    2. Excused Boots Bronze badge

      “ Answer: Windows was written as a single-user, standalone system, and still carries the consequences.”

      Ding..ding..ding, and we have a winner!

      Exactly, look I know I have said this before, but I think it’s worth repeating, Windows’ greatest strength is its ability to run 40 year old legacy software; and its greatest weakness is its ability to run 40 year old legacy software!

      Now I am fully aware that there will be a number of commentators on here who will simply roll out the tired old ‘just install Linux’ mantra. And, yes, I get it, I promise you I get it, I am as far from being an MS fanboy as you can imagine, but realistically, sorry but it just isn’t going to happen!

      Like it or not, Windows is the dominant OS in the modern world - sorry, but it really is.

      But it needs rebuilding, it needs to be rearchitected - all of the extraneous stuff is ripped out, it runs x64 applications only, in sandboxed environments and yes, I get that this cuts off any third-party AV or malware products, at the knees. And also, I get that were MS to do this and release, ‘Windows X’ which does this; then howls of protest about “my scanner that I’ve had working perfectly since 1980 now doesn’t….how dare Microsoft….it’s a conspiracy to make me buy more stuff etc,” will be something you won’t believe!

      So MS are sort of stuck between a rock and a hard place here. So what should they do?

      I think most IT professionals will agree on what they ‘should’ do, I just don’t think that Microsoft’s current management have the ‘cojones’ to execute on it - easier to just kick the can down the road and leave it to the next CEO to deal with.

      Cynic? Me? Absolutely not!

      It needs a full rewrite

      1. Boris the Cockroach Silver badge

        Quote

        "Like it or not, Windows is the dominant OS in the modern world - sorry, but it really is."

        And we all know how it got there

        Bribery, corruption and taking over IBM's spot as "No one is fired for buying m$"

        plus nearly every business uses orafice 365 (or version thereof)

        It's the monoculture that's the problem... as happened during the Irish potato famine, or more recently the banana thing where 1 variety got killed by a fungus? virus?

        Plus the system not having any resilience to something going wrong in the boot process and not being able to boot from the last known 'good' position

        To solve the problem m$ should rework windows.... any legacy stuff gets spun up in its own VM with data just piped in from the OS and if it crashes and burns the OS just deletes the VM and tries again.

        its not like we dont have enough resources in our computers.(lets face it 99% of all office work could take place on a WinXp machine with a 1 gig HDD and 250meg of RAM and no one would even notice..)

        1. alisonken1

          "its not like we dont have enough resources in our computers.(lets face it 99% of all office work could take place on a WinXp machine with a 1 gig HDD and 250meg of RAM and no one would even notice..)"

          I was kinda with you until you hit the 250M of ram. Unless office workers would be using WinXP embedded and not have things like browsers or office suites, I might agree.

      2. John Brown (no body) Silver badge

        "It needs a full rewrite"

        Or, at the very least, MS spend their time, money and resources on security and stability instead of futzing around with more and more spying on users telemetry, more and more advertising on the start menu and irrelevant notifications, tarting up the GUI yet again with more porcine lipstick, trying to force us to "upgrade" to 11 with more dark patterns even on enterprise installs and trying to convince the world that CoPilot is a good thing.

    3. Androgynous Cupboard Silver badge

      “secure network listening mode” - funny.

  19. Tron Silver badge

    You should always have a Plan B.

    Cash, paper forms, whatever you need so that you can run your business or life when your tech goes down. Basic resilience.

    This is what SAGE should be sorting out in the UK for stuff like the NHS, but all they do is whine about China whenever the USG prod them to.

    And by now, Windows should be robust enough not to trip over anything that is thrown at it, particularly when booting.

    Program x has an issue. Your OS will continue to load without it. Program X will not be working. etc. How come their OS is still not able to detect and pull out of a loop in 2024?

    If they had stopped at Win 7 and spent the entire time since making it more secure and resilient, we would all love them. Instead they increase the complexity, lose the resilience and add pointless AI crap.

    1. Anonymous Coward
      Anonymous Coward

      Re: You should always have a Plan B.

      Depends on the business.

      Healthcare? Better find a way to keep going.

      Office work? Give everybody a day or two off. Whatever you're doing probably wasn't important anyway. And when that's over, give the IT folks a few days off, because they're gonna need it.

  20. jeremya

    The fault is actually systems administrators

    Any systems administrator who allows automatic patches to his network hardware is fundamentally incompetent.

    They are even more fundamentally incompetent if they have no disaster recovery and rollback plan.

    This is one of the instances where running most of your systems virtually has a massive advantage in that a rollback can be done with a few clicks - assuming the rollout process has snapshots before any patching.

    I live in hope that there will be a lot of firings of sysadmins. The sad reality is they will all be feted as heroes for recovering the borked systems.

    1. OhForF' Silver badge
      Alert

      Re: The fault is actually systems administrators

      s/systems administrator/CISO/

      In any organization bigger than a SME the sysadmins that now have to fix devices stuck in a boot loop are not those that have the power to veto automatic patching of the "endpoint security solution" mandated by the cyber security staff. You should hope for the CISO and his staff having to answer what the benefit of their chosen security solution is if it disrupts business as much as a cyber attack and forces system admins to execute the disaster recovery plan.

    2. Anonymous Coward
      Anonymous Coward

      Re: The fault is actually systems administrators

      So allow security patches and systems go down - Sack the sys admin

      New zero day and network gets owned - Sack the sys admin

      Which is it going to be? You can't have both on the sorts of networks and systems this has hit. You can do some testing, and for major updates you do, but not for trivial updates. Can you imagine the resource required to test every single update for every single piece of software on every configuration of hardware and software? You need to have some trust that some vendors aren't going to screw up your systems. You need to be able to trust some security updates and security patches because they are time sensitive and the time spent testing leaves you open to attack.

      1. Doctor Syntax Silver badge

        Re: The fault is actually systems administrators

        "You need to have some trust that some vendors aren't going to screw up your systems."

        People did. This happened.

  21. Anonymous Coward
    Anonymous Coward

    By coincidence

    I was waiting for a significant bank transfer on Friday.

    Proceeds from a house sale.

    Right in the prime of the IT issue.

    But, it arrived safely early Saturday morning.

    Praise be to the FSM.

  22. Quiller-Nine

    Gobbling profits in Turkey

    Me and the family got stuck at Istanbul airport, our flight back to Sweden got cancelled and we ended up having to book a hotel at the airport. I started looking for hotels when it became clear that we weren’t gonna fly out on Friday and the prices were around €200 per night, within two hours - the amount of time we were stuck arguing with airport/airline bureaucrats - the price went up to €500 for the fucking night.

    1. Doctor Syntax Silver badge

      Re: Gobbling profits in Turkey

      The €200 deals were snapped up by those who didn't waste time arguing.

  23. harmjschoonhoven
    Facepalm

    But, but,

    How many of the 8.5 million+ borked devices were open to the internet while that was avoidable?

    And how many needed connectivity only because a stable stand-alone Windows® OS is an oxymoron?
