Angry admins share the CrowdStrike outage experience

IT administrators are struggling to deal with the ongoing fallout from the faulty CrowdStrike file update. One spoke to The Register to share what it is like at the coalface. Speaking on condition of anonymity, the administrator, who is responsible for a fleet of devices, many of which are used within warehouses, told us: "It …

  1. Julian Poyntz

    Holidays

    Still, at least it is the summer holidays in the UK now, so lots of parents off to look after the kids - less urgency to sort their kit out.

    1. hoola Silver badge

      Re: Holidays

      It is affecting global payment systems, being on holiday is irrelevant.

      Any company that uses CrowdStrike is affected. That includes anyone relying on a SaaS service where that provider uses it.

      1. Julian Poyntz

        Re: Holidays

        Yes, but servers could/should have snapshots so you can revert, or admins may be able to get access for manual intervention (appreciate not always). Some servers will require care. This is where a company, you would hope, would be directing most of their resource to right now.

        EUC is a big problem, especially with WFH. In the old days, if this happened we could walk into a floor of users, hit a few banks of desks at the same time and get through it a lot quicker. Can't do that over the phone, and with offices now utilising a lot of hot desks, you may go to 2 banks on Monday morning and find 20 non-working users, but on Tuesday it may be a scattering of 3 over the same 2 banks - so you are almost doing a 1-to-1 fix, not en masse.

        As a lot of people will now be taking time off, there is going to be less stress for the IT teams, as quite a few of their users will be away, compared to everyone being in, or coming in, and demanding to be fixed NOW.

        1. Paul Hovnanian Silver badge

          Re: Holidays

          "as quite a few of their users will be away"

          So IT support will be getting calls for a few weeks when WFH users return and find that their PC has been frozen while on vacation. The lucky users shut the thing off and will miss the FUBAR ClownStrike update altogether.

          It could have been worse. In the old days of CRTs, users could return to a BSOD burned in to the phosphor.

          1. handle handle

            Re: Holidays

            If the computer was turned off while on holiday, the faulty update would be skipped! Another good reason to unplug during vacation.

            1. Munchausen's proxy
              Pint

              Re: Holidays

              > If the computer was turned off while on holiday, the faulty update would be skipped! Another good reason to unplug during vacation.

              It seems to me to be another very good reason to not invite strangers to manage your critical infrastructure over a network.

            2. Dimmer Silver badge

              Re: Holidays

              I run a script, triggered from a scheduled task on logoff, screen lock or RDP disconnect, that changes the gateway so my system stays only on the local net when not in use.

              When I log back on, another script puts the gateway back.

              Anything I want my system to connect to while logged off is provided by adding the specific route to the script.

              Maybe I was not so paranoid after all.
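
              A minimal sketch of what such a pair of scripts might run, assuming a Windows machine and the standard `route` command. The function names, the gateway address and the allowed host are hypothetical, and dropping the default route is only one way of "changing the gateway". The functions only build the command lines, they do not execute anything:

```python
from typing import List


def isolate_commands(allowed_hosts: List[str], gateway: str) -> List[List[str]]:
    """Commands for logoff/lock: drop the default route, keep selected hosts reachable."""
    commands = [["route", "delete", "0.0.0.0"]]
    for host in allowed_hosts:
        # A host route (mask 255.255.255.255) via the normal gateway.
        commands.append(["route", "add", host, "mask", "255.255.255.255", gateway])
    return commands


def restore_commands(allowed_hosts: List[str], gateway: str) -> List[List[str]]:
    """Commands for logon: remove the host routes and put the default route back."""
    commands = [["route", "delete", host] for host in allowed_hosts]
    commands.append(["route", "add", "0.0.0.0", "mask", "0.0.0.0", gateway])
    return commands
```

              Each command list could be handed to `subprocess.run` from an elevated scheduled task; that wiring is left out here because it depends on how the task is set up.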

      2. NoneSuch Silver badge

        Re: Holidays

        Am seeing a lot of 'holier than thou' posts on Twitter from Mac and Linux users castigating anyone with a Windows PC and laughing.

        This sort of thing can happen to anyone and it's only a matter of time before it happens to you, regardless of what flavor OS you chose.

        Stop being a d*ck.

        1. Anonymous Coward
          Anonymous Coward

          Re: Holidays

          Try running Checkpoint on Ubuntu. Periodic desktop freezes, and an unbootable machine after an update. Only fixable as a) I'm a dev, and can get into grub to boot an old kernel, and b) drive is encrypted with LUKS, so I just needed the normal unlock password to mount the root partition to enable the grub menu.

          1. Anonymous Coward
            Anonymous Coward

            Re: Holidays

            We have 1 client running crowdshit, their production systems have been down all day.

            The client was extremely happy with their IT and had no complaints about any of their computers. They were forced to accept another IT team taking over due to corporate politics.

            The other IT team forced us to remove the Microsoft antivirus and install crowdshit. Users immediately started complaining that their computers were constantly crashing and grinding to a halt. The other IT team confirmed that it was a known issue with crowdshit that they have with all of their systems. Their solution was "when your computer gets so slow it is unusable, reboot".

            They literally will not give us the codes required to uninstall it, even temporarily for testing, because they know the operations manager will order us to "get that shit off of every computer" and we will happily comply.

            The other IT team fucked up big time with this incident. They messaged the users directly, taking full responsibility for the issue and stating that they were working on a fix. Of course they forgot the part where they are meant to follow up with instructions for how to fix the computers. They also forgot to transfer some of the responsibility on to us by like telling us about it or something.

            We are probably the only IT company who spent today looking at a list of offline computers and laughing our arses off!

            I wish I could listen in on the "why are we STILL down???" phone call the tech-literate manager who knows the fix and what time it was announced will be making on Monday...

            1. ecofeco Silver badge

              Re: Holidays

              I'm seeing this stupid shit far too often. Giving control of all the company systems to an outside vendor and getting rid of the in-house IT staff.

              You cannot fix that kind of fucking stupid.

              1. Anonymous Coward
                Anonymous Coward

                Re: Holidays

                (same poster)

                It's actually an in-house team replacing the outside vendor (us) that used to have "sole jurisdiction" over one part of the business...

                The 2 IT teams issue is due to the migration having been on hold half way through due to "capacity issues" limiting their ability to handle the increased support demands since crowdstrike was installed on the machines. Their IT team is several times bigger than our company.

                I have not personally been involved in that side of things, I discovered the reason they were citing capacity issues for a client that used to send 2-3 support tickets a week only today when the subject came up...

            2. informed

              Re: Holidays

              It affected just one of my clients as well. Their parent company had insisted CrowdStrike was installed. A little bit of serendipity helped them. They're only a small company and I'm a lone IT support engineer. So they don't have automated tools for updates/installs - the users had to install CrowdStrike manually. I've been on extended holidays for 3 weeks and only 1/3rd of them had installed it. If I'd not been off the grid, I'd have been hassling the others to install and therefore the impact would have been bigger.

              Still, they're having to pay me for a call-out on Monday to rebuild one of the desktops, where they just can't get into recovery mode or remember the BitLocker key.

            3. anothercynic Silver badge

              Re: Holidays

              Sounds very familiar...

        2. Rich 2 Silver badge

          Re: Holidays

          While I understand the sentiment of your comment - people shouldn’t gloat about this stuff - I’m afraid (actually I’m not) it is almost ALWAYS Windows that’s fucked up by this sort of thing. It’s NEVER any other OS (well it might be, but the percentage is so low it’s not even statistical noise)

          The fact that anyone feels the need to run stuff like CrowdStrike just to keep the OS up and running is a very long-standing joke - there is something deeply wrong with an OS that needs this stuff to keep it going.

          1. HereIAmJH Silver badge

            Re: Holidays

            > While I understand the sentiment of your comment - people shouldn’t gloat about this stuff - I’m afraid (actually I’m not) it is almost ALWAYS Windows that’s fucked up by this sort of thing. It’s NEVER any other OS (well it might be, but the percentage is so low it’s not even statistical noise)

            I have a system where pacman said "hold my beer"..... Apparently it doesn't have a dependency tree for libraries, to avoid removing ones that other packages rely on? And pacman-static doesn't help when the server no longer finds the NIC. I.e. it knows the hardware is there, but won't assign an interface to it.

          2. tango_uniform

            Re: Holidays

            Ah, the crux of the issue with Crowdstrike...why do you need it at all? Maybe some CxO in Mahogany Row fell for the marketing? Maybe some faceless "cybersecurity" auditor recommended it as best-of-breed? Maybe some insurance underwriter demanded it?

            The way I see it, most of this garbage should already be part of the OS, not some add-on. BUT NOOOO...we have to "let the market determine what's best". If Microsoft hadn't been forced by legal decree to write in hooks for 3rd-party "security products" we'd have a smaller pool of idiots to blame for this kind of cock-up. A pool of one: Microsoft. Certainly, that'd sharpen up people's principles, knowing that trusting Windows means trusting Microsoft, and only Microsoft, with your crown jewels.

            The way it stands now, people will throw whatever 3rd-party product that promises to make up for the lack of Windows security into their infrastructure and sleep well at night knowing that all that money they spent doing so isn't going to Microsoft, but to a vendor that is "smarter" and "more agile". If that were the case, why aren't these vendors PUBLISHING safer, more secure OSes? See the problem?

            I'm not advocating for the demise of Windows but simply pointing out the rather obvious fact that a Zero-Trust Architecture means you don't trust anything, INCLUDING ALL OF YOUR SOFTWARE AND SERVICES VENDORS.

            I'm hoping that this is a wake-up call to those who think that writing a check means that they don't have to think about security anymore.

            1. Anonymous Coward
              Anonymous Coward

              Re: Holidays

              "The way I see it, most of this garbage should already be part of the OS, not some add-on."

              You want the same people who wrote the 'Flaky OS' to also write the 'defense against the arts' software as well !!!???

              They cannot write an OS without so many holes/issues/bugs *but* apparently *can* write the 'defense against the arts' software that protects those holes/issues/bugs.

              If they were that good, why not 'fix' the OS in the first place !!!???

              Going 3rd party is reasonable [Different set of eyes/minds etc] *BUT* do not trust any claims 100%, all software vendors lie.

              Not because they are bad *but* that is the industry we have allowed to grow, we accept these lies everyday, pay good money for them & come back for more !!!

              ---------------------------------------------------------------------------------------

              "I'm not advocating for the demise of Windows but simply pointing out the rather obvious fact that a Zero-Trust Architecture means you don't trust anything, INCLUDING ALL OF YOUR SOFTWARE AND SERVICES VENDORS."

              Correct, you test for everything you can and have protection in depth.

              i.e. Don't trust the claims of *anyone* and have backups/roll-back images/etc to allow you to recover from *failed* recoveries by the 'defense of the arts' software.

              THIS SHOULD BE THE PRIMARY LESSON THAT HAS BEEN TAUGHT BY THIS FIASCO !!!

              :)

              1. RichardBarrell

                Re: Holidays

                Same organisation doesn't mean the same people. Microsoft in particular have REALLY high variance in the quality of the software they ship.

                1. Strahd Ivarius Silver badge
                  Trollface

                  Re: Holidays

                  Does it go from bad to worse?

            2. anothercynic Silver badge

              Re: Holidays

              In the UK: Cyber Essentials compliance.

              1. collinsl Silver badge

                Re: Holidays

                That's a joke in itself. To get CE+ compliance you have to install the latest browser. When we had the scan done we failed initially because Google released a Chrome update the day before the test and we hadn't packaged the replacement up yet. We also had to remove IE from all the machines being scanned because their scan detected it was "out of date" despite the fact that MS had stopped updating it & were forcing everyone onto Chrome.

                But we could pass because a) the testers let us choose which laptops to test and b) we could run the test as many times as we liked until it passed, and then give them the pass results only.

                1. jeremylloyd

                  Re: Holidays

                  Your CE+ assessors clearly didn't have a clue. You have 14 days to install updates from release. The assessor has to choose the test devices. You are not allowed to run the tests - the assessor must do this.

          3. ecofeco Silver badge
            FAIL

            Re: Holidays

            Oh we ARE going to gloat, BECAUSE there is something deeply wrong with an OS that needs this stuff to keep it going and we've been warning about it for years.

            And this WILL keep happening.

            1. Stubbly Dude

              Re: Holidays

              Gloat away, it won't change the penetration of the OS in the corporate world, or even the home one...

              1. teknopaul

                Re: Holidays

                I reckon a few old PCs will get replaced by one or more Android devices.

            2. hoola Silver badge

              Re: Holidays

              Luck not judgment is why other stuff is not affected.

              Other operating systems also need security solutions but there is this cult that believes all non-Windows operating systems are invulnerable. They are not, particularly as the attack vectors are increasingly moving to the applications.

              Complacency just because this group thinks they are somehow invulnerable is what leads to catastrophe. Everyone needs to sharpen up on this and infosec teams need to start listening more to technical teams and not sales people or Gartner.

              Infosec Teams are the cause of some crazy risks because all that matters is ticks in boxes.

              The entire debacle should be a huge wake up call (on top of the other recent attacks) to the tech sector. Sadly that is unlikely to happen, no lessons have been learnt from previous fiascos. That CrowdStrike are likely to escape without being sued out of existence is even more depressing.

            3. Joe W Silver badge

              Re: Holidays

              You will gloat until this becomes part of systemd.

              (and then I will totally gloat, running bsd and Devuan - until stupid stuff hits my systems, but since I no longer run Debian Sid chances are slim)

              1. m4r35n357 Silver badge

                Re: Holidays

                Coz we are all clamouring for more Systemd here!

                Systemd IS M$.

              2. Tridac

                Re: Holidays

                Systemd, the reason I dumped Linux for anything serious. Offends every software engineering principle. If you look at the sources, it's a real mess of a trainwreck. Looked at one module, network.c from memory, and it's a C module thousands of lines long, pulling in upwards of 100 header files. Absolute garbage, and big business depends on this? Originally from a Solaris and VMS environment, when an OS was written by engineers, for engineers. FreeBSD for several years now, and never a serious issue at all.

                When you consider that crowdshite is an enterprise class employee spyware program, looks like karma finally caught up with them. Serves them right, hit hard in the pocket, is the only thing they understand...

          4. Francis King
            Mushroom

            Re: Holidays

            That's not my experience of Linux.

            Twice now updates have nuked my system - replacing a fully functional graphics card driver with one that doesn't work.

            It's not just Windows that is improved to death.

            1. phuzz Silver badge

              Re: Holidays

              About the best I can say about Linux, is that a bad update generally only takes down one specific type of hardware or configuration at a time, and the rest are ok. So at least a bug in the 5.15 kernel's amdgpu driver only took down twenty machines last week, instead of the whole estate. Although I do deserve some of the blame for not testing every possible hardware combo.

              1. Anonymous Coward
                Anonymous Coward

                Re: Holidays

                phuzz,

                Blame is exactly what you should accept !!!

                Why the hell do you roll out something you do not test ???

                You get away with this once/twice maybe more if you are *very* lucky ...

                *BUT* you will be bitten eventually ...

                at the worst time and it *is* ALL your fault.

                Every IT techie I have ever worked with thinks they are the *BEST* and *infallible*.

                I lack that ego and can make mistakes, like everyone else, *but* not because I think I cannot make them !!!

                :)

                1. phuzz Silver badge

                  Re: Holidays

                  I wasn't thinking I was "*BEST* and *infallible*", I was thinking that I've never seen a Linux kernel update (and this was a mainstream release, not a beta/canary/unstable release) fuck a computer hard enough to stop it booting. Turns out kernel devs are just as fallible as the rest of us.

              2. teknopaul

                Re: Holidays

                I guess I'm lucky, I don't remember a Linux update doing any damage within one version. Major version updates are sometimes a headache, but I have never seen a yum/apt security update stop a physical PC from booting.

                1. collinsl Silver badge

                  Re: Holidays

                  Whereas I've seen it (very rarely) on RHEL6 boxes where an improperly installed kernel update prevented the machines from booting - booting into the previous kernel and reinstalling the new one resolved it 100% of the time but that machine is then down until someone intervenes.

          5. Stubbly Dude

            Re: Holidays

            Yea but in case you hadn't noticed Windows is pretty ubiquitous in the world of corporate IT. That isn't going to change any time soon..

            1. lockt-in

              Re: Holidays

              And it’s not going to change until Governments mandate something that enables competition. Sadly there are too many brown envelopes.

              1. Peter2 Silver badge

                Re: Holidays

                This has nothing to do with brown envelopes. Linux fails despite being free because the lack of cost is not enough to offset the missing functionality in software that businesses need; they are therefore willing to buy software that saves them much more money than the purchase or license cost.

                This is a comment I made on this subject literally ten years ago. It's still basically correct today.

          6. StudeJeff Bronze badge

            Re: Holidays

            Naturally it's Windows that gets hit with this kind of thing the most often, the vast majority of companies use it.

            And you're right, this could have happened to any OS that Crowdstrike supported... well, in this case I'm not sure "supports" is the right term!

        3. Headley_Grange Silver badge

          Re: Holidays

          My home PC is a Mac and I'm not one of the annoying fanbois who insist that Macs are safe from everything including a direct hit from a nuclear missile because <insert spurious belief>. I'm well aware that I'm just one lazy click away from days of pain, so I just brought forward my quarterly "off-site" backup (it's in a plastic bag in a jam jar in the shed) and updated the reminder to do it monthly instead of quarterly. I've also made sure that the written copy of the disc encryption key is where I thought it should be and, just to be safe, done a separate backup of my password manager to an encrypted stick and hidden it somewhere. Tomorrow I'll be getting my fallback MBA out of the loft and getting the files up to date.

          1. katrinab Silver badge
            Gimp

            Re: Holidays

            It is a heck of a lot easier to recover from a time machine backup though.

            1. Headley_Grange Silver badge

              Re: Holidays

              My shed backup has a Time Machine backup on it and an rsync copy of my files (photos, docs, music, etc). I have a number of other Time Machine backups running to Apple devices and a NAS and also some stashed removables, but I also run rsync copies as well. This comes from a hard lesson the first time I tried to restore a brand new Mac from Time Machine because the OSes were so far apart that the new Mac required the old Mac to be updated before it continued. I couldn't do this because the old Mac wasn't compatible with the latest OS. I can understand that settings and apps from an older OS might not be compatible but it wouldn't let me bring anything across. I cabled the Macs together and copied across my files but if my old Mac had died completely I would probably have lost data, although I assume there would have been some way to access the Time Machine files. Since then my personal backup policy has been Time Machine dailies plus Time Machine and rsync monthlies to removable drives. I also keep the Mac OS up to date (I had a bee in my bonnet about something in the above situation and refused to update), although I always wait a week after it comes out.

              I'm not slagging Time Machine off - it's easy to set up, reliable and invisible to the user once running and rolling files back to earlier states is easy and the UI is oddly satisfying.

              I'm sure there's a pithy saying for this in IT circles, but if you've never tried to recover from a backup then you can't really be sure it's a backup.

              1. Steve Aubrey

                Re: Holidays

                "You don't test backups - you test restores"

              2. tony

                Re: Holidays

                Reminds me of the time I was asked to look at a small office’s IT system, mid ’90s; they’d been using zip discs to back up documents. Well, 1 zip disc with a year/month/date directory structure, and each one contained a shortcut to My Computer…

              3. FlippingGerman

                Re: Holidays

                I had this issue at work recently as the person most likely to solve computer problems. I actually did manage to convince an old Mac to go to a newer - not current, it’s about 12 years old - version and then restore everything. It required fully wiping and reformatting it, then updating the “to” Mac.

                It was certainly harder than I wanted and even expected, given how easy TM is to set up.

                I’ve never had to do the same to Windows; I don’t imagine that’s great fun either.

              4. W.S.Gosset Silver badge

                Re: Holidays

                You can actually write your own timemachine in shell -- I did this in 1996.

                Basically, it's just an ongoing series of datestamped folders, each with the entire filetree underneath them but hardlinks for unchanged files rather than copies.

                You use find to walk your entire current filetree* and at each node look at the most recent datestamped copy. If it doesn't exist in the prior run, you use cp/mkdir to copy this node into the new filetree; if it's a file and exists but has changed, you copy it into the new filetree. If it exists AND hasn't changed, you use ln to create another catalogue entry for it in the new filetree.

                That's it.

                .

                * (Put some filters on the find for handling OS "special files" if you're doing whole-machine replication.)
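
                A minimal sketch of that walk, written here in Python rather than shell. The function name `snapshot` and its arguments are made up for illustration, it assumes a fresh destination folder on the same filesystem as the previous snapshot, and it only handles regular files and directories (no symlinks or OS special files):

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            current = Path(dirpath) / name
            new = dest / rel / name
            old = previous / rel / name if previous is not None else None
            if old is not None and old.is_file() and filecmp.cmp(current, old, shallow=False):
                # Unchanged: add another directory entry for the old data.
                os.link(old, new)
            else:
                # New or changed: make a real copy.
                shutil.copy2(current, new)
```

                Calling `snapshot(src, Path("2024-07-20"), previous=Path("2024-07-19"))` (hypothetical folder names) would give a second datestamped tree that only costs disk space for the files that changed.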

                1. W.S.Gosset Silver badge

                  Re: Holidays

                  The nice thing about doing it manually is you can then create custom schedules for people with special needs. Eg, graphics/video artists on big projects wanting, say, 30min backups intraday, then 4hourly for yesterday, then daily for a week, then weekly for 2 months, then collapse to more-normal. This becomes a simple case of 30min TMs and a culling script to run daily.

                  I have vague recollections of rsync having an option to do this built-in, too. That is, hard-linking unchanged "copies" rather than re-creating the file as a new copy. So that could be worth looking at if you already have an rsync process set up.
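
                  A rough sketch of the selection half of such a culling script, in Python. The tier table is an assumption loosely following the schedule above, `snapshots_to_keep` is a made-up name, and anything older than the last tier is simply dropped here, which is a simplification of "collapse to more-normal":

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

# (maximum age, bucket width): one snapshot is kept per bucket within each tier.
TIERS: List[Tuple[timedelta, timedelta]] = [
    (timedelta(days=1), timedelta(minutes=30)),  # today: every 30 minutes
    (timedelta(days=2), timedelta(hours=4)),     # yesterday: 4-hourly
    (timedelta(days=7), timedelta(days=1)),      # past week: daily
    (timedelta(days=60), timedelta(weeks=1)),    # past two months: weekly
]


def snapshots_to_keep(stamps: Iterable[datetime], now: datetime) -> List[datetime]:
    """Return the snapshot timestamps that survive the cull, oldest first."""
    kept: Dict[Tuple[timedelta, int], datetime] = {}
    for stamp in sorted(stamps):
        age = now - stamp
        for max_age, width in TIERS:
            if age <= max_age:
                bucket = (width, int(stamp.timestamp() // width.total_seconds()))
                kept[bucket] = stamp  # later stamps overwrite: newest per bucket wins
                break
    return sorted(kept.values())
```

                  Everything not returned would be deleted by the daily job. On the rsync recollection: the option is `--link-dest=DIR`, which hard-links files that are unchanged relative to `DIR` instead of copying them again.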

                  1. phuzz Silver badge

                    Re: Holidays

                    I don't know about other snapshot systems, but Windows let you specify a shadow copy schedule per folder if you wanted to

                  2. collinsl Silver badge

                    Re: Holidays

                    Or you could insert a ZFS filesystem or similar and take snapshots on a schedule. Can be as simple as two disks in a mirror if that's all you need.

                    1. Tridac

                      Re: Holidays

                      Been using zfs right from the early Solaris 10 release. FreeBSD for several years now, which has zfs and lightweight virtualisation (jails) out of the box and looks like it was modelled on the Solaris ideas. Always chose LTS versions for initial install. Once a system is stable, with all the required packages etc, never update anything, but the machines are on a well secured subnet, with any windows rubbish on a separate subnet on its own. Separate hardware interface for each subnet. Never had a virus or successful attack in over two decades now. Would never even consider windows for server work, or anything critical to the business. More trouble than it's worth...

          2. ITMA Silver badge
            Devil

            Re: Holidays

            What?

            No Faraday cage to keep your backups in?

            Shame on you.

            1. Headley_Grange Silver badge

              Re: Holidays

              On the contrary.

              https://www.asgardsss.co.uk/5-x-11-metal-garden-shed

          3. Stubbly Dude

            Re: Holidays

            Yes I can see this strategy working well in a multinational corporation. Do you have enough jam jars to go round?

            1. ITMA Silver badge
              Devil

              Re: Holidays

              Leyden jars.... :)

        4. Anonymous Coward
          Anonymous Coward

          Re: Holidays

          In Linux land, Grub keeps emergency OS Kernels that you can revert to when stuff doesn't work. Windows used to do the same, but "Safe Mode" doesn't work any more...

          1. Solviva

            Re: Holidays

            It's not Grub that keeps old kernels around - Grub simply offers whatever kernels it finds during grub-mkconfig. It's your distribution that keeps a (typically) previously working kernel around for Grub to find and offer to boot from.

            1. Mike Pellatt

              Re: Holidays

              Ah, those happy days when I needed to use that probably 2 or 3 times a year, or more. Of course, it was LILO, not Grub back then. And it was local kernel builds.

              Don't think I've needed to recover by booting from an old kernel in this millennium.

        5. rskurat
          Meh

          Re: Holidays

          Stop being a d*ck? on Twitter???

        6. Anonymous Coward
          Anonymous Coward

          Re: Holidays

          This sort of thing "could" happen to anyone, but for some reason it only DOES happen to Windoze users. And it happens to them a lot.

          So no. I won't stop being a "d*ck" about it. Your operating system is shit.

          (But I certainly don't use Xitter.)

          1. collinsl Silver badge

            Re: Holidays

            Because Windows is the majority OS. If Linux was the majority desktop OS then there would be more large-scale failures. If North Korea's Red Star OS was the leading desktop OS then that would have many more failures too.

            1. lockt-in

              Re: Holidays

              Need to fix this monopoly situation, there are getting to be too many disasters like this, and there will be more and worse.

            2. Anonymous Coward
              Anonymous Coward

              Re: Holidays

              That's the excuse Microsoft tries to sell you (every damn time), but then the failure rates would match up with the distribution between Operating Systems and it does not, not by a long shot.

              The MTBC (Mean Time Between Cockups) of Microsoft products in general is way shorter than on other platforms, and it's easy to see why. If people keep buying it anyway, why bother? That's also why they could do away with testing - users now do it. The quality of their new products provides ample demonstration: 'new' Teams and Outlook were so bad they should not even have been released as a beta, now they don't care and sort of fix it on the fly.

        7. Anonymous Coward
          Anonymous Coward

          Re: Holidays

          You'd be mighty put out if you went and looked and there was nobody being a d*ck over your predictable predicament. You might even start to wonder. Could there be something wrong with the software you're using? Surely not.

        8. Anna Logg

          Re: Holidays

          I'm aware of a system with 500 Linux servers that went down because the server room PSU management PC ran Win10...

        9. NoneSuch Silver badge
          Coat

          Re: Holidays

          (No one will ever see this, but here it is for the record.)

          Two months later and a 9.9 CVSS for Linux is about to be released. Just leaving it here for all those peeps who downvoted.

          https://www.theregister.com/2024/09/26/unauthenticated_rce_bug_linux/

    2. Kimo

      Re: Holidays

      Absolutely. I haven't touched my Uni laptop for over a month. Campus systems were hit hard but we are always light on faculty and staff in summer. During a semester we would have more labs and employee computers hit.

  2. boboM

    Beyond me

    how anyone is running anything mission critical on a toy operating system is beyond me, especially when cheaper and better alternatives are available. OSes with a much more secure user/file model at that. It's sheer incompetence or lack of knowledge.

    1. hoola Silver badge

      Re: Beyond me

      Since when has CrowdStrike been an OS?

    2. Anonymous Coward
      Anonymous Coward

      Re: Beyond me

      Just like in science, progress will occur one dead body at a time. The MS crowd displaced the mainframers as the dominant IT admins. Most of them have never used or tried another OS. The younger admins I meet have grown up in a world of Linux, Android, macOS, iOS, and even the BSDs, where MS's dominance is only in business IT and some vertical markets. When their bosses retire one way or the other, things will change.

      1. hoola Silver badge

        Re: Beyond me

        And so will the threat landscape......

        Once the focus is on a different OS as that has become dominant then so will the attacks.

    3. Matt Dainty

      Re: Beyond me

      Those more secure OSes can also run the CrowdStrike software, often mandated by your friendly InfoSec department. It's just luck that this update only hosed one OS and not all of them.

      1. Doctor Syntax Silver badge

        Re: Beyond me

        With a bit of luck InfoSec should now be looking at the decision and asking if it increases or decreases the risk.

        1. Jellied Eel Silver badge

          Re: Beyond me

          With a bit of luck InfoSec should now be looking at the decision and asking if it increases or decreases the risk.

          Or even if they understand the risk-

          "We can't boot into safe mode because our BitLocker keys are stored inside of a service that we can't login to because our AD is down."

          Oh dear, how sad, never mind. This kind of failure should not be possible. Where are the keys? What happens if/when you can't access that server? Wouldn't it be a really good idea if those critical keys were somewhere where you could access them, if/when your AD has a bad day? Which has happened numerous times in the past.

          1. Richard 12 Silver badge

            Re: Beyond me

            The assumption has usually been that only one or two machines get hosed any given week.

            I'm pretty sure that nobody ever considered losing 80% (or more?) of the estate in one fell swoop.

            1. Ken Hagan Gold badge

              Re: Beyond me

              Possibly true, but quite perverse considering the time and effort some people put into constraining PC fleets so that they all behave the same way.

              For such people, if the PCs don't all fall down at the same time then IT hasn't done its job properly.

            2. Doctor Syntax Silver badge

              Re: Beyond me

              "I'm pretty sure that nobody ever considered losing 80% (or more?) of the estate in one fell swoop."

              Anyone doing proper disaster recovery/business resilience will have planned for any or all critical servers being lost along with at least some of the workstation fleet.

            3. Vince

              Re: Beyond me

              Speak for yourself

              For those of us who are boring and get on with IT, and are not absorbed in the latest fads and AI willy waving, designing around really bad scenarios is entirely what you do.

              Plan for the worst, hope for something less intense to go wrong... But expect something to absolutely go wrong.

          2. GeekyOldFart

            Re: Beyond me

            "We can't boot into safe mode because our BitLocker keys are stored inside of a service that we can't login to because our AD is down."

            On my site I opened the (hard copy, in the safe) "oh shit" file to get the relevant local admin password, left my office, walked briskly down the corridor to the onsite server room and got one DC up in safe mode to do the fix. Then I changed that single-use local admin password before heading back to my office and updating the hard copy file with the new password before locking it away again.

            Meanwhile the rest of my team were making use of the one DC I'd resurrected to get all the other impacted servers up into "safe mode with networking" now that they could talk to a DC, allowing them to login with their domain admin accounts AND access the bitlocker keys and perform the fix.

            Once we had the AD infrastructure up and running the desktop support folks went into high gear busily fixing all the impacted workstations and laptops

            A similar story played out on all my employer's sites worldwide and we had pretty much every server - even the non-critical ones - back online before noon UTC and 99% of workstations and laptops fixed by midafternoon. None of which would have happened that fast without that hard copy file. Sometimes the best tech solution is decidedly low-tech :)

            1. Doctor Syntax Silver badge

              Re: Beyond me

              It's called planning.

          3. Doctor Syntax Silver badge

            Re: Beyond me

            "Wouldn't it be a really good idea if those critical keys were somewhere where you could access them"

            Somewhere like a write-protected medium locked in the safe.

          4. Anonymous Coward
            Anonymous Coward

            Re: Beyond me

            ""We can't boot into safe mode because our BitLocker keys are stored inside of a service that we can't login to because our AD is down.""

            I could not get my head around ^^^^

            Something *so* critical is *NOT* to be stored inside something that can fail and render access impossible !!!

            At a minimum you should *also* have copies on multiple media [not just usb sticks please !!!] that can be read by the simplest OS or even a piece of software like 'Winhex'/'dd'.

            No extra encryption just plain old physical security .... i.e. put it in a safe ... in multiple *SAFE* places. [Pun intended !!!]

            :)

        2. JaneGnrrr

          Re: Beyond me

          A huge bit of luck. I already hear (some of) them saying „but it only affected availability!“

      2. Anonymous Coward
        Anonymous Coward

        Re: Beyond me

        You imply that a similar error with their Linux code would also make a Linux system unbootable, and difficult to recover..

        There is more to an OS architecture than the logo and font it uses.

      3. Anonymous Coward
        Anonymous Coward

        Re: Beyond me

        it DID happen on Linux also, but was easily fixed

        https://www.neowin.net/news/crowdstrike-broke-debian-and-rocky-linux-months-ago-but-no-one-noticed/

    4. Alan Bourke

      Re: Beyond me

      Tell me you haven't got the first notion about modern Windows etc. Why didn't you put a 'Micro$oft' in there as well like it's 2002? If the cheaper alternatives proved to be actually better in the corporate world outside your bedroom then everyone would be using them.

      1. Anonymous Coward
        Anonymous Coward

        Re: Beyond me

        "Tell me you haven't got the first notion about the corporate world etc"

        At every large company I've worked at, upper management with no IT knowledge have forced through "upgrades" that involve switching to Windows.

        You don't get cheaper alternatives with big PR firms schmoozing the clients with fancy perks and gifts.

        1. rskurat

          Re: Beyond me

          bribery does have a tendency to work

          1. Anonymous Coward
            Anonymous Coward

            Re: Beyond me

            And this is why cheap OS won't get dominance...

      2. Doctor Syntax Silver badge

        Re: Beyond me

        Tell me you haven't got the first notion about beyond modern Windows

      3. Anonymous Coward
        Anonymous Coward

        Re: Beyond me

        You mean Micro$hit. Because that's what it is, a shit excuse for an OS.

        And the "modern" version keeps being enshittified, every release is worse than the last now. The latest "modern" Micro$hit "innovation" is putting AI spyware in every machine.

        The ONLY reason the "corporate world" is still using Micro$hit at all is inertia and bribery. The only reason IT likes Windoze is because it's far more work to keep it from constantly falling over, it's a misguided attempt at job security. And these massive worldwide Micro$hit Windoze failures are the result, because corporate replaced people with 'management' garbage software.

        1. hoola Silver badge

          Re: Beyond me

          Maybe if all those open source solutions provided something that actually competed including with commercial support then it would make a difference.

          The alternatives are simply too disorganised even after decades of trying. Commercial users want SLAs, contracts, support organisations and integrations.

          It is very little to do with bribery. We use both Windows and Linux. Linux has to be a commercial distribution, that makes it the same as Microsoft. The product runs on both platforms however there are still things that are far better on Windows.

      4. Zack Mollusc

        Re: Beyond me

        When you finally leave school and look at any decision made by any company ever, you will realise what a stupid remark that was.

    5. lostinspace

      Re: Beyond me

      Redhat pushed out an update a while back that broke grub and required manual intervention to fix any system that rebooted after applying the update.

      And that was the OS vendor.

      This wasn't even Microsoft, but a third party.

      I've also had various other updates break services on Linux VMs, so no OS is immune to these things.

      1. John Brown (no body) Silver badge

        Re: Beyond me

        "This wasn't even Microsoft, but a third party."

        I think with Redhat being the OS vendor, that makes them the 1st party :-)

        1. Stevie

          Re: This wasn't even Microsoft, but a third party

          I think the OP meant Crowdstrike, not Redhat, was the 3rd party.

    6. jailbird

      Re: Beyond me

      Actually, the Windows user/file security is much more advanced than normal POSIX ACLs, it's why NFSv4 ACLs were modeled off of it.

      That doesn't mean that Microsoft actually USES them like they should, though.

  3. wolfetone Silver badge
    Pint

    "We can't boot into safe mode because our BitLocker keys are stored inside of a service that we can't login to because our AD is down."

    For as much as Crowdstrike have royally fucked up here, hopefully from Monday there will be a deep discussion and investigation into why so many companies have been caught with their trousers down regarding disaster recovery.

    But yet again, for all the techs having to deal with this, here's a pint.

    1. hoola Silver badge

      I am guessing the first issue is stopping all your DR services promptly running the f****d update.

      Until you understand what is causing the outage you cannot implement DR.

    2. KittenHuffer Silver badge

      All disaster recovery plans shall be reviewed and tested, all software releases/updates shall also be reviewed and tested, and all pigs shall be fed and ready to fly!

      1. TimMaher Silver badge
        Coat

        Pigs.

        We need a flying pig icon!

        C’mon El Reg! Take the one in my pocket.

        1. Jellied Eel Silver badge

          Re: Pigs.

          We need a flying pig icon!

          Speaking of pork. I've heard there are desperate recruiters offering £10k a day + expenses for anyone with a passport that can start now. Also possibly travelling by private jet given airlines seem to have been hit pretty hard. Kinda curious what risks that might have, ie which systems have been knocked out affecting airlines, so whether private pilots could file flight plans, manifests, or just end up caught in the same mess.

          1. Doctor Syntax Silver badge

            Re: Pigs.

            I took a look at FlightRadar yesterday afternoon. Traffic was a bit light but still reasonably busy. One thing that struck me when I looked was the track on one of the planes coming into Manchester. It had executed a peculiar loop around Hyde which is where they normally line up for the runway and a following plane had executed a loop a bit further back, neither in the usual holding locations. Clearly something had temporarily held things back. Whether or not it was Cloudstrike I don't know but I've not seen that one before.

      2. Jellied Eel Silver badge

        All disaster recovery plans shall be reviewed and tested, all software releases/updates shall also be reviewed and tested, and all pigs shall be fed and ready to fly!

        This also applies to ClownStrike in spades. Especially if any of their customers have managed to get testing and consequential losses written into their contracts. ClownStrike knows their users' environments cos it has its software on those systems monitoring this. So how ClownStrike managed to push an update without noticing this bug is something of a mystery, especially as it's managed to infect and corrupt thousands of customers' systems.

        1. Anonymous Coward
          Anonymous Coward

          Clown Strike

          https://m.youtube.com/watch?v=a21DTmOP9Yw&pp=ygUcY2xvd24gc3RyaWtlIGVsdmlzIGNvc3RlbGxvIA%3D%3D

      3. hoola Silver badge

        Automated updates from a SaaS service like CrowdStrike cannot be tested in isolation.

        1. Phil O'Sophical Silver badge

          Automated updates from a SaaS service like CrowdStrike cannot be tested in isolation.

          Why not? You have a list of sacrificial test systems which are permitted to get the update on day 1. After it's complete they run a functional test suite to make sure that all is behaving as expected.

          After that, a report is sent to the admin who can decide whether the automated update is provided to the production systems. If you're willing to take the risk you could make that second phase an "automatic unless the admin says no" option, although I wouldn't.
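          The two-phase idea above can be sketched roughly as follows. This is only an illustration, not how CrowdStrike or any real update service works: the names `run_canary_phase`, `release_to_production` and `functional_test` are hypothetical, and the "hosts" are just strings. A sacrificial test group takes the update first, a functional test runs on each machine, and production receives nothing unless every test passed and the admin has said yes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RolloutReport:
    """Result of phase 1: which sacrificial hosts passed their tests."""
    update_id: str
    results: Dict[str, bool] = field(default_factory=dict)  # host -> passed?

    @property
    def all_passed(self) -> bool:
        # An empty report counts as a failure: nothing was actually tested.
        return bool(self.results) and all(self.results.values())


def run_canary_phase(
    update_id: str,
    canary_hosts: List[str],
    functional_test: Callable[[str, str], bool],
) -> RolloutReport:
    """Apply the update to the test systems only and record each outcome."""
    report = RolloutReport(update_id)
    for host in canary_hosts:
        try:
            report.results[host] = bool(functional_test(host, update_id))
        except Exception:
            # A host that blows up during testing is a failed host.
            report.results[host] = False
    return report


def release_to_production(
    report: RolloutReport,
    admin_approved: bool,
    production_hosts: List[str],
) -> List[str]:
    """Return the hosts that may take the update (none unless fully gated)."""
    if not report.all_passed or not admin_approved:
        return []
    return list(production_hosts)
```

          Making `admin_approved` default to yes would give the "automatic unless the admin says no" variant; the sketch keeps it as an explicit argument so that the safer behaviour is the one you get by default.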

    3. Mike 125

      > why so many companies have been caught with their trousers down regarding disaster recovery.

      If the client is using Crowdstrike, it is Crowdstrike's responsibility to explain DR from a worst case, and make sure it's tested in the client environment.

      Crowdstrike has caused an unrecoverable error- which is their responsibility to predict.

      If their crappy .sys file causes a BSOD, how does the client machine recover?

      Only the hardware should be capable of preventing recovery.

      This is totally on them.

      1. John_Ericsson

        We have evidence that some organisations cannot get to their BitLocker keys because they are on a server that is bitlockered (including backup servers). I would be kind to anyone responsible for this and give them the opportunity to resign.

        1. Mike 125

          So that's like encrypting a key required for recovery with a copy of that key, and then deleting the copy? OK, yes, I was assuming a certain level of common sense competence from the client :)

          My point is that it should be part of Crowdstrike's responsibility to the client to consider what happens if/when their .sys causes a kernel level exception.

          1. wolfetone Silver badge

            You're right about it being Clowdstrike's responsibility to tell the client how there could be an issue with .sys files.

            But on the admin side, if you have devices on your network that use Bitlocker you need to consider how you would get those keys. In a perfect storm scenario, you have to consider that while normally your laptops (as an example) can be unlocked using a key from the server, what if the server is in a burning building? What if it's been stolen? What if the TPM chip on a server* breaks? How do you get your key then.

            I think many either didn't consider the server being offline or assumed that it would always be available.

            *I don't know if a server would have a TPM chip, but I've had a laptop that was borked by a broken TPM chip and needing to unlock the Bitlocker on the drive.

            1. Mike 125

              > How do you get your key then.

              Yep, circular dependency.

              This is worth a read. Never mind the IT food chain itself- see how every working environment is now forced into being online. It's taking the lives of hard working people apart.

              https://news.ycombinator.com/item?id=41007898
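              The circular dependency is the sort of thing you can check for on paper before the bad day. A minimal sketch, assuming you have written your recovery dependencies down as a simple mapping of "X needs Y" (the function name `find_cycle` and the example entries are made up for illustration, not taken from any real tool):

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Depth-first search for a loop in a 'needs' graph.

    deps maps each item to the things it needs in order to be recovered.
    Returns one loop as a list (first item repeated at the end), or None.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for needed in deps.get(node, []):
            seen = state.get(needed, WHITE)
            if seen == GREY:
                # We are already trying to recover 'needed': that is a loop.
                return path[path.index(needed):] + [needed]
            if seen == WHITE:
                found = visit(needed)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in deps:
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical example mirroring the quote in this thread.
stuck = {
    "safe mode boot": ["BitLocker keys"],
    "BitLocker keys": ["key service"],
    "key service": ["AD login"],
    "AD login": ["safe mode boot"],
}
print(find_cycle(stuck))
```

              Pointing "BitLocker keys" at something with no dependencies of its own (the hard copy in the safe, say) breaks the loop and `find_cycle` returns `None`.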

              1. Fred Daggy Silver badge
                Pint

                Murphy ...

                Repeat after me, you do not bitlocker your MBAM server.

                You have as many virtual Domain Controllers as needed, and always at least one physical (perhaps 2). You do not bitlocker this Domain Controller. This domain controller is behind YOUR lock and key and beyond the reach of any PFY.

                You have 3 or 5 MSX servers but at least 2 consoles with the tools on them

                Your CRL list must be accessible by HTTP and not just HTTPS

                Do not store the backups on the same premises as the resource it secures

                Untested restores are just well stored and secured entropy

                Do not let your 2 most senior admins travel in the same vehicle

                2 part forest administrator password should be in safe/bank vault, twice but together!

                Murphy is not to be fecked with.

            2. collinsl Silver badge

              Most servers sold now have integrated TPM2.0 chips - you'd mostly use those at the "bare metal" OS level though. If you want to put a virtual TPM on a virtual machine (I.E. if you're running VMWare or Hyper-V or KVM or whatever on that physical server) you can do that without having a physical TPM - I've done it at home on my KVM rig and the "TPM" is just a virtual hardware device with the keys etc stored as files on the hard drive.

        2. Richard 12 Silver badge

          More likely, they have the keys in multiple bitlockered servers, perhaps even spread around the world, on the assumption that at least one of them would survive an untoward event.

          Three copies, all able to unlock the other two sounds reasonable on the surface, does it not?

          1. Doctor Syntax Silver badge

            Until you realise that the locks are all similar and all prone to the same remote updates. Then the three become one and that's a single point of failure.

          2. katrinab Silver badge
            Megaphone

            Yes, if the other two servers are running Linux and FreeBSD, or maybe two different Linux distros.

            On FreeBSD it is super easy to get a zfs pool back online on completely different hardware. The software side of things took me about 10 minutes last time I did it. The hardware stuff took a bit longer. I would imagine Linux takes a similar amount of time if it is a distro you know inside out.

        3. Anonymous Coward
          Anonymous Coward

          That kind of stupidity is *bad* !!!

          Simply thinking through the downsides of such a design should have stopped this before it was implemented !!!

          Trust nothing, Hardware, Software, Physical security etc etc

          IT is not magic and flaws in design and thinking are the reason the cybercrime exists ... Try to think like a Cybercriminal and how you would break through your security ... if you cannot do this employ someone who can and will put *their* reputation on the line to back it up !!!

          :)

      2. Anonymous Coward
        Anonymous Coward

        Nope !!!

        You bought the Crowdstrike 'cool aid' and cannot abdicate any responsibility !!!

        No matter what a 3rd party promises, it is your responsibility to ensure it is real and works when the 'brown stuff' hits the fan !!!

        This does present a possible *new* and valid scenario to plan and test for in the future .... if any good can come out of this total mess !!!

        :)

    4. Jou (Mxyzptlk) Silver badge

      What I don't get: Did they Bitlocker their AD Domain Controllers too? That is a recipe for disaster...

      1. hoola Silver badge

        From what I see the crass stupidity of Infosec teams has no limits.

        They will follow a rabbit warren of security risks and mitigation to protect systems and data to the point that the very thing they are using to protect something requires a key component in what is being protected to work.

        How about using Azure KeyVault to encrypt all your backups in Azure in case you lose your Azure tenant? Or storing said backups in the same tenant?

        I see it all the time, when you ask questions about how they can recover from this "oh, it is in the cloud, it is safe".

        The same for AWS and Google keys.

        1. Doctor Syntax Silver badge

          The most important questions you can ask of any form of administration start with the words "What if...?" Unfortunately asking such questions is perceived as "being negative" or the like. If asked such a question and you can't answer it, try to find the answer; it might be important.

      2. This post has been deleted by its author

        1. fnusnu

          The CIA triad has nothing to do with security.

    5. Denarius Silver badge

      not been around long enough ?

      @wolfetone: Manglement do understand one thing. "Those who control the past, control the future". This current event will be swept from corporate memory, if it ever gets there, be "redefined" as IT admin failure at best, and forgotten, suppressed, distorted out of recognition. Nothing will change. An IT monoculture will be even more enforced to allow simplified, centralised control. Then the real outage will occur, for which this was the dry run according to my suspicions. Who benefits from seeing outage happen on this scale ?

  4. Niek Jongerius
    Mushroom

    Who, me?

    Time for a new installment on Monday!

    1. cyberdemon Silver badge
      Devil

      Re: Who, me?

      Who will take the blame? The overworked junior tech who pushed a 2am update to a threat definition file without testing it, or the senior developer who failed to implement proper memory-safety and input-validation in a kernel-mode rootkit security module?

      1. Red Ted
        Stop

        Re: Who, me?

        I think you'll find it's that "rouge engineer" who previously worked for VW diesel emissions control department and then at Boeing developing the MCAS.

        1. iron

          Re: Who, me?

          You're being anti red people, you racist.

          (I think you mean rogue engineer.)

          1. gv

            Re: Who, me?

            I thought he was implying the engineer had a Keith Floyd-esque liking for a particular grape-based beverage?

            1. John Brown (no body) Silver badge

              Re: Who, me?

              No, he was just pointing out that many people posting here in the past re "rogue" engineers have misspelled it as "rouge". It's almost a meme.

              1. Red Ted
                Go

                Re: Who, me?

                Thank you so much for your confidence in me, but it was a typo!

                On reflection it does work rather well though, “reds under the bed” and all that!

          2. xyz Silver badge

            Re: Who, me?

            PMSL

        2. Solviva

          Re: Who, me?

          And after being sacrificed at Boeing, given the golden shower (OK can't remember what the correct golden thing is for this) with a job at Spirit AeroSystems as QC manager.

      2. Anonymous Coward
        Anonymous Coward

        Re: Who, me?

        The LAST person who should be blamed is any techie who pushed the big 'Go' button. There should be a zillion fail safes before it gets to that step, and it is the management/execs who are responsible for making sure those are in place. The engineer who pushes the release button (metaphorically) should not even have to know those upstream fail safes even exist. Those fail safes should be tested and audited and reported on. So one of three things was happening;

        1. They had no fail safes and the engineers lied about this to management. However, management should have ensured proof of testing, etc. So the engineers either lied big time and falsified test results or the management took a "don't ask" approach.

        2. Management knew testing was lacking or absent but ignored this

        3. Management were so dumb that they didn't even think testing was required

        So, in summary, unless we assume the front line engineers spent more time hiding fundamental failings of process than actually doing their real job (which I seriously doubt) then the blame lies with management and the culture they have built at the company. Oh yeah, and you can include MS in that as they either knew about this or just plainly ignored something fundamental to their business and CUSTOMERS!!

        1. cyberdemon Silver badge
          Devil

          > The LAST person who should be blamed is

          The first person who WILL be blamed. Cynicism is bred from bitter experience

          Clearly CrowdStrike believed that any update to a mere data file would be safe, and didn't bother to enforce any testing on them, perhaps believing that it was better to update them quickly to address new threats rather than delay their release due to testing. Personally I think this is a secondary problem compared to the apparent fact that they had never tested a corrupted data file against their system-critical kernel module..

          For the kernel module to ingest a bad file and cause a BSOD, it would have to: a) not bother to fully validate the file before ingesting it, AND EITHER b) contain a memory-corruption or similar bug that causes a BSOD when processing a bad file OR c) very poor error-handling such that when a bad file is encountered it BSODs instead of simply logging the issue and rejecting the file
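          To make (a) and (c) concrete, here is a rough user-space sketch of "validate fully, then reject and log instead of falling over". The file layout is entirely invented for illustration (a 4-byte `CFG1` signature, a record count, then fixed-size records); it is an assumption, not CrowdStrike's actual format, and real kernel code would not be written in Python.

```python
import struct
from typing import List, Tuple

# Hypothetical layout, invented for this example:
MAGIC = b"CFG1"                  # 4-byte signature
HEADER = struct.Struct("<4sI")   # signature, record count
RECORD = struct.Struct("<II")    # (rule_id, flags)
MAX_RECORDS = 10_000             # sanity limit on the declared count


class InvalidContentFile(ValueError):
    """Raised when a content file fails validation."""


def parse_content(blob: bytes) -> List[Tuple[int, int]]:
    """Validate the whole file before trusting any of it."""
    if len(blob) < HEADER.size:
        raise InvalidContentFile("truncated header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidContentFile("bad signature")
    if count > MAX_RECORDS:
        raise InvalidContentFile("implausible record count")
    expected = HEADER.size + count * RECORD.size
    if len(blob) != expected:
        raise InvalidContentFile("length does not match record count")
    # Every offset read below is now known to be inside the buffer.
    return [
        RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
        for i in range(count)
    ]


def load_or_keep(blob: bytes, current: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the new rules if the file is valid, else keep the old ones."""
    try:
        return parse_content(blob)
    except InvalidContentFile as exc:
        # Log and reject; the previously loaded rules stay in force.
        print(f"rejected content file: {exc}")
        return current
```

          The design point is that a bad file costs you one log line and a stale rule set, not the machine.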

          1. Anonymous Coward
            Anonymous Coward

            Re: > The LAST person who should be blamed is

            Yes, completely agree that we know who will be blamed and it won't be the execs!

            As for testing, and the impact to getting updates out quickly... IMO (and experience), there is ABSOLUTELY NO EXCUSE for not having automated testing that validates deployment and basic function as a minimum, and this should not cause any meaningful delay to getting updates out. They clearly do not have such a thing or it is broken big time.

            1. cje

              Re: > The LAST person who should be blamed is

              If ClownStrike still even exists this time next week. There are going to be a lot of customers demanding compensation.

          2. Doctor Syntax Silver badge

            Re: > The LAST person who should be blamed is

            "perhaps believing that it was better to update them quickly to address new threats rather than delay their release due to testing"

            And this file that was so urgently required as to have to be released without testing can, as a workaround, be simply deleted without waiting for a replacement.

            1. Ianab

              Re: > The LAST person who should be blamed is

              My understanding is that as it's a routinely updated file, when the system rebooted the software would automatically check for a newer version, download the non-borked version, and the system would be fully functional again after a minute or so? Obviously it needs a more serious software update to fix the underlying bug and stop this happening again, but you can wait a few days for that (while testing it properly)

              1. collinsl Silver badge

                Re: > The LAST person who should be blamed is

                If it 1) finishes booting or 2) can get the file before it bluescreens again, neither of which is guaranteed by the looks of it.

        2. O'Reg Inalsin

          Re: Who, me?

          4. The engineers responsible told Management that testing was necessary, but Management insisted on meeting an arbitrary schedule.

          1. Lomax

            Re: Who, me?

            ...but Management insisted on meeting an arbitrary schedule "performance target". To use correct management speak.

            After all, their fat bonuses depend on it.

        3. collinsl Silver badge

          Re: Who, me?

          Or 4) testing was skipped by corporate decision to cut cost and time spent with no obvious return out of the business. Like M$ have done with their QA team.

      3. Mr Dogshit

        Re: Who, me?

        It wasn't a threat definition file.

        1. cyberdemon Silver badge
          Go

          Re: It wasn't a threat definition file.

          No?

          What exactly was it, then? Do enlighten us, Mr Dogshit :)

          1. Spazturtle Silver badge

            Re: It wasn't a threat definition file.

            They have said that it was a content update, so something like a UI change.

      4. qwerty360

        Re: Who, me?

        Well we know it won't be the senior manager who refused to pay the massive engineering cost for a deployment pipeline designed to automatically detect stuff like this before it took out all the customers...

    2. FirstTangoInParis Bronze badge

      Re: Who, me?

      So I suspect the real problem here is how to contain kernel-level processes when they go rogue. This isn’t limited to Windows; I tried installing AMD video drivers on Ubuntu and it made such a mess I couldn’t reverse it and had to do a clean re install.

      Yes CS should take some responsibility for not parsing the junk update and rolling back to a known good one while flagging the issue to central, but better minds than mine need to look at how to protect against this.

      While we’re here …. This is the result of using Windows where it doesn’t belong. It’s a desktop and server OS, not a web kiosk and not an IoT thing. There are far better distros for that with small footprint and smaller attack surface (my favourite is Porteus Kiosk but others are available) but of course nobody wants to go there. Maybe they will now.

      1. martinusher Silver badge

        Re: Who, me?

        You roll out the update to a test sample, run with it for a bit and then OK it for general release. This is SOP for mainframe applications from the dawn of time, back when 'failure was not an option'.

        1. hoola Silver badge

          Re: Who, me?

          That is not how CrowdStrike and similar solutions work.

          1. Anonymous Coward
            Anonymous Coward

            Re: Who, me?

            "That is not how CrowdStrike and similar solutions work."

            But, now as a consequence *we* all know how Crowdstrike can be made *not* to work and how Crowdstrike will continue to work if you delete a *needed* *.sys file.

            Crowdstrike have not thought that they could produce a flawed *.sys file and the use/error handling of that file *obviously* does not play well with windows !!!

            Many changes are needed to address this !!!

            :)

  5. Anonymous Coward
    Anonymous Coward

    Sports Sponsorship

    I am always wary of vendors with high profile high cost sports sponsorship, all that profit comes from a product that could either have more spent on R&D or be significantly less expensive. Of course the C suite who go to the F1, Golf, Soccer, Olympics whatever then associate these brands (Darktrace is another...) with success, not understanding that (1) no EDR/MDR solution means you'll be 100% safe and (2) cost is not quality.

    1. cyberdemon Silver badge
      Pint

      Re: Sports Sponsorship

      https://www.bbc.co.uk/news/articles/cn4vgq5150qo

      That has to be nominated for IT photo of the year....

      (look carefully at the screens in the background)

      1. DJV Silver badge

        Re: Sports Sponsorship

        I like the way that article states: "Crowdstrike is one of the biggest and most trusted brands in cyber-security."

        I suspect the "is" has now changed to a "was".

        1. Jellied Eel Silver badge

          Re: Sports Sponsorship

          I suspect the "is" has now changed to a "was".

          Apparently ClownsTrike's Kernel Kurtz is an avid racing driver. So move fast, break things.

      2. KittenHuffer Silver badge

        Re: Sports Sponsorship

        I remember the quip from a comedian (think it was Mike Harding) about seeing a photo in a magazine of a Durex sponsored racing car with a puncture. He was laughing his head off but the local Aussies didn't get the joke. Turned out that Durex was the generic name for Sellotape in Australia!

        1. Pomgolian

          Re: Sports Sponsorship

          It was Jasper Carrott IIRC

          1. Vincent van Gopher

            Re: Sports Sponsorship

            Carrott did a Durex joke, but in the sketch it was suggested to him (in an Aussie accent) to go get a roll of Durex. 'A roll!' was his response in the joke.

            Not sure who did the F1 car though.

            1. John Miles

              Re: Sports Sponsorship

              Pretty certain it was Carrott. It goes something like this: on a trip down under he sees the picture of an F1 car sponsored by Durex with a puncture, and finds the Australians don't find it funny (not a titter) because the brand is tape over there. Then he goes on to talk about what happens if an Englishman overhears an Aussie asking for a roll of Durex, giant size. I think it finishes with him wanting to hear an Aussie visiting a UK shop.

        2. Anonymous Coward
          Anonymous Coward

          Re: Sports Sponsorship

          Durex? That's a very old reference. These days this brand is *firmly* associated with rubber johnnies in Aus.

          You’re after “Sellotape” for the generic name for sticky tape. Less so: “Scotch tape”

        3. Benegesserict Cumbersomberbatch Silver badge

          Re: Sports Sponsorship

          Ansell being an Australian company, it was hard for competitors to get market penetration. But then, I believe that's the idea.

        4. nonpc

          Re: Sports Sponsorship

          I remember seeing that at Brands Hatch in the 70s. On asking what the connection was between car racing and Durex, I was told it was the heat, the excitement and the smell of burnt rubber...

    2. Gort99
      Thumb Down

      Re: Sports Sponsorship

      Funny you should say that. I remember many moons ago going to a corporate event at Autonomy's HQ shortly after they started their high profile sponsorship of the Mercedes F1 team. Whatever happened to them?

      1. Anonymous Coward
        Anonymous Coward

        Re: Sports Sponsorship

        Taken over by HPE Software, spun out and merged with Micro Focus, and recently purchased by OpenText.

  6. GoneFission

    I'm sure some grossly overpaid "consultants" are already frantically trying to spin this as an argument against remote work. There are PowerPoint charts being drafted labeled "500% faster return-to-productivity after global Crowdstrike outage for in-person staff due to valuable centralized office attendance!" with stock image photos of men in suits sitting in a boardroom.

    In reality it's lines of idle in-office workers crowding the IT department in a disordered line, each asking "is it fixed yet? Can we go home until it is?"

    1. cyberdemon Silver badge
      Pint

      It's Friday.

      Pub until it's fixed.. Oh look at the time, 5pm already

      1. Don Bannister

        It's always 5pm somewhere :-)

    2. Doctor Syntax Silver badge

      In reality it'll be manglement demanding their PCs are fixed first. Those using PCs to actually earn revenue will have to wait.

    3. sw guy

      But...

      I was able to work at home while there was a mess in the office

      (and yes, this is by luck)

  7. 502 bad gateway
    IT Angle

    Epic fail

    It's almost like no one did any testing, or even considered how the clients manage their systems. Fewer polyester suits and more testers perhaps.

  8. gryphon

    Despite Microsoft's pushing of Defender as the answer to everything, is it not still best practice to have different anti-virus / anti-malware products on servers and clients?

  9. Doctor Syntax Silver badge

    Is it too much to hope that when the dust settles legislators will start requiring that major infrastructure failures be followed up, by statute, with an inquiry to determine what led up to the incident, what decisions were made which impacted release of faulty S/W & so on? Bad decisions, especially those undocumented or made to cut costs, would then lead to prosecution of those who made them.

    1. Will Godfrey Silver badge
      Unhappy

      Actually, yes. It's far too much to ask. It requires the use of common sense. Something on the critical red list of endangered.

  10. Anonymous Coward
    Anonymous Coward

    SECURE NO BOOT

    Yup.

  11. smudge
    Windows

    Good news for those admins

    The Chief Exec of Crowdstrike has said: "customers 'remain fully protected'".

    Well yes - a bricked Windows system is pretty damn secure :)

    1. jfw25

      Re: Good news for those admins

      Did he offer any recommendations about whether it is better to protect customers by pushing them downstairs, or shoving?

  12. Rich 2 Silver badge

    Modern life

    “We can't boot into safe mode because our BitLocker keys are stored inside of a service that we can't login to because our AD is down”

    Doesn’t this just sum up the ridiculous overly-complex intertwined mess that modern systems have turned into.

    We used to have plain text files, and simple login procedures. Now we have 2FA and you need to talk to some remote server using some excruciatingly complex chain of certificates and crazy protocols just to power your local machine on!

    I know some of this stuff is (in theory) useful but it’s just too fragile and outrageously complex
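
    The quoted BitLocker/AD loop is a circular dependency, and those can be found on paper before the bad day arrives. A rough sketch, assuming you can write your recovery steps down as "X needs Y first"; the node names below are made up for illustration and not taken from any real environment:

```python
# Hypothetical recovery dependencies: "X needs Y to be available first".
# The names are illustrative only.
RECOVERY_DEPS = {
    "boot_safe_mode": ["bitlocker_key"],
    "bitlocker_key": ["key_escrow_service"],
    "key_escrow_service": ["active_directory"],
    "active_directory": ["boot_safe_mode"],  # the AD servers are down too
}


def find_cycle(deps):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = ON_PATH
        path.append(node)
        for nxt in deps.get(node, []):
            if state.get(nxt, UNSEEN) == ON_PATH:
                # Back edge: the cycle runs from nxt along the path to here.
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, UNSEEN) == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for start in deps:
        if state.get(start, UNSEEN) == UNSEEN:
            found = visit(start)
            if found:
                return found
    return None


if __name__ == "__main__":
    cycle = find_cycle(RECOVERY_DEPS)
    print(" -> ".join(cycle) if cycle else "no circular dependency")
```

    Any cycle it reports is a place where you need an out-of-band copy of something (printed recovery keys, a break-glass account) to get back in.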

    1. Bruce Ordway

      Re: Modern life

      This is why I haven't ever used BitLocker.

      Sure, this practice might end up biting me some day but... not today.

      1. Paul Crawford Silver badge

        Re: Modern life

        Linux user here but same sentiment: my mobile devices that are likely to get lost/stolen have encrypted disks, my rack-mount kit that is far less likely, and usually needs to reboot automatically, is not using such boot-level restrictions & encryption.

        While MS & Crowdstrike are the obvious and justifiable whipping boys here on multiple levels, there is a major aspect of general resilience to be considered that is independent of them on how to recover from an IT disaster of any sort (screw-up, attack, or just natural disaster). So many have a "hope it won't happen" plan.

    2. nonpc

      Re: Modern life

      The IT version of the physical problem when the fire safe survives the fire but the key to the fire safe had melted...

  13. Jou (Mxyzptlk) Silver badge

    I never heard of CrowdStrike

    until today. Not a good impression.

    1. Solviva

      Re: I never heard of CrowdStrike

      Ditto despite having just realised they sponsor Mercedes F1, having watched every session for I don't know how long, it seems I never registered the perhaps odd name "CrowdStrike" on their livery - is that what ISIS call a suicide bomber's job?

      1. jdelarunz

        Re: I never heard of CrowdStrike

        I thought that "crowd strike" referred to the 1950s Mercedes F1 team...

        1. diodesign (Written by Reg staff) Silver badge

          Mercedes

          Ironic seeing as the Mercedes F1 team is sponsored by CrowdStrike, uses its tech on their Windows boxen, and has been hit by the SNAFU ahead of the Hungarian GP.

          C.

          1. Yorick Hunt Silver badge
            Trollface

            Re: Mercedes

            What you call irony, I call poetic justice.

    2. Anonymous Coward
      Anonymous Coward

      Re: I never heard of CrowdStrike

      I likewise had never previously heard of this CrowdStrike company (I don't sully myself with Windows stuff much), but, in that way that your brain tends to do word association when you hear or see a new name/word, the first thought that came to mind was "Is that sort of like flystrike?"

  14. martinusher Silver badge

    The elephant in the room

    Where is it engraved in stone that outside companies can reach into your computer and silently alter its software without either telling you or asking permission?

    The fundamental problem is an unstable operating environment that has to be repeatedly patched because of FUD -- it's the ideal self-sustaining ecosystem. If it was designed properly in the first place then there would be no need for constant updating. Sure, it would put a lot of people out of work but then, seriously, what are they doing that's productive?

    It's worth noting that countries which can't or won't be served with constant updates due to sanctions and the like -- places like Russia and China -- seem to be unaffected by this problem.

    1. Cav Bronze badge

      Re: The elephant in the room

      "Where is it engraved in stone that outside companies can reach into your computer and silently alter its software". The conditions you agreed to when you chose to install the software.

      Most AV software gives you the option to automatically install updates; you aren't forced to do so.

      1. Jellied Eel Silver badge

        Re: The elephant in the room

        Most AV software gives you the option to automatically install updates; you aren't forced to do so.

        Sometimes you can be. Like it's considered 'best practice' to keep software patched and up to date. It's also often specified in customer bids that the bidder will do this. Then when you add costs for a test environment and staff costs, they say that's too expensive and want it taken out. Which is doable (under duress), if you insert clauses stating you are not liable for any loss or damages due to the lack of an adequate test environment. Which sales will then object to as being 'too negative'.

        Then there's insurance. Hopefully everyone that's bid ClownStrike has checked their liability insurance. Especially corporate officers and their D&O policies. This is shaping up to being a very expensive mistake, and insurers don't like paying out. So there can be a bit of a Catch-22. Don't apply updates right away, get hacked, and no payout because the company failed to secure their systems. Apply the patch and stuff breaks, no payout because the company should have tested it first. This may be Clownstrike's problem because obviously they pushed an update without properly testing it first.

        But it's moments like this that make me glad I'm mostly retired.

        1. Anonymous Coward
          Anonymous Coward

          Re: The elephant in the room

          The elephant in the room is that this is a kernel mode driver - which is why the blast radius is huge and recovery options limited. Ironically, besides the risk of shipping code that crashes, there is the risk of shipping a security vulnerability.

          Apple has very sensibly killed off 3rd party kernel-mode drivers, and products for MacOS are presumably not implemented with them.

        2. Anonymous Coward
          Anonymous Coward

          Re: The elephant in the room

          Had a vendor tell us today (unrelated to the CrowdStrike debacle) that they needed to come delete some soon-to-be-obsolete third-party software off our servers because the company that uses them like a marionette said they had to. Given that (1) we bought a perpetual license to that software, so we can run it as long as we like even if it won't have support in the future, and (2) the puppetmaster company is now offering replacement software that is notably inferior, not really fully ready for production use, and expensive, it's a wonder the sysadmin's response was printable. (It was a surprisingly polite "uh, no, I don't think so.")

          So yes, sometimes the vendor tries to MAKE you do what's best for them rather than for you.

          1. Anonymous Coward
            Anonymous Coward

            Re: The elephant in the room

            "So yes, sometimes always the vendor tries to MAKE you do what's best for them rather than for you."

            Give it a few more years and you will understand vendors much much better !!!

            :)

  15. TekGuruNull

    Crowdstrike- Oh, the irony

    Follow the crowd and, voila, you're pwned. Just one more example of why you don't: "Just look at what everyone else uses. We'll go with that."

  16. Nick Gisburne
    FAIL

    One thing takes down the world?

    If one thing can take down the world, or a significant part of it, we've now discovered that its foundations were not as solid as they seemed.

    Something is very, very wrong if a single, simple update can paralyse so many of the systems we rely upon.

    Fixing this particular problem won't fix THE problem - there are vulnerabilities built into everything, and it doesn't need a hacker group to find them, it just needs the system builders to f*ck something up.

    If nothing changes, something like this will happen again.

    1. ecofeco Silver badge
      Headmaster

      Re: One thing takes down the world?

      Some people just discovered it. Some of us have known for a very long time.

      But one thing is for sure, no lesson will be learned.

  17. Anonymous Coward
    Anonymous Coward

    Hello, customer services CrowdStrike

    What the… perhaps some large accounts will be reviewing their subscription after this cluster drop.

  18. Anonymous Coward
    Anonymous Coward

    Hello, OS vendors...

    ...if it can't boot, switch to a safe mode where you can at least do network based updates.

    1. Anonymous Coward
      Anonymous Coward

      Re: Hello, OS vendors...

      On two different Dell laptops today, we discovered we CAN'T boot to safe mode. Windows won't boot, so can't use that to initiate safe mode. (Terrible design decision there.) The Dell boot options simply don't give us the opportunity (just hardware diagnostics, which pass fine, regular boot, or full-recovery-mode to wipe and reinstall). Even turning off the machine during Windows startup, twice, doesn't work - instead of going to the Windows boot options, it goes back to the Dell ones.

      Apparently the solution is to boot to some other media, then use that to delete the offending file.

    2. ecofeco Silver badge
      FAIL

      Re: Hello, OS vendors...

      Safe Mode (F8, remember that?) on PCs went away with Win 10.

      Beginning with Win 10 you had to boot to full Windows to... tell it to boot to Safe Mode on restart.

      Fucking brilliant, that was.

  19. UselessEustace

    Ms Fnd in a Lbry

    It's not as if we weren't warned:

    https://en.wikipedia.org/wiki/MS_Fnd_in_a_Lbry

    Hal Draper 1961

  20. MattPDev

    Why not...

    ... just immediately execute your disaster recovery plan that you surely developed and tested when you designed your system around these products.

    What do you mean you didn't think this could happen?

    1. ecofeco Silver badge

      Re: Why not...

      The DR was rightsized and those who created it were also rightsized.

  21. Claptrap314 Silver badge
    WTF?

    Failures all around

    Okay, let's see...

    1) How is it, after the AWS S3 config fiasco, that companies are allowing circular dependencies in their system recovery process?

    2) How is it, ever, that admins of large fleets of machines are allowing all-or-nothing updates from ANY source?

    3) How is it, that an AV provider, as a virtual admin of perhaps millions of machines, is pushing updates to all devices at the same time? I get that you don't want to advertise to the bad guys, but we're still safe if the rollout were smeared out over a couple of hours. Of course, this assumes that canaries are being used to stop a rollout if a significant number of machines are borking.

    Given how AV/DLP has to work, and our lack of knowledge about exactly what went wrong, I'm hard pressed to fault anyone for pushing out an update that borks some percentage of machines. But the fact that so many machines were borked--that's really my question (for CS). But at the same time, I have extremely little sympathy for someone who is managing 100k machines but has not bothered to wargame this scenario. I mean seriously, when was the last 12 month period that a Microsoft OS update didn't bork a lot of machines?
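
    For points 2 and 3, the smeared rollout with canaries can be sketched in a few lines. This is a toy under stated assumptions, not anyone's real deployment system: `apply_update` and `is_healthy` are hypothetical callables you would have to supply, and the wave sizes and failure threshold are arbitrary.

```python
def staged_rollout(hosts, apply_update, is_healthy,
                   wave_sizes=(0.01, 0.10, 1.0), max_failure_rate=0.05):
    """Push an update in growing waves; stop if too many hosts in a wave fail.

    hosts            -- list of host identifiers
    apply_update     -- callable(host), hypothetical: installs the update
    is_healthy       -- callable(host) -> bool, hypothetical health check
    wave_sizes       -- cumulative fraction of the fleet covered after each wave
    max_failure_rate -- abort threshold, checked per wave
    Returns (updated_hosts, halted).
    """
    updated = []
    done = 0
    for fraction in wave_sizes:
        # Each wave covers at least one new host and never overruns the fleet.
        target = min(len(hosts), max(done + 1, round(len(hosts) * fraction)))
        wave = hosts[done:target]
        if not wave:
            break
        for host in wave:
            apply_update(host)
        failures = sum(1 for host in wave if not is_healthy(host))
        updated.extend(wave)
        done = target
        if failures / len(wave) > max_failure_rate:
            return updated, True  # canaries are borking: halt the rollout
    return updated, False


if __name__ == "__main__":
    fleet = [f"host{n}" for n in range(1000)]
    touched, halted = staged_rollout(fleet, lambda h: None, lambda h: False)
    print(f"halted={halted}, machines touched={len(touched)}")
```

    With a health check that always fails, the run stops after the first 1% wave, so 10 machines out of 1000 are hit rather than all of them.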

    1. ecofeco Silver badge

      Re: Failures all around

      Ask the CxOs.

      IT has no control over the good ol boys networks.

  22. John.B

    New starter

    I heard it was Liz Truss's first day at CrowdStrike...

  23. frankyunderwood123

    Astounding lack of finger pointing at Microsoft - the real news story

    It seems that nobody in the wider media is apportioning a lot of blame on Microsoft.

    They've been, obviously, very quiet about all of this. The media focus is all on CrowdStrike.

    It should be noted that falcon hasn't impacted macOS or Linux users.

    The corporate I work for uses Crowdstrike across all operating systems - my work issued mac has falcond running.

    I can still use my mac to do my day job.

    The fact that Windows BSODs due to a third party service failure is the real news story here.

    Who the hell thinks it's a good idea to NOT bother to code in a failsafe scenario to cover a third party AV service provider failure?

    It's coding 101 - or it should be.

    One of the defining laws of software is Fail Gracefully.

    A complete failure of the OS to boot and for the workaround to be manual intervention on a machine by machine basis?

    That beggars belief.

    Why are so few people mentioning this glaringly obvious issue?
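
    To be concrete about what "fail gracefully" could mean for a content update: validate the new file before using it, and keep running on the previous one if it doesn't check out. A user-space sketch only; the JSON format and field names are invented for illustration and say nothing about what a real channel file looks like or how a kernel driver would do this.

```python
import json

REQUIRED_FIELDS = {"version", "rules"}  # hypothetical schema, for illustration only


def parse_content(raw_bytes):
    """Parse and validate an update; raise ValueError on anything malformed."""
    try:
        data = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"unparseable content: {exc}") from exc
    if not isinstance(data, dict) or not REQUIRED_FIELDS.issubset(data):
        raise ValueError("missing required fields")
    if not isinstance(data["rules"], list):
        raise ValueError("'rules' must be a list")
    return data


def load_with_fallback(new_bytes, last_known_good):
    """Return (content, used_fallback): the new content if valid, else the old one."""
    try:
        return parse_content(new_bytes), False
    except ValueError:
        # Fail gracefully: keep the old rules instead of falling over.
        return last_known_good, True


if __name__ == "__main__":
    previous = {"version": 1, "rules": []}
    content, used_fallback = load_with_fallback(b"\x00" * 16, previous)
    print(f"version in use: {content['version']}, fallback: {used_fallback}")
```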

    1. katrinab Silver badge
      Windows

      Re: Astounding lack of finger pointing at Microsoft - the real news story

      Huh

      Most of the headlines this morning were flat-out blaming Microsoft and didn't mention Crowdstrike at all.

    2. Anonymous Coward
      Anonymous Coward

      Re: Astounding lack of finger pointing at Microsoft - the real news story

      Oh, there's a simple workaround - boot to safe mode and fix it. Except to get to safe mode these days, you have to click things inside the running copy of Windows...

      1. cyberdemon Silver badge

        Re: Astounding lack of finger pointing at Microsoft - the real news story

        Does repeatedly whacking F8 like a maniac during boot no longer work?

        Been a long time since I last used/broke Windows

        1. collinsl Silver badge

          Re: Astounding lack of finger pointing at Microsoft - the real news story

          Nope, that went away with Vista IIRC (definitely gone by W8). Now you have to wait for Windows to "notice" it's not booted properly (handled as part of UEFI boot) several times (default is 3 IIRC) or reboot holding down a key (either CTRL or shift, can't remember which) and it'll take you to the boot menu next time it boots up.

  24. xyz Silver badge

    I'm lovin', the talking heads on the news...

    But, but, but my phone won't do anything when I press this and Ryanair customer service won't like talk to me and now they're saying they've got to do stuff on paper, but there's 2 queues.... I'm just trying to go on holiday. OMG end o' the world.

  25. WindyRidge

    FRIDAY: We must review our disaster recovery plan, backup bitlocker keys, delay auto updates...

    MONDAY: Last week is another country.

  26. tip pc Silver badge
    Coat

    immutable systems

    why does windows not have an immutable OS partition and a security partition for these things?

    absolute pain

    finding the borked systems, attaching the boot medium, then getting to the cli to run the commands. repeat thousands of times.

    will management appreciate us more now they see the impact of not getting things done right and dedication of teams to correct stuff?

    Done my bit now, off to the weekend!!

    1. Adair Silver badge

      Re: immutable systems

      That's my take also. This whole farcical cockup highlights a fundamental weakness in Windows as a frontline OS. It also highlights the potential value of running an immutable OS where IT armageddons can be handled gracefully and heavily mitigated against.

      1. tip pc Silver badge

        Re: immutable systems

        yep, should only really care about the app config & data

        everything else should be commodity & be able to be blown away and generically rebuilt on the fly if needed.

  27. Anonymous Coward
    Anonymous Coward

    "It is very disturbing that a single AV update can take down more machines than a global denial of service attack."

    It's very disturbing that any IT department thinks it's a good idea to rely on mindlessly pushing out updates for a system which creates a single point of failure for large sections of the economy.

  28. may_i Silver badge

    And yet

    Despite screwing over countless customers and bringing half of the global economy to a grinding halt,

    CrowdStrike shares have dropped just 12% at this moment.

    Clearly investors don't seem to care much.

    1. cyberdemon Silver badge
      Devil

      Re: And yet

      It's Friday, and it's the latter-half of July, nearly August.

      Investors are far too busy snorting cocaine and cavorting with hookers on their private yachts to notice that they might be slightly out of pocket next week.

      1. Androgynous Cupboard Silver badge

        Re: And yet

        Or perhaps all the traders being asked to sell stock are unable to do so because their computers are down…

    2. spireite Silver badge
      Mushroom

      Re: And yet

      How long recovery takes for customers, irrespective of a fix being made available today, is what is going to impact CS most in terms of 'goodwill' and share price.

      I imagine many will just yank that software out, period...

  29. J.G.Harston Silver badge

    When I woke up to blearily hearing the radio saying "CrowdStrike" had taken out loads of IT systems, I naturally concluded that CrowdStrike was today's DoS virus attack, and was primed to expect legions of IT engineers to delete, destroy, exterminate, eliminate CrowdStrike.

    Now it turns out to be an actual systems product. WTF names their software as though it is malicious software?

    1. cjcox

      "WTF names their software as though it is malicious software?"

      Have no idea. - Darktrace

    2. ecofeco Silver badge

      Exact same thing I thought at first.

  30. Lomax
    Angel

    Disk encryption on servers?

    I must have been missing something; why would you use Bitlocker encryption on a server that is already (one would hope) physically secure? I encrypt the disks in my laptops with LUKS, since being laptops they might end up being left on the tube after a wet night out, and desktops because burglary etc, but I have never considered that encrypting my server disks might also be important. I've mainly regarded disk encryption to be valuable where unauthorised physical access might be an issue, but perhaps I need to think again? In my (admittedly limited) view the main threat to a server comes via its network connection, and possible vulnerabilities to network originated attacks in the software running on them - but from that end the disks will already be decrypted anyway, right?

    1. Lomax
      Black Helicopters

      Re: Disk encryption on servers?

      I guess if you're worried about what the law might find on your servers in the event of a legal... situation... then perhaps it makes sense. But in that case won't you be required to hand over the keys anyway? I wouldn't know, because I'm not involved in any potentially criminal activity. Are they?

      1. collinsl Silver badge

        Re: Disk encryption on servers?

        UK law requires you to divulge passwords, PINs etc if required to by the police. Not sure about biometrics, think it would need some precedents set to determine that one.

    2. HereIAmJH Silver badge

      Re: Disk encryption on servers?

      I must have been missing something; why would you use Bitlocker encryption on a server that is already (one would hope) physically secure?

      Because so many large companies have been getting their asses handed to them by hackers that have broken into their systems. PCI and their insurance companies are beating them over the head for 'encryption at rest'. And BitLocker is the fastest, cheapest way to get there. Manglement doesn't understand that as soon as the OS boots, the data is unencrypted for anyone that can get logged into the system. To them, all their stored documents and free databases that don't support encryption are now safe.

    3. Tim Kemp

      Re: Disk encryption on servers?

      Sometimes policy requires disk encryption when using equipment you don't own such as public cloud.

      Yet another reason for keeping a copy of data on site.

    4. HoneyMonster

      Re: Disk encryption on servers?

      Other threats mostly revolve around your storage media leaving site. Disk or tape failures* will mean that from time to time drives or cartridges will leave site and you'll have no idea where they could end up. Sure, you could have a policy and process that means crushing or shredding but if the media is encrypted, it is no longer something to worry about and it ticks that all important compliance item on the audit.

      * A failed disk or tape isn't necessarily unreadable - it may well have just exceeded some read/write error threshold.

  31. J.G.Harston Silver badge

    The BBC News bod has just said "This only affects people running Microsoft".

    Microsoft what? Microsoft Word? Microsoft Teams? Microsoft Active Directory? Which Microsoft product?

    It's like saying "this issue only affects Electrolux" AARGHH!!! So my fridge is going to destroy my laundry????

    1. Anonymous Coward
      Anonymous Coward

      Microsoft Windows, and running CrowdStrike Falcon. Or any system that depends on that one being up (like, say, the Win machine being an AD.) Or any system that depends on that one...

  32. John Doe 12

    Microsoft At Fault Here

    So many stupid people posting in the comments here. Yes Crowdstrike messed up BUT the real villain here is Microsoft. If their crappy OS can be toppled by any app running on the Windows platform then questions need to be asked. Yet people want to throw all the blame at Crowdstrike. Seriously cop on everyone!!

    1. ecofeco Silver badge

      Re: Microsoft At Fault Here

      M$ fanbois are so far down the rabbit hole they've met the Lewis Carroll Red Queen.

      1. ecofeco Silver badge

        Re: Microsoft At Fault Here

        ...and hang on her every word.

    2. Peter Mount
      Facepalm

      Re: Microsoft At Fault Here

      Exactly. Throughout yesterday I had people saying I should stop blaming Microsoft, or both Microsoft & Crowdstrike, because it's all Crowdstrike's fault.

      To me it's both to blame:

      * CS for having the ability to just apply an update automatically & seemingly without any QA.

      * MS for not having that part of the OS being resilient to said updates.

      I now know CS is available for Linux & MacOS but (for Linux) it uses existing functionality to hook into the kernel, so the dodgy update wouldn't cause the system to fail.

      Note: I've never heard of CrowdStrike before yesterday morning (UK time).

    3. Rich 2 Silver badge

      Re: Microsoft At Fault Here

      I’m not defending MS - I look forward to the day when they burn in hell - but my understanding is that the crowdstrike software wedges itself into the boot process; it runs BEFORE windows boots up. So it’s not really Windows’ fault; the same mechanism would bugger-up ANY OS.

      It is very much MS’s fault for making an OS that is so crappy and full of holes that stuff like Crowdstrike is considered necessary in the first place though.

      1. Jou (Mxyzptlk) Silver badge

        Re: Microsoft At Fault Here

        Actually, such software is needed since the "layer 8 problem" is real. This is not really the OS's fault...

  33. Anonymous Coward
    Anonymous Coward

    We were quite lucky as we don't immediately patch but wait a few days for others to test them.

    There was an element of smugness today over the Crowdstrike strike not affecting us directly.

    Unfortunately we send/receive stuff from other partners whose infra was affected. As a result data ingestion fell off a cliff as they couldn't send to us.

    After they recover, we'll be hit by a barrage of data in the catchup.

    1. PerlyKing

      Re: wait a few days for others to test

      While this is more sensible than just letting it happen, it doesn't sound like a rock-solid security strategy.

  34. cjcox

    Whiners

    You got an apology. What else could anyone want?

  35. W.S.Gosset Silver badge
    Megaphone

    Problem identified: null pointer

    Null pointer, per this chap's inspection of a stack trace dump.

    So it tries to access part of a system driver and Windows throws its wobbly.

    Since this hasn't happened before, this seems to imply that the "channel" file is not simple data but either code, or code-config used to build & run code @ runtime.

    1. Anonymous Coward
      Anonymous Coward

      Re: Problem identified: null pointer

      > .. seems to imply that the "channel" file is not simple data but either code, or code-config used to build & run code @ runtime.

      Interesting .. now explain how a device driver can go corrupt :o

      1. W.S.Gosset Silver badge

        Re: Problem identified: null pointer

        Any number of ways but completely extraneous to this: that's not what happened. The system driver wasn't running to generate the stack trace dump; the CrowdStrike code was running.

        The CrowdStrike code sought to access memory in an OS-protected area; in this case, apparently, a system driver. The OS memory-protection kicked in and shut down the OS.

        The system driver was not the trigger, was not "running" to cause the problem.

        Analogy-rewrite of your question: "He attacked a shopkeeper and the police stepped in. Why did the shopkeeper DO this to us!?!" :D

  36. Anonymous Coward
    Anonymous Coward

    And thus…

    The great cloud testing trial ended poorly.

    Save money, get the customers to do the testing.

  37. Hurn

    Does no one remember the Dos Prompt?

    Official instructions say to "reboot (again) into Advanced mode, and then choose Safe Mode"

    Why bother, when there's a choice for Dos Prompt, RIGHT THERE. (And, if it's an old server you're working on with a KVM crash cart, you save yourself anywhere from 4 to 7 min of rebooting / checking memory / probing for boot drive / starting at the UEFI/BIOS screen, before getting to the menu where you can choose "Safe Mode".)

    Instead of all that, click on DOS Prompt, select Administrator, enter password, and poof! You're at an X: prompt. (Well, as long as you're not using BitLocker.)

    C: or D: or get to whichever partition has windows on it (can do a dir to make sure you're on the right one)

    cd Windows\System32\drivers\CrowdStrike

    del C-00000291*.sys

    and, if there's no error message, you're done.

    Type Exit, then click to start windows.

    Can save up to 10 minutes per older server. And, I've not seen this, anywhere.

    Guess Dos Prompts are too scary these days.
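
    If you are going the "boot to some other media" route instead and have the Windows volume mounted, the same delete can be scripted. A minimal sketch, assuming Python is available on the rescue media, the volume is already unlocked, and the directory names on disk are cased as in the path above; the mount point passed in is whatever yours happens to be. It defaults to a dry run.

```python
from pathlib import Path

# File pattern and directory as quoted in the steps above.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"
DRIVER_SUBDIR = Path("Windows") / "System32" / "drivers" / "CrowdStrike"


def remove_channel_files(volume_root, dry_run=True):
    """Delete matching channel files under volume_root; return the paths matched.

    volume_root -- where the (already unlocked) Windows volume is mounted.
                   The caller supplies this; nothing is auto-detected.
    dry_run     -- when True, only report what would be deleted.
    """
    driver_dir = Path(volume_root) / DRIVER_SUBDIR
    if not driver_dir.is_dir():
        return []
    matches = sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    import sys
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in remove_channel_files(root, dry_run=True):
        print(f"would delete: {path}")
```

    Check the dry-run output first, then call it with `dry_run=False`. An empty list means either the directory wasn't found under that root or nothing matched.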

  38. Anonymous Coward
    Anonymous Coward

    Making Windows a SPOF by storing secure keys on it, is impressively stupid!

    Anyway, maybe now it'll sink in that, if you want reliability or security, Windows is not the answer.

  39. Nematode Bronze badge

    Well, all this has been a useful reminder to once more check my own disaster recovery plans (which basically involve booting off a USB stick running the (non-Microsoft) backup/restore software and restoring the OS partition, where all my data is on a non-C: partition), plus verifying I had disabled BitLocker (I had), plus telling my rellies to do the same. It also prompted me to check the stupid Win 10 removal of Safe Mode bootability; as it's not part of my DR plan, I hadn't realised that was a potential issue. All whilst mindfully reminding myself that any trace of smugness often precedes a downfall.

    Paranoid? Moi? Oui! It's the safest way.

    1. Jou (Mxyzptlk) Silver badge

      Well, you CAN get the boot menu option back.

      Administrative CMD or booted from DVD with CMD box:

      bcdedit /set {bootmgr} displaybootmenu yes

      bcdedit /set {bootmgr} timeout 5

      Voilà: 5 seconds to press F8.

      But you'd still need your BitLocker recovery key.

  40. ChipsforBreakfast
    WTF?

    Avoidable?

    Having had a little time to look at this (thankfully we aren't huge CrowdStrike users and only had a few test machines with it on), it seems to me that Microsoft could have made this far, far less of an issue.

    Why, when the same driver file is repeatedly causing a boot failure (and Windows clearly knows what's causing the failure, it's right there on the BSOD) does their 'automatic repair' process not simply block the driver from loading?

    And where did the old 'last known good' boot option go? Was it perfect? No, but it's a hell of a lot better than rebuilding a whole OS or talking thousands of users through a less than intuitive recovery process.

    I feel that while Microsoft aren't to blame for the outage they certainly could, and should, have made recovering from such an issue far easier.

    1. Jou (Mxyzptlk) Silver badge

      Re: Avoidable?

      So I am not the only one missing HKEY_LOCAL_MACHINE\SYSTEM\ControlSet002 ...

  41. .thalamus

    There is a simpler fix…

    My work laptop was affected by this as I tend to leave it on in standby. So when I got up yesterday it was in recovery and on rebooting it BSODd with csagent.sys as the cause.

    Our IT department were out all over the place fixing servers and business critical systems yesterday, so any user support for this problem just wasn’t going to happen, rightly so.

    Microsoft posted an interesting recommendation (I think it was on their Azure blog or forum) to reboot “up to 15 times”, which would eventually allow CrowdStrike to pull the fixed update. I was sceptical but tried it - after 8 reboots I got to the login screen and the network connected (which it hadn’t done previously).

    Windows has a mechanism to skip loading drivers that prevent boot after a number of failures and this was seemingly triggered after 8 reboots.

    Trying to log in immediately resulted in the same BSOD (I assume the driver was loaded on login too). So I did the 8 reboots again, got to the login screen again with network connectivity and just left it there for half an hour. Then I logged in and all was well.

    So the fix doesn’t necessarily involve hands on - any user can perform 8 reboots and leave it on the login screen for a bit for the fixed update to be pulled. Hopefully this helps someone!

    1. Jou (Mxyzptlk) Silver badge

      Re: There is a simpler fix…

      > Windows has a mechanism to skip loading drivers that prevent boot after a number of failures and this was seemingly triggered after 8 reboots.

      THIS is the most important posting in this thread ! Thank you for that information!

      1. .thalamus

        Re: There is a simpler fix…

        I don’t understand why it isn’t more common knowledge or hasn’t been picked up by the tech media, although that might require some actual journalism instead of melodramatic pontification.

        It wasn’t a coincidence that it suddenly got to the login screen and the network connected after 8 reboots when it had crashed before that consistently - then immediately crashing again at login when the driver loaded. It’s clear that Windows declined to load the driver at boot because of the boot failures. Nothing to do with a “race condition” between receiving the good update and csagent parsing the faulty file.

  42. Anonymous Coward

    lots of lessons to learn.

    The comments I see popping up are kinda hilarious: all these IT pros who are only now discovering the weaknesses in their DR plans. The most amusing one is the lack of BitLocker recovery planning.

  43. FuzzyTheBear
    FAIL

    Matter of trust ..

    There's an important element here called trust. After this, who will reasonably be trusting CrowdStrike? This is going to cost companies a hell of a lot of money. Just take the group of travellers that will be claiming compensation for their delays, stays in hotels, restaurant bills etc. We're talking tens of billions in monetary damages. How anyone can trust them after this is a big question. Will customers take chances and stick with them? Personally I wouldn't. Too risky. Having a fix will not be enough to recover from the disaster they are responsible for.

    1. Paul Crawford Silver badge

      Re: Matter of trust ..

      Those who still trust MS after many decades of evidence to the contrary?

      Reports will be written, "lessons will be learned", and folks will go back to the same old shit again. Fundamentally there are several issues here, but the dependency on specific vendors will mean the cost & trouble of proper fixes is too much. MS know that, as do AV suppliers.

  44. Jou (Mxyzptlk) Silver badge

    IMHO Microsoft should see this as big fat QA warning...

    Currently the priority is 110% marketing, not quality.

    So they need to beef up their QA.

    And as for the marketing: Why don't they concentrate on actually good features? Deduplication (since Server 2012, working very well), SMB compression, robocopy /iorate, shadow copies (with better defaults for client OS than now) and so on. But no, Paint gets layers and 3D, but so half done that no one actually uses it.

  45. greenbaffin

    Butterfly causes hurricane in the digital economy

    Imagine a world where a single software update is the butterfly that flaps its wings and causes a hurricane in the digital economy. Hospitals go on a coffee break, banks play hide and seek, airports take an unexpected flight, and businesses everywhere hit the 'pause' button. And in the midst of this chaos, someone suggests a digital currency? That's like bringing a virtual knife to a real gunfight. Long live the king – cold, hard cash!

  46. Anonymous Coward

    CrowdStrike is not worth 83 Billion Dollars

    Thesis: Crowdstrike is not worth 93 billion dollars (at time of writing).

    Fear: ‘CrowdStrike is an enterprise-grade employee spying app masquerading as a cloud application observability dashboard.’

    1. Anonymous Coward

      Re: CrowdStrike is not worth 83 Billion Dollars

      How else are we supposed to see what the CEO types as he's typing it?

  47. cosymart
    Coffee/keyboard

    Pot and Kettle

    Quote from Microsoft: "says the incident highlights how important it is for companies such as CrowdStrike to use quality control checks on updates before sending them out. It’s also a reminder of how important it is for all of us across the tech ecosystem to prioritize operating with safe deployment and disaster recovery using the mechanisms that exist." I needed a new keyboard after reading this.

  48. Flywheel
    Thumb Up

    The right people affected for a change!

    most execs' laptops are in infinite bsod boot loops

    Great! So, chances are that something will be done to stop this happening again - maybe a more diverse portfolio of systems? Even better if it included bean counters.

  49. Plingboots

    I am sure this could be fixed using PXE and deployment software and some communication.

    This would have been all over by lunchtime, people just cashing in.

  50. M.

    More Like "CrowdStroke"

    We've taken to labeling this event "CrowdStroke"

    It has less distasteful connotations, and seems more to the point of what actually happened - computers went "brain-dead"

    Cheers!

  51. theguy44

    Test, test, test again

    CrowdStrike got popular cos it was sold to higher level exec people who had no idea..

    That multiplied: as with anything in IT, the more people have it, the more the higher-ups want it.

    (proper IT Crowd fodder)

    One borked update taking down so much stuff (a whole bunch of the 8.5m would be servers, which might be serving hundreds/thousands of people who aren't directly locally affected) is very much not good. But the specifics of removing it are a bag of wtf for IT admins.

    How did this not get found during any stage of testing? As an IT person, I would hope something that is potentially going to affect your entire customer base (of lots of large key customers), should have the s**t tested out of it..

  52. Stuart Castle Silver badge

    Re "Talking a warehouse operator through the intricacies of BitLocker recovery keys and command prompts is not for the faint-hearted!"

    It's also not great having to give said warehouse operator access to an account with admin rights on the machine. That said, when the machine is back online and talking to whatever management system you have, you can change it.
