EU gave CrowdStrike the keys to the Windows kernel, claims Microsoft

Did the EU force Microsoft to let third parties like CrowdStrike run riot in the Windows kernel as a result of a 2009 undertaking? This is the implication being peddled by the Redmond-based cloud and software titan. As the tech industry deals with the fallout from the CrowdStrike incident, Microsoft is facing questions. Why is …

  1. ComputerSays_noAbsolutelyNo Silver badge

    Dave Plummer has a different take on this

    CrowdStrike IT Outage Explained by a Windows Developer

    https://www.youtube.com/watch?v=wAzEJxOo1ts

    1. tony72

      Re: Dave Plummer has a different take on this

      For those unwilling or unable to youtube;

      In the YouTube video titled "CrowdStrike IT Outage Explained by a Windows Developer," retired software engineer and Windows developer Dave explains the cause of the recent CrowdStrike IT outage. The issue was due to a bad update to CrowdStrike software that resulted in blue screens on various machines worldwide. Dave discusses the significance of CrowdStrike being on machines in the first place and the consequences of a kernel driver failure. He also shares his experience as a Microsoft developer in the 1990s and the importance of understanding the differences between kernel mode and user mode. The speaker then delves into the concept of kernel mode and user mode, explaining that only a few things, such as thread scheduling, Heap manager, and device drivers, run in kernel mode due to its access to hardware. CrowdStrike's Falcon security product requires kernel-level access to function effectively, and writing a device driver for Falcon allows it to reside in kernel mode and access system data structures and services. However, the use of dynamic definition files instead of Microsoft's WHQL certification for drivers could potentially contain unsigned and unknown code that runs in kernel mode, posing a security risk. The precise cause of the IT outage was a null pointer issue in a dynamic data file downloaded as a Cy file, which contained only zeros instead of pcode or malware definitions. The CrowdStrike driver that processes and handles these updates is not very resilient and lacks adequate parameter validation, leading to the entire system crashing and depositing users into the recovery blue screen. Windows does offer facilities to boot without certain drivers, but CrowdStrike marked their driver as a boot driver, requiring physical access to the machine to delete the problematic file and fix the issue.

      via www.summarize.tech

      1. Pascal Monett Silver badge

        "CrowdStrike marked their driver as a boot driver"

        So it is down to the shitty Windows environment once again. No surprise there.

        The issue might be more in the fact that it seems to be very difficult to isolate the kernel in an OS while granting security access to protective measures, all the while keeping the hoi polloi at bay and the system secure.

        Too late to think about a redesign of the OS architecture. We'll just have to make do with what we have.

        1. richardcox13

          Re: "CrowdStrike marked their driver as a boot driver"

          Any OS will have this problem with a kernel component.

          E.g. Red Hat having a kernel panic due to CrowdStrike: https://access.redhat.com/solutions/7068083

          Edit: note I make no comment on whether the use of definitions + driver in kernel mode is a good idea[1] rather than the driver just being a bridge to user mode worker where everything happens[2].

          [1] I don't think it is.

          [2] The chance of a company like CrowdStrike getting this right (handling the worker failing...) seems low.
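The "bridge to a user-mode worker" design mentioned above, including the hard part of handling a worker that fails, can be sketched in ordinary Python, with a supervisor object standing in for the driver. Everything here is an assumption made for illustration: the `DEFS` file signature, the worker script, and the names `load_in_worker` and `Supervisor` are hypothetical and describe no real product.

```python
import subprocess
import sys

# Hypothetical worker: parses untrusted definitions in its own process.
WORKER_SOURCE = r"""
import sys
blob = sys.stdin.buffer.read()
if not blob.startswith(b"DEFS"):
    raise SystemExit(2)          # reject; a genuine crash is also non-zero
print(len(blob) - 4)             # stand-in for "parsed result"
"""


def load_in_worker(blob: bytes, timeout: float = 30.0):
    """Return the worker's result, or None if it failed, crashed, or hung."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE],
            input=blob,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    return int(proc.stdout.decode().strip())


class Supervisor:
    """Stands in for the privileged side: it never parses the file itself."""

    def __init__(self):
        self.active = None       # last definitions that loaded successfully

    def apply_update(self, blob: bytes) -> bool:
        result = load_in_worker(blob)
        if result is None:
            return False         # keep running on the previous definitions
        self.active = result
        return True


supervisor = Supervisor()
first_ok = supervisor.apply_update(b"DEFS" + b"abc")
second_ok = supervisor.apply_update(bytes(16))   # a file full of zeros
```

The design choice is that a bad file can only take down the worker; the supervisor observes the failure and carries on with whatever it had before.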

          1. DS999 Silver badge

            Linux only had problems if the Linux admins were stupid

            On Linux Crowdstrike's software can work either as a kernel module or can work entirely in user mode using eBPF, something which is not an option for their Windows version.

            While it isn't impossible to crash the kernel operating as user mode (i.e. if there is a kernel bug it tickles) you can't get into a reboot loop that way and that's what caused most of the chaos Friday - requiring manual intervention in most cases rather than being able to deliver a fix (even if "disable crowdstrike" was the temporary fix) via the network to every affected machine at once.

            1. Anonymous Coward

              Re: Linux only had problems if the Linux admins were stupid

              Let's not forget it was made a lot more complicated by Bitlocker. Not that I'm against it, but it did add to the trouble.

              I'm hoping companies will update their BCM with such a scenario and go back to diversifying their IT. I know that Microsofties want it all, but having a few systems with a different OS around gives you at least a fighting chance if you have a cascade failure.

              Also, keep some PXE resources on standby. Pixies can help :)

              1. Nifty

                Re: Linux only had problems if the Linux admins were stupid

                "Let's not forget it was made a lot more complicated by Bitlocker"

                I use Bitlocker on all my Windows machines, in fact I think it's a default on Windows 11. While there's no risk to me from CrowdStrike, this is a timely reminder to back up the recovery keys. In the past I've used the key stored in a file to access an old drive I'd removed. What I'd like to know is, if you're trying to do a Safe Mode recovery, do you need to enter the Bitlocker key via the keyboard? Really?

                1. Anonymous Coward

                  Re: Linux only had problems if the Linux admins were stupid

                  The problem is that it has great trouble in reading smoke signals ..

                2. John Brown (no body) Silver badge

                  Re: Linux only had problems if the Linux admins were stupid

                  " if you're trying to do a Safe Mode recovery, do you need to enter the Bitlocker key via the keyboard? Really?"

                  Yes, and worse, it's the recovery key that's required, not the user's Bitlocker password.

                  1. Smartypantz

                    Re: Linux only had problems if the Linux admins were stupid

                    KISS has gone out the Window(s) This is a result, it will only get worse..

                  2. VicMortimer Silver badge
                    Megaphone

                    Re: Linux only had problems if the Linux admins were stupid

                    And the ONLY time a recovery key should EVER be required is for a forgotten password. There's no excuse for Micro$loth's bad behavior here.

                    The only time I've EVER had to use a FileVault key was for a user's forgotten password. Booting a Mac into safe mode just takes a user password.

              2. PoisonTheData

                Re: Linux only had problems if the Linux admins were stupid

                Well now, if Linux got into a reboot loop and an organization mandated LUKS, you'd have the same problem - or maybe worse (At least we could easily look up the MS recovery key).

                If you go into building security policy petrified about all your hosts going down at once, you'd never implement any best practices which might make recovery more difficult.

                Org was prepared, and sure, recovery took about 3 minutes longer per host because of Bitlocker, as we had to read off the number-only recovery key to the user. Are you suggesting that orgs should leave all HDDs unencrypted "just in case?" I don't think 3 minutes extra because of FDE is too much to ask.

                1. Smartypantz

                  Re: Linux only had problems if the Linux admins were stupid

                  Maybe proper separation of code and user data would have helped, then you didn't have to emit all that CO2 to stupidly encrypt and decrypt the OS code over'n'over.

              3. rlightbody

                Re: Linux only had problems if the Linux admins were stupid

                What it needed was a few machines around that weren't running Crowdstrike - that's the problem here. Some machines running the Defender endpoint instead wouldn't have been impacted at all.

        2. Blazde Silver badge

          Re: "CrowdStrike marked their driver as a boot driver"

          difficult to isolate the kernel in an OS while granting security access to protective measures

          It's impossible, because well designed malware could hide inside the most privileged part of the OS and remain undetectable by security software. We see even hardware enclaves don't really help for that, and their capabilities fall well short of preventing blue screens. A microkernel or whatever you're thinking wouldn't solve that.

          One big reason Windows security is a huge problem is that it's such a ubiquitous OS that any threat actor producing malware for it can deploy and re-deploy that malware many times against most targets they come across, making it a cheap target to attack. If they also have to contend with an unknown OS-level security product on each target then it does raise the bar. Just not very much if that unknown security product is invariably 'Crowdstrike'.

        3. Optimaximal

          Re: "CrowdStrike marked their driver as a boot driver"

          "So it is down to the shitty Windows environment once again. No surprise there."

          I mean, it's not...

          "The precise cause of the IT outage was a null pointer issue in a dynamic data file downloaded as a Cy file, which contained only zeros instead of pcode or malware definitions. The CrowdStrike driver that processes and handles these updates is not very resilient and lacks adequate parameter validation, leading to the entire system crashing and depositing users into the recovery blue screen."

          1. Roland6 Silver badge

            Re: "CrowdStrike marked their driver as a boot driver"

            Whilst the null pointer was down to CrowdStrike, the blue screen and inability of Windows Server and Desktop to handle an illegal memory reference is wholly down to Microsoft.

            It is perhaps good luck that Windows Server (which probably should be regarded as an enhanced desktop operating system), hasn’t been tripped up before now on such a scale. Perhaps this along with 365, will hasten the demise of Windows Server.

      2. werdsmith Silver badge

        Re: Dave Plummer has a different take on this

        Would be interesting to find out how the faulty .cy file got out into the update. Presumably Crowdstrike had tested with a good version, but the bad one sneaked into the package for release.

        1. TonyHoyle

          Re: Dave Plummer has a different take on this

          From what others have said crowdstrike bypassed their own rollout procedure to force the update straight onto production networks, bypassing staging.

          So failures all round.. not only did they not test internally (testing with a different version than you send out is not proper testing) they bypassed measures that would have caught this before it did damage.

          And of course crowdstrike are able to do this with no consequences because the companies all signed contracts absolving them of liabilities.. the millions spent on the cleanup will be borne by others.

          1. werdsmith Silver badge

            Re: Dave Plummer has a different take on this

            "no consequences" ??

            First let's see what the market has to say about it, they have had a huge amount of stock value lost.

            1. mtrantalainen

              Re: Dave Plummer has a different take on this

              I agree. The correct wording would have been "they didn't have any contractual or legal obligation to do any better work".

            2. JoeCool Silver badge

              Re: Dave Plummer has a different take on this

              I would watch for the commercial damages lawsuits. Stock price will affect Blackrock, etc. but payouts will impact operations and management.

            3. Charlie Clark Silver badge
              Stop

              Re: Dave Plummer has a different take on this

              Er, the stock market is not the market and volatility after incidents like this is to be expected. However, unless there are any successful lawsuits, you can expect things to revert to the mean. Worst case is the company gets bought by the competition and rebranded.

              1. John Brown (no body) Silver badge

                Re: Dave Plummer has a different take on this

                "Worst case is the company gets bought by the competition and rebranded."

                ...reducing the enterprise AV ecosystem even more and thus creating an even larger target for the future.

                1. Charlie Clark Silver badge

                  Re: Dave Plummer has a different take on this

                  There is that, too.

                  Doubles all round!

            4. Geoff Campbell Silver badge
              Boffin

              Re: Stock value

              As I type, CrowdStrike stock is 76.25% higher than it was 12 months ago. So the markets have pretty much discounted the entire thing.

              GJC

          2. richardcox13

            Re: Dave Plummer has a different take on this

            > From what others have said crowdstrike bypassed their own rollout procedure to force the update straight onto production networks, bypassing staging.

            Exactly: this isn't the failure of an individual, but an organisational operations failure where a choice to "push and be damned" led to being damned.

            Deeper is the clear failure to do fuzz, and similar, testing within an integration test suite in the publication pipeline, as this would quickly have exposed the driver's lack of input validation with respect to the definitions.

          3. This post has been deleted by its author

          4. Anonymous Coward

            Re: Dave Plummer has a different take on this

            So, basically Crowdstrike has done a Boeing?

            What a surprise..

          5. UnknownUnknown

            Re: Dave Plummer has a different take on this

            Go DevOp’s. You can just add the carnage to the next Sprint.

            Change Control … Pft… Agile. Innit.

            1. JoeCool Silver badge

              Oh please

              Nowhere does Agile demand that you do stupid sh*t.

              The people doing stupid are good enough to wreak havoc with any (dev) process.

              You know the words: Expedite. Special. Emergency. Critical. Executives are watching the dates. We're professionals.

              1. Nitromoors

                Re: Oh please

                and add the words - My bonus depends on this, and there you have it. Disaster by default.

          6. MrBanana

            Re: Dave Plummer has a different take on this

            Not so much bypassing CrowdStrike's policy, it screwed the end user by not allowing them to make staged updates. There was no way for sysadmins to do a phased rollout, which would have caught this a lot quicker, and caused a lot less damage.

          7. John Brown (no body) Silver badge

            Re: Dave Plummer has a different take on this

            "And of course crowdstrike are able to do this with no consequences because the companies all signed contracts absolving them of liabilities.. the millions spent on the cleanup will be borne by others."

            Depending on jurisdiction, those contracts (or specific clauses) may be invalid or not cover the alleged actions, especially if as above it can be shown Crowdstrike didn't follow their own procedures and the clusterfuck was caused by their own incompetence. Just having a clause in a contract doesn't mean you can pass the buck.

          8. Cliffwilliams44 Silver badge

            Re: Dave Plummer has a different take on this

            Hmmm, could it be to keep the suspicious departure of Joe Biden out of the news cycle?! One wonders!

            1. John Brown (no body) Silver badge

              Re: Dave Plummer has a different take on this

              Oh FFS! Can't you read a calendar? If I was as paranoid as you I might think Biden's departure was to get Crowdstrike out of the news headlines!!

            2. Tunnsie

              Re: Dave Plummer has a different take on this

              Careful son. Liking yourself can lead to blindness

          9. pig

            Re: Dave Plummer has a different take on this

            "bypassing staging" I might fire myself if I did this, or ordered it, in my job.

            To do it for this software, at this scale..... someone either had big balls, or a criminal lack of understanding of the risk.

            1. UnknownUnknown

              Re: Dave Plummer has a different take on this

              That’s why eCAB’s are approved by C-Suite people riding their own balls.

          10. Anonymous Coward

            Re: Dave Plummer has a different take on this

            This is why the corporate idiocy of "We want to buy a commercial service contract so we have somebody to sue" is so incredibly stupid.

            They don't have somebody to sue. They've got the same liability exclusion that would be had with free software. They just paid lots of money to get it.

        2. mtrantalainen

          Re: Dave Plummer has a different take on this

          Crowdstrike has not published any detailed information about what caused this failure. The fact that a .sys file full of null bytes got distributed as official version suggests that there was zero testing by Crowdstrike.

          Until they document publicly a LOT better testing and verification process, I would simply get rid of any software by that vendor.

          Kernel mode drivers should have automated testing with 100% code coverage AND mutation testing running 24/7. I can guarantee that Crowdstrike had neither or this problem would never have occurred.

          1. Tessier-Ashpool

            Re: Dave Plummer has a different take on this

            Maybe their drivers do have automated tests, but it still has to be deployed at the end of the day. If there's an esoteric problem in the deployment process, things could go awry.

            I imagine heads are banging together in Crowdstrike wondering how they can/should stagger their updates and have better eyes on the results, so that failures are less catastrophic. Doubtless the updated servers were expected to continue sending telemetry back home. Why didn't something detect the absence of signals to stop the update in its tracks?
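The idea raised above, staggering an update and stopping it when updated machines fall silent, can be sketched as follows. This is a toy model under stated assumptions: the wave sizes, the 95% health threshold, and the callbacks `push_update` and `is_reporting` are hypothetical, and nothing here reflects how CrowdStrike's distribution actually works.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    updated: List[str] = field(default_factory=list)
    halted: bool = False
    reason: str = ""


def staged_rollout(
    hosts: List[str],
    push_update: Callable[[str], None],
    is_reporting: Callable[[str], bool],
    wave_percents: Sequence[int] = (1, 10, 100),   # cumulative share of the fleet
    min_healthy: float = 0.95,
) -> RolloutResult:
    """Push to growing waves and stop if freshly updated hosts go quiet."""
    result = RolloutResult()
    done = 0
    for percent in wave_percents:
        target = min(len(hosts), max(done + 1, len(hosts) * percent // 100))
        wave = hosts[done:target]
        for host in wave:
            push_update(host)
            result.updated.append(host)
        done = target
        # A real system would wait for a soak period before this check.
        healthy = sum(1 for host in wave if is_reporting(host))
        if wave and healthy / len(wave) < min_healthy:
            result.halted = True
            result.reason = f"only {healthy}/{len(wave)} hosts reporting after update"
            return result
        if done >= len(hosts):
            break
    return result


# Simulate an update that silences every host it touches.
fleet = [f"host{i:03d}" for i in range(200)]
silenced = set()
outcome = staged_rollout(fleet, silenced.add, lambda host: host not in silenced)
```

With a 200-host fleet the first wave is two machines, so the simulated bad update stops there instead of reaching the other 198.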

          2. TheMeerkat Silver badge

            Re: Dave Plummer has a different take on this

            It did publish enough for anyone understanding software to know that it was not a software update distributed but what effectively is a bunch of malware signatures.

      3. Doctor Syntax Silver badge

        Re: Dave Plummer has a different take on this

        "The CrowdStrike driver that processes and handles these updates is not very resilient"

        And there's the real problem. Anything with that privilege needs to be very resilient.

        Does the driver require signing by Microsoft to be allowed this access? If so then Microsoft need to exert some strict QA before doing so. And, yes, I recognise that there might be a slight problem/irony (choose according to personal preference) there.

        1. MrBanana

          Re: Dave Plummer has a different take on this

          If you have time, watch Dave's YouTube video, less than 15 minutes, very informative. At least read the second comment above. Yes the driver is certified and signed. But, it has the ability to dynamically load other code that is not certified or signed. In this particular case it was not tested either. In itself it is a problem, but made much worse by CrowdStrike being able to immediately push the dynamic code update to all computers in its install base worldwide, bypassing any phased rollout that a normal sysadmin team would employ.

      4. Blazde Silver badge

        Re: Dave Plummer has a different take on this

        "Null pointer issue" in Windows driver is the silver lining in this tragedy. It's a poster-child-in-waiting for getting Rust used in future drivers. https://techcommunity.microsoft.com/t5/surface-it-pro-blog/open-source-rust-driver-development-platform/ba-p/3974222

        1. Anonymous Coward

          Re: Dave Plummer has a different take on this

          > "Null pointer issue" in Windows driver

          We know it was a file of nulls. Dave Plummer reckoned it should have been full of pcode and/or some definition data.

          Do you have a citation for your implied claim it was instead meant to be full of directly executable data/code which led to (some of) the zeroes being deref'ed as a pointer by the CPU?

          1. FIA Silver badge

            Re: Dave Plummer has a different take on this

            We know it was a file of nulls. Dave Plummer reckoned it should have been full of pcode and/or some definition data.

            Do you have a citation for your implied claim it was instead meant to be full of directly executable data/code which led to (some of) the zeroes being deref'ed as a pointer by the CPU?

            It wasn't executed directly, but it wasn't validated. At about 9:36 Dave Plummer references a crash dump, the instruction is loading a register with data pointed to by an address in another register. That address is a very low value, indicating that probably an unchecked zero pointer had just been added to (to get the offset in the desired structure).

            So it doesn't have to be directly executable to be blindly used as an address pointer by their driver it seems. :(

          2. Blazde Silver badge

            Re: Dave Plummer has a different take on this

            That's not my claim at all. You're confusing zeros in the config file with the null pointer which also happens to be zero. The null pointer clearly derives from a failure to initialise some object and the bug is that the privileged driver code does not check that object initialised correctly before accessing it. It's a bread-and-butter C/C++ failure scenario which doesn't happen in (non-unsafe) Rust because in Rust the type of an object which doesn't exist is always different from the type of an object which does exist, and your code is forced to handle both cases at minimum in memory-safe ways.

            The corrupted config file happened to trigger the pre-existing driver bug. It'll be interesting to see whether the bug itself was exploitable. Null pointer dereferences typically aren't but there might be some more complex behaviour behind the bug that was exploitable.
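The pattern discussed above, a value taken from a file and used as an address without checking, can be illustrated in Python, where byte offsets into a buffer play the role of pointers. This is a sketch only: the file layout, the `DEFS` signature, and the names `parse_rule_table` and `lookup_rule` are hypothetical and say nothing about CrowdStrike's real format or driver. Returning `Optional` mimics, loosely, how Rust forces the "object does not exist" case to be handled.

```python
import struct
from typing import List, Optional

MAGIC = b"DEFS"                    # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sI")     # signature, number of table entries
ENTRY = struct.Struct("<I")        # each entry: byte offset of a rule record


def parse_rule_table(blob: bytes) -> Optional[List[int]]:
    """Return validated record offsets, or None if the file is unusable."""
    if len(blob) < HEADER.size:
        return None
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        return None                # an all-zero file is rejected here
    table_end = HEADER.size + count * ENTRY.size
    if table_end > len(blob):
        return None                # entry table would run past the file
    offsets = []
    for i in range(count):
        (off,) = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        # A zero offset, or one outside the record area, is the analogue
        # of a null or wild pointer and must never be followed.
        if off < table_end or off >= len(blob):
            return None
        offsets.append(off)
    return offsets


def lookup_rule(blob: bytes, index: int) -> Optional[int]:
    """Read the first byte of rule `index`, handling the 'no table' case."""
    offsets = parse_rule_table(blob)
    if offsets is None or not 0 <= index < len(offsets):
        return None                # absence is handled, not dereferenced
    return blob[offsets[index]]


good = HEADER.pack(MAGIC, 1) + ENTRY.pack(12) + b"\x2a"
zeros = bytes(len(good))
```

Here `lookup_rule(good, 0)` reads the record byte, while `lookup_rule(zeros, 0)` returns `None` instead of following a zero offset.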

      5. martinusher Silver badge

        Re: Dave Plummer has a different take on this

        I might be a bit thick but the explanation he gave as to who/what/why made sense but at the same time completely undercut the MSFT "CYA" explanation ("Its the EU wot made us do it"). When you strip the noise aside what you've got is Crowdstrike figuring out how to run their code inside not just the kernel ring but at the bootstrap level without going through all that tedious and time consuming Windows certification business. Even so the fault might have been benign if Crowdstrike had thought through the "What Ifs?" -- Murphy's Law being what it is anything that can go wrong will eventually go wrong -- so software design has to reflect this and so include a get out of jail free card of some sort.

        From a real time/embedded designer's perspective this belongs under "Rookie Mistake". But it is fun seeing the corporate comms types wriggling around trying to point the finger (and legal hasn't even begun to get warmed up yet). (Obviously the blame will eventually fall on some hapless middle manager or even an unlucky programmer, it's the way things are done -- you take someone outside and shoot them "pour encourager les autres".)
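One possible shape for the "get out of jail free card" mentioned above is a crash-loop guard: count boot attempts against newly delivered content and fall back to the last known-good version after repeated failures. The sketch below is an assumption-laden illustration, not a description of any real product; the state file, the threshold `MAX_FAILED_BOOTS`, and the class `ContentLoader` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

MAX_FAILED_BOOTS = 2   # assumed threshold, purely illustrative


class ContentLoader:
    """Tracks boot attempts so bad content cannot cause an endless crash loop."""

    def __init__(self, state_file: Path):
        self.state_file = state_file
        self.state = {"pending": None, "known_good": None, "failed_boots": 0}
        if state_file.exists():
            self.state.update(json.loads(state_file.read_text()))

    def _save(self):
        self.state_file.write_text(json.dumps(self.state))

    def stage(self, version: str):
        """Record newly downloaded content as pending (not yet trusted)."""
        self.state.update(pending=version, failed_boots=0)
        self._save()

    def choose_at_boot(self):
        """Called early in boot, before any content is parsed."""
        if self.state["pending"] is None:
            return self.state["known_good"]
        if self.state["failed_boots"] >= MAX_FAILED_BOOTS:
            self.state.update(pending=None, failed_boots=0)   # abandon it
            self._save()
            return self.state["known_good"]
        # Count the attempt before loading: if we crash, the count survives.
        self.state["failed_boots"] += 1
        self._save()
        return self.state["pending"]

    def mark_boot_ok(self):
        """Called once the system has stayed up; promotes pending content."""
        if self.state["pending"] is not None:
            self.state.update(
                known_good=self.state["pending"], pending=None, failed_boots=0
            )
            self._save()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    loader = ContentLoader(path)
    loader.stage("v1")
    loader.choose_at_boot()
    loader.mark_boot_ok()                      # v1 becomes known-good
    ContentLoader(path).stage("v2-bad")
    # Three boots in a row; the first two "crash" before mark_boot_ok().
    picks = [ContentLoader(path).choose_at_boot() for _ in range(3)]
```

The third boot selects `v1` again, so the machine recovers without anyone deleting a file by hand.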

      6. that one in the corner Silver badge

        Re: Dave Plummer has a different take on this

        > downloaded as a Cy file, which contained only zeros instead of pcode or malware definitions

        So, we learn that zero is not NOP in their pcode?

        Then again, NOP is hardly ever opcode zero, which can be considered unfortunate, as getting a file full of zeroes is one of THE classic blunders[1] (just ahead of a file full of all ones or a serial connection full of curly braces).

        And if they'd just read it as good old fashioned ASCII, a sequence of NULs would just get ignored; almost as if they knew what they were doing back in 1963.

        62 years later...

        [1] The most famous of which is, ‘never get involved in a land war in Asia'

        1. that one in the corner Silver badge

          Re: Dave Plummer has a different take on this

          > back in 1963

          > 62 years later...

          We find out why I've been getting complaints about post-dating cheques by a year.

      7. cyberdemon Silver badge
        Headmaster

        Re: Dave Plummer has a different take on this

        Thanks for letting me avoid YouTube, but that summariser needs to learn where to add a paragraph break.

      8. shawn.grinter

        Re: Dave Plummer has a different take on this

        I watched this too and it's an excellent review. One extra piece of information is that CrowdStrike moved their development to India in Feb 2024.

        Related, I couldn't possibly comment

        :-)

      9. Tridac

        Re: Dave Plummer has a different take on this

        Null pointer (illegal memory address) access should be trapped by the cpu, often with hardware interrupt posted to the OS. It should never bring whole OS down, this is not 1985. Such a fundamental requirement in OS design, it's amazing that it wasn't handled properly. Still, this is windows...

        1. cyberdemon Silver badge
          Devil

          Re: Dave Plummer has a different take on this

          Well, evidently it IS trapped by the CPU, but the OS doesn't know what to do with the interrupt except halt and display a sadface on a blue background

        2. Roland6 Silver badge

          Re: Dave Plummer has a different take on this

          > Null pointer (illegal memory address) access should be trapped by the cpu

          From what I can ascertain this has been possible on x86 CPUs since the 286. The questions are thus: does Windows fully use the Intel virtual address space management capabilities, and does Windows have an interrupt handler to handle illegal memory address/reference events and thus gracefully handle these events without tripping to the blue screen of death?

          The follow on question is: what do other OS’s do in this situation(*)

          (*) By other, I don’t just mean Linux and Unix and their variants, I include VMS (remember David Cutler famously drew on VMS when designing NT) etc. ie. Memory safe programming goes deeper than just Rust vs C, it also needs the OS and hardware platform to implement memory safe features.

      10. jack d

        Re: Dave Plummer has a different take on this - but from the legal standpoint...

        I'm no lawyer, but I realize that when you sign a contract for IT services or software with support, you agree to certain vendor liability limitation clauses. But all this is based on an understanding that the Parties to the agreement act in good faith and in a professional manner. In case of CrowdStrike, judged from what we know already, the company acted with gross negligence, unprofessionally, not according to standards and presumably without proper supervision. All this is a good basis for claiming damages due to gross negligence and acting not based on industry standards.

    2. steviebuk Silver badge

      Re: Dave Plummer has a different take on this

      Microsoft has gone to shit with their QA since Satnav took over.

      This was both their faults. CrowdStrike for sending a cocked up file. And Microsoft's fault for allowing it to use a unique method to update the driver. Everyone else has to get recertified yet to avoid this, CrowdStrike was allowed to update from a file. That's a Microsoft fault not EU fault.

      Microsoft is just pissed because they were forced into competition: "See. If you had just let us monopolize, none of this would have happened"

      Oh go fuck a duck MS and Sat Nav

  2. blackcat Silver badge

    WHQL

    This vid is worth a watch

    https://www.youtube.com/watch?v=wAzEJxOo1ts

    The crowdstrike driver was WHQL tested and signed. There are some checks and balances on what can run at that level. It just seems to be at this time that the kernel driver comes up a little short on data validation.

    1. katrinab Silver badge
      Alert

      Re: WHQL

      But the driver executes code from a different file that isn't WHQL tested, and you can replace that file without installing a new driver, but the driver still behaves differently when you do.

      I can't watch your video link right now, but if you are linking to the David Plummer video, I watched that last night, and he explains that.

      1. blackcat Silver badge

        Re: WHQL

        That is the slightly shocking bit that MS approved of this setup. You should not be able to feed new code to something running in the kernel space from the user space and certainly not without huge amounts of checking.

        Yeah, its Dave's video.

        1. Doctor Syntax Silver badge

          Re: WHQL

          Without ensuring the driver performs adequate testing it should not have been signed at all.

        2. richardcox13

          Re: WHQL

          Remember WHQL validates that a driver calls APIs correctly, doesn't make a mess of interrupt handling, etc.

          That there was a dependency on a definition file would not be in scope, because that is not the kind of thing that drivers for hardware devices have done historically. It would be great to see WHQL being updated to broaden tests.

          Even better would be MS saying now "no third party drivers in the kernel" (Windows has long supported non-kernel drivers). However that would depend on MS being willing to fight through US and European courts that it isn't an anti-competitive measure (ie. third party software writer's laziness is an insufficient reason to allow kernel access).

          1. mtrantalainen

            Re: WHQL

            All we would need is "WHQL acceptance is not possible if the driver reads ANY data files and there is no automated fuzz testing enabled 24/7". And WHQL testing would include at least simple fuzz testing by intentionally corrupting any data files the driver loads.
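The "simple fuzz testing by intentionally corrupting any data files" proposed above might look like the following minimal harness. It is a sketch under assumptions: the record format, the parser `parse_definitions`, and the exception `BadDefinitions` are invented for the example, and a real campaign would use a coverage-guided fuzzer rather than this random mutator. The property being tested is that corrupted input is only ever rejected through the one expected exception.

```python
import random
import struct


class BadDefinitions(ValueError):
    """The only exception the parser is allowed to raise on bad input."""


def parse_definitions(blob: bytes) -> list:
    """Hypothetical format: b'DEFS', then records of <u16 length><payload>."""
    if blob[:4] != b"DEFS":
        raise BadDefinitions("bad magic")
    records, pos = [], 4
    while pos < len(blob):
        if pos + 2 > len(blob):
            raise BadDefinitions("truncated length field")
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        if length == 0 or pos + length > len(blob):
            raise BadDefinitions("bad record length")
        records.append(blob[pos:pos + length])
        pos += length
    if not records:
        raise BadDefinitions("no records")
    return records


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Corrupt a valid file: overwrite bytes, truncate it, or zero-fill it."""
    choice = rng.randrange(3)
    if choice == 0:
        data = bytearray(seed)
        for _ in range(rng.randint(1, 4)):
            data[rng.randrange(len(data))] = rng.randrange(256)
        return bytes(data)
    if choice == 1:
        return seed[:rng.randrange(len(seed))]
    return bytes(len(seed))        # the "file full of zeros" case


def fuzz(seed: bytes, iterations: int = 2000, rng_seed: int = 0) -> int:
    """Count rejected inputs; any exception other than BadDefinitions escapes."""
    rng = random.Random(rng_seed)
    rejected = 0
    for _ in range(iterations):
        try:
            parse_definitions(mutate(seed, rng))
        except BadDefinitions:
            rejected += 1
    return rejected


valid = b"DEFS" + struct.pack("<H", 3) + b"abc" + struct.pack("<H", 2) + b"xy"
rejected = fuzz(valid)
```

If the parser ever indexed past the buffer or trusted a length field, `fuzz` would surface it as an uncaught exception, which is the user-mode analogue of the crash in the driver.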

        3. John Riddoch

          Re: WHQL

          You're hitting a set of requirements that kinda force this situation:

          • AV has to run in the kernel to be able to detect and prevent virus/malware attacks
          • Kernel driver has to be WHQL certified - a process which takes a defined amount of time
          • New virus/malware signatures need to be rolled out on an almost daily basis to match the unrelenting grind of the virus/malware writers trying to bypass your tools

          Combine all these requirements and it becomes nigh on impossible to write a functioning AV solution which can be updated quickly enough to adapt to the threats out there, so it ends up having to run code outside of the certified driver. There are probably ways to make it more resilient, but I'm not a kernel developer/coder so don't know how messy that would get.

          Just to add to the chaos; if you assume every AV update is a new signed driver, you have to unload the old driver and attach the new one, leaving a short period the system is unprotected, assuming you can easily remove the old driver without a reboot.

          This doesn't forgive the monumental screw-up that Crowdstrike have made, but it does show why certain design decisions were made.

          1. Ken Hagan Gold badge

            Re: WHQL

            It may be necessary for some part of the AV system to run at ring 0. It may be necessary for some part of the AV system to accept updates. Both at the same time? I think that's unproven.

            1. stiine Silver badge

              Re: WHQL

              You're suggesting that a piece of code should be able to rebase itself from ring0 to ring1 and back, on its own? Please stop.

              1. Anonymous Coward

                Re: WHQL

                > You're suggesting that a piece of code should be able to rebase itself from ring0 to ring1 and back, on its own?

                Maybe he is just suggesting that the code could be run in two chunks, a user space portion and a driver portion; hmm, that model seems strangely familiar.

                1. TheMeerkat Silver badge

                  Re: WHQL

                  You can run code in two chunks but then you will still need to send the update to the bit running low to tell it to look for new threats.

                  And now you have to secure this extra communication channel.

                2. Snake Silver badge

                  Re: code run in two chunks

                  AV systems on Windows do exactly that and have done so for decades: the detection engine runs in ring0 and the UI controls run on, usually, ring3. This is why the UI can crash and not take down the kernel in a BSOD.

          2. JoeCool Silver badge

            Re: WHQL

            I'm pretty sure the software terms do not state that

            "in an effort to react instantly to threats, CrowdStrike may roll out untested changes that could take down your IT systems"

            1. Claptrap314 Silver badge

              Re: WHQL

              Don't be.

          3. UnknownUnknown

            Re: WHQL

            Not having AV Software that shits the bed if the updated file is bad seems to come to mind as being desirable..

          4. John Robson Silver badge

            Re: WHQL

            "New virus/malware signatures need to be rolled out on an almost daily basis "

            OR - you can whitelist instead of blacklisting.

        4. that one in the corner Silver badge

          Re: WHQL

          > You should not be able to feed new code to something running in the kernel space from the user space and certainly not without huge amounts of checking.

          Sod running in kernel space.

          How about just sanity checking the contents of a file, any file in any process at any priority level, before blindly interpreting its content.

          Like, in a binary (data) file, checking for magic bytes, the checksum at the end... And just refusing to touch it when it is clearly insane.

          That is basic stuff for any program, surely?

          If a file with a duff photo can be calmly rejected because it doesn't have the JFIF magic numbers, but ...
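A minimal sketch of that kind of check, in Python for brevity. The magic value `DEFS`, the CRC32 trailer layout and the names `wrap` and `is_sane` are assumptions invented for this example, not any vendor's real format.

```python
import struct
import zlib

MAGIC = b"DEFS"  # hypothetical magic bytes, not any real vendor's format


def wrap(payload):
    """Producer side: prepend the magic, append a CRC32 of everything before it."""
    body = MAGIC + payload
    return body + struct.pack("<I", zlib.crc32(body))


def is_sane(data):
    """Consumer side: cheap checks to run before interpreting the content."""
    if len(data) < len(MAGIC) + 4:
        return False  # too short to hold magic plus checksum
    if data[:len(MAGIC)] != MAGIC:
        return False  # wrong or missing magic bytes
    body, trailer = data[:-4], data[-4:]
    (expected,) = struct.unpack("<I", trailer)
    return zlib.crc32(body) == expected
```

With this layout an all-zero file fails at the magic check, and a file with a flipped bit fails at the checksum, before any of the payload is interpreted.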

        5. TheMeerkat Silver badge

          Re: WHQL

          Some people can’t read.

          MS had no choice but approve the setup otherwise the EU would have issues with MS.

    2. mtrantalainen

      Re: WHQL

      I think this is a good demonstration how little value WHQL testing can actually provide.

      1. cyberdemon Silver badge

        Re: WHQL

        I wonder if Data Execution Prevention would have helped here? Even if only by forcing CloudStrife to write a better piece of software

        Dave Plummer seems to suspect that the update contained executable code, which was pulled into the "signed" driver and executed. DEP should have prevented that?

        On the other hand, the file could still have been plain old data that caused the driver to generate a null-pointer.

        Apparently the file in question was all zeroes, so they obviously have no input validation whatsoever.

  3. Khaptain Silver badge

    Can an AV be effective if not in Ring 0

    If an AV is running anywhere else than around the kernel, can it still guarantee that it can do its job? If it was in another non-kernel tier would it not lose the capacity to protect itself from the dark side?

    And wasn't the EU's request done in order to push against MS's monopolistic nature?

    1. Anonymous Coward

      Re: Can an AV be effective if not in Ring 0

      You can have a proper kernel-user split putting a service which is able to do the privileged parts but is essentially simple in ring 0. But that's hard, so AV companies don't do that.

      1. Doctor Syntax Silver badge

        Re: Can an AV be effective if not in Ring 0

        Doing hard stuff is what's expected of AV companies.

        1. Anonymous Coward

          Re: Can an AV be effective if not in Ring 0

          > Doing hard stuff is what's expected of AV companies.

          It always pays to temper your expectations.

          OTOH, hope springs eternal, so we can only hope they are actually capable of doing the hard stuff.

    2. toejam++

      Re: Can an AV be effective if not in Ring 0

      You need a hook up in kernel-space, but the rest of it can run in user-space. There are some performance drawbacks going that route, but it does lend additional stability to the system.

      On a side note, Minix is a good example of an OS that runs as few things in kernel-space as possible.

      1. John Riddoch

        Re: Can an AV be effective if not in Ring 0

        Anything run in user-space is vulnerable to being hijacked by a virus/malware and is harder to make resilient. Not impossible, but significantly harder and even if you think you've got it right, the bad guys will be continually probing for some kind of a weakness to disable your protection.

        1. Ken Hagan Gold badge

          Re: Can an AV be effective if not in Ring 0

          Not true. If you can break into kernel space then you can modify any user space, yes. But why bother, since you are in kernel space and have already won the game.

          On the other hand, if you are still stuck in user space then the user space of a different user or logon session is quite off limits.

  4. MatthewSt Silver badge

    Running it outside of Kernel mode isn't the answer no matter what the question is. If the code calls out to User mode and crashes, what does the Kernel code that made the call do? Does it fail open (no security) or does it fail closed (hard crash).

    Question remains the same whether you're dealing with Kernel mode or not. Microsoft could have put the equivalent of "ON ERROR RESUME NEXT" in when calling third party Kernel libraries, but you've got the same problem then.

    And I can bet everyone would be up in arms if Microsoft left things insecure by default...!

    The answer is test what you're shipping and roll it out slowly.

    1. Dan 55 Silver badge

      Easy answer that, it should flag the problem in the event viewer and with Crowdstrike, continue with the previous good configuration, and not lose its shit and cause a bootloop.

      Also nothing the EU said obliged Microsoft to put or keep ropey software architecture in Windows. They're just finding someone else to blame and if it can be that horrible organisation that dares to regulate them a bit then so much the better.

      1. MatthewSt Silver badge

        So failing open but with logging.

        There's no such thing as a last known good configuration if something updates itself outside of the normal Windows process.

        1. Dan 55 Silver badge

          I don't think that continuing to run using definitions which were fine until 04:08 UTC is the same as "no security".

          Imagine that the process on 8.5 million Windows PCs logged the definition error with Crowdstrike. That would have set alarm bells ringing and I'm pretty sure that another corrected update could have been pushed by Crowdstrike on the same day without taking out airports, airlines, trains, hospitals, GPs, etc...

          1. Robert Carnegie Silver badge

            But if the PC is bluescreened then how is the AV software going to phone home to Crowdstrike?

            1. Dan 55 Silver badge

              By catching the error previously with better validation of new definition files.

            2. richardcox13

              If you are outside the kernel you restart the user process (there are well established approaches to this).

              Dropping back to previous definitions after n failures is also an option (works very well with good telemetry).
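A rough sketch of the "drop back after n failures" idea, in Python for brevity. `DefinitionStore`, `toy_loader` and the threshold of 3 are hypothetical. One assumption matters: here a bad file shows up as an exception, whereas in a kernel driver it shows up as a crash, so a real version would have to persist the attempt counter to disk before touching the new file. The sketch mimics that by counting the attempt before trying.

```python
def toy_loader(defs):
    """Stand-in for the real parser: rejects an all-zero blob."""
    if not any(defs):
        raise ValueError("definition data is all zeroes")
    return len(defs)


class DefinitionStore:
    """Tracks a candidate update and the last set known to load cleanly."""

    def __init__(self, known_good, max_failures=3):
        self.known_good = known_good
        self.candidate = None
        self.failures = 0
        self.max_failures = max_failures

    def stage(self, new_defs):
        """Accept an update as a candidate; it is not trusted yet."""
        self.candidate = new_defs
        self.failures = 0

    def load(self, loader):
        """Try the candidate; fall back to known-good if it keeps failing."""
        if self.candidate is not None and self.failures < self.max_failures:
            # Count the attempt before trying: after a crash there would be
            # no chance to count it afterwards.
            self.failures += 1
            try:
                result = loader(self.candidate)
            except ValueError:
                pass  # fall through to the known-good set
            else:
                # Candidate loaded cleanly: promote it.
                self.known_good, self.candidate = self.candidate, None
                self.failures = 0
                return result
        return loader(self.known_good)
```

Once the candidate has used up its attempts it is skipped, and the store keeps serving the previous definitions; that is the moment to raise the telemetry alarm rather than carry on silently.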

              1. Robert Carnegie Silver badge

                That seems possible. If boot fails after getting updated data, then the next boot could run ignoring the updated data. But maybe then it shouldn't treat itself as secure. And it could be abused, to trick the software into disabling its latest update.

                Somewhere in the media coverage, there was a report of a communication from Crowdstrike advising - before this happened, I think - that you have and should use an option to assign various of your PCs to use the latest Crowdstrike data update N, or use update N-1, or N-2. That would have a similar result when update N is bad - the PCs that weren't using update N would survive... at first.
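To make the N / N-1 / N-2 idea concrete, here is a small Python sketch. The function names, host names and version strings are invented, and it says nothing about whether the real product applies such a policy to its data files; it only shows why the lagging machines survive "at first".

```python
def pick_version(released, offset):
    """Return the version a host should run given its policy offset.

    released: versions in release order, oldest first.
    offset:   0 for the latest (N), 1 for N-1, 2 for N-2.
    """
    if not released:
        raise ValueError("no versions released")
    index = max(len(released) - 1 - offset, 0)  # clamp if history is short
    return released[index]


def hosts_exposed(fleet, released, bad_version):
    """Names of hosts whose policy would have them running bad_version."""
    return sorted(host for host, offset in fleet.items()
                  if pick_version(released, offset) == bad_version)


# Hypothetical fleet: host name -> policy offset.
FLEET = {"canary": 0, "office": 1, "prod": 2}
```

With releases `["1.0", "1.1", "1.2"]` and a bad `1.2`, only `canary` is exposed. Publish `1.3` without withdrawing `1.2` and the bad version becomes N-1, so `office` picks it up next.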

          2. Mike007 Silver badge

            What are these "definitions"? Which API are you referring to that loads "definitions" into the kernel?

            Windows was loading the exact same driver it loaded last boot, which has not been modified... As to what that third party code does when it runs, that's outside of Microsoft's control.

            If I make a program that loads some external data, I don't want Windows going "naa, I don't want to give you the data you asked for, so here's some different data instead"

            1. Dan 55 Silver badge

              > Windows was loading the exact same driver it loaded last boot, which has not been modified...

              The driver crashed repeatedly. How did Windows deal with it? Badly. If a driver fails often enough it should be put on the naughty step by the OS.

              > If I make a program that loads some external data, I don't want Windows going "naa, I don't want to give you the data you asked for, so here's some different data instead"

              I don't understand. Crowdstrike's driver gets Crowdstrike's data but crashes. I don't care how the driver crash happens, but if it happens repeatedly I do want Windows to eventually disable the driver so I can get on with my day.

              Crowdstrike would have to push out a driver update with Windows Update after explaining to Microsoft what went wrong and what they did about it, but that shouldn't be my concern.

              1. Mike007 Silver badge

                Pretty sure most IT teams would much rather have an outage like this than have all of their security software just disable itself whenever there is a problem.

                After the AV has been detected as an inconvenience to the user and automatically disabled... Who is liable for all of your secrets being sent to your competitors by malware that should have easily been blocked?

                If this were a setting that you had access to, would you change it on your company PC? Would you not agree that any employee who did this and then got compromised should be fired immediately? And if the IT team authorised this, they should probably be replaced as well?

              2. Anonymous Coward

                So all a virus writer needs to do is make the loaded Kernel AV modules crash X times in a row and windows will silently turn off your AV?

                1. Ken Hagan Gold badge

                  Well yes, I suppose, but the question is rather odd because if they can repeatedly crash your kernel, you have already lost.

                  1. MatthewSt Silver badge

                    Yes, but you _know_ you've lost and someone needs to fix it, rather than the system carrying on unprotected with malware having free rein

              3. Anonymous Coward

                > The driver crashed repeatedly.

                Did it? Didn't it just BSOD and sit there doing bugger all until a human intervened?

                It failed many, many times across the world, but on many, many PCs..

                > How did Windows deal with it? Badly. If a driver fails often enough it should be put on the naughty step by the OS

                The advice was to manually reboot your PC, fifteen times or more; and users reported that that worked.

                So - was that the OS being given the chance to see it fail repeatedly and put it on the naughty step?

              4. Pier Reviewer

                “…if it happens repeatedly I do want Windows to eventually disable the driver so I can get on with my day.”

                And you’ll be happy to know that Windows does just that! Unless you flag your driver as “required for boot”. Guess what CrowdStrike did?…

                Windows was told it couldn’t boot without the driver, so it loaded the driver. This ain’t an MS issue. This was 95% CrowdStrike and 5% IT depts who are admittedly overstretched and have neither the time nor resources to deploy patches to their fleet in waves and stop if something goes south. But mostly CrowdStrike.

                Srsly. No null check? Bypassing their own deployment protocols?! Jebus H Almighty…

          3. MatthewSt Silver badge

            That's a good suggestion, but that's not Microsoft's job. The whole reason you're running Crowdstrike is because you think they're doing a better job than Microsoft. It's a black box as far as Microsoft is concerned. There's no concept of definitions, there's nothing to roll back, there's no notification that a change to your system has been made.

            Crowdstrike needed to detect that their driver had sh*t the bed and done their own rollback. It was their code that threw the blue screen. There's plenty wrong with Microsoft software without blaming them for others too.

            1. Dan 55 Silver badge

              Crowdstrike's job is better validation in the driver or preferably in a userland subprocess to reject corrupt definition files so it continues working in a reasonable way until new definition files are issued in a matter of hours.

              Microsoft's job is to make sure repeated driver crashes are not fatal to the entire OS. If Crowdstrike's driver repeatedly crashes in a short space of time, then Windows should disable the driver.

              1. Brewster's Angle Grinder Silver badge

                And now you have two problems...

                We know the answer because we know it was a faulty AV update and, in that particular situation, disabling the AV was best. The kernel has none of that hindsight. Stopping a random driver would likely leave the machine useless; for example, booting without a working graphics driver, or a hard disk driver. And I have no idea the consequences of losing the "PCI-to-PCI Bridge" or the "High precision event timer", but I bet it's not good.

                And the trickle down could make the situation worse: or do real damage. Even in this case, if it wasn't a faulty update but malware, then stopping the driver could allow the malware free run. (And Microsoft would get it in the neck if they disable AV, and opportunistic malware takes advantage of AV being down.) And do we know if CrowdStrike have just one driver, or multiple that interact? Has the system been tested with one down?

                Anyway, the correct response to an unknown error is always to stop, do nothing more, and wait for help. If there's any solution, it's that data files need to be registered as part of kernel state so a rollback can be attempted to last known good.

            2. Doctor Syntax Silver badge

              "That's a good suggestion, but that's not Microsoft's job."

              If Microsoft sign the driver in order to gain that access they do have a job to do which is to require it to be able to roll back and do so. Microsoft are a gatekeeper here. If they say that in order to gain access a third party has to meet quality requirements then that third party has to meet those requirements or stay outside.

              The only basis for a regulator to quibble with that would be if Microsoft gave itself a free pass not to meet those requirements itself.

              1. Ken Hagan Gold badge

                MS could, for example, demand source code for the kernel part of your system and sign only their build of your code. There would of course, be a fee for the code review and no guarantee that they'd agree to sign at the end of it.

                This is, in effect, what Linux does if you don't want them to mark the kernel as "tainted" by your driver.

                1. Justthefacts Silver badge

                  Perfectly reasonable, and that’s the process that Microsoft wanted to follow.

                  Unfortunately, Crowdstrike didn’t want to play ball. They insisted on a backdoored architecture, where the signed thing then imported an *un-validated* file into ring-0 execution. Then, they got the EU to rubber-stamp that, if Microsoft refused to accept that crazy architecture, the EU would prevent MS Windows being sold in Europe, as abuse of monopoly. And unfortunately MS then allowed EU to determine their security architecture. And now the rest of the world have paid the price.

                  1. Anonymous Coward

                    > Crowdstrike didn’t want to play ball.

                    Citation?

                    > They insisted on a backdoored architecture...

                    Citation?

                    > Then, they got the EU to rubber-stamp that,

                    Citation?

        2. Doctor Syntax Silver badge

          "There's no such thing as a last known good configuration if something updates itself outside of the normal Windows process."

          If it updates itself it can revert itself to its last known good configuration if it has maintained a copy of that. If it can't then either the kernel should then fail it or, if it isn't designed to do that, it doesn't get the signature to allow it into the kernel at all.

      2. Spazturtle Silver badge

        "Easy answer that, it should flag the problem in the event viewer and with Crowdstrike, continue with the previous good configuration, and not lose its shit and cause a bootloop."

        That is the default for faulty kernel drivers, but the Crowdstrike driver had flagged itself as being required for boot.

        The whole point of the boot flag in kernel drivers is to tell Windows that halting is safer than continue booting without them.

        1. Dan 55 Silver badge

          Really after three failed boots due to one driver, it should be obvious that trying to boot without that driver is safer than continuing to try and boot with it.

          1. Mike007 Silver badge

            I am not sure I would categorise automatically disabling stuff (in the context of security software!) and carrying on as if everything is normal as being "safer" than demanding someone fix the broken system.

            1. Dan 55 Silver badge

              Then make it a group policy. Let the customer have the choice of what the best action to take... for the PC to continue bashing its head against a brick wall and the company to grind to a halt or for the driver to be temporarily disabled.

              A lot of people seem to think the way it's done now in Windows is the only way but it's obvious that the IT world has to come up with something better.

              1. Doctor Syntax Silver badge

                "it's obvious that the IT world has to come up with something better."

                And that receives downvotes? No wonder the IT world is in a mess.

                1. Anonymous Coward

                  The IT world is a mess because there is a large demand for skilled IT workers but not enough talent or intelligence to fill those roles. Imagine if violin playing or poetry writing became highly paid. You would get marginally more Heifetzes and Shakespeares but they would be drowned out by the din of screeching strings and clanging stale rhymes.

                  Most IT workers are indifferent clock punchers who have no interest making things more resilient or reliable. They want to collect their paycheck and head to the pub. A smaller percentage are ambitious schemers who devote their time to office politics not technical skills. So only a scant few are left to keep things running in the face of disaster.

                  1. The Oncoming Scorn Silver badge

                    Most IT workers are at the coal face, suffering the whims of Upper Manglement trying to save a few shekels by switching software vendors, security solutions & managed service providers regardless of the disruption & indirect associated costs.

                    I am passionate about my role at the facility I'm at, but just about emerged through the latter one last year (After 6 months in limbo), when the incoming company had to sub contract from the outgoing company due to them being unable to find a suitably security cleared agency & staff, colleagues in the US were not so lucky.

                    That same company is now allegedly making a bid to buy my employer's MSP; they have a habit of getting rid of experienced staff, replacing them with cheaper & younger. I fully expect that in 2.5 years, I will either be pushed out by that same company if successful, or invited to reapply for my role with their IBM preferred outsourced agency at a substantial rate cut. Despite the value the actual client has stated that I bring to the plant.

                    Then you wonder why many IT Workers are indifferent clock punchers, when because we are outsourced we have no direct stake in things compared to in-house staff (Who usually have their own fears of being "sold" to a new MSP).

          2. MatthewSt Silver badge

            What if we're dealing with a system with two drivers? One starts a centrifuge spinning and the other checks the status of all the components and applies the brake when something goes wrong. Problem lies in the 2nd driver, so Windows goes "fine, I'll start and ignore that one".

            Centrifuge starts, problem occurs, system physically destroyed.

            Current method has less danger as centrifuge won't even start.

            1. Dan 55 Silver badge

              I'm sure even Microsoft's programmers can maintain separate counters for separate drivers.

              1. Anonymous Coward

                If they could, there would be no requirement for AV in the first place. Additionally, this is why using Windows Defender is a poor idea since the Defender developers were trained exactly the same way as the core Windows developers...

            2. Yet Another Anonymous coward Silver badge

              Or Airbus's new military transport aircraft.

              Engine management software loads new engine performance definition file, discovers that file is missing/corrupt/invalid (don't remember details) and decides to shut down engine management software. Shutting down all 4 engines and subsequently the airframe and the flight crew.

            3. Claptrap314 Silver badge

              Your proposed architecture is what is at fault in your scenario.

              It sounds very much like the Cloud Strike architecture is also very much at fault. And the design. And the implementation. And the execution.

          3. TonyHoyle

            But crowdstrike had labelled their driver absolutely required for boot.

            The system had no choice. It can't make value judgments.. it has to use the information it has.

            1. Doctor Syntax Silver badge

              Crowdstrike had been allowed to label their driver as absolutely required for boot. Once installed the system may not be able to make value judgements. Whoever allowed it to be so labelled could and should have made such judgements.

              1. This post has been deleted by its author

              2. Ken Hagan Gold badge

                So who is "whoever"?

                Is it the EU who demanded that MS keep the market open? Is it Crowdstrike who insisted that their product should include such drivers, but who clearly didn't fuzz test the data input? Is it the customer who chose MS and CS as their "joint" system vendors?

                1. Displacement Activity

                  > So who is "whoever"?

                  > Is it the EU who demanded that MS keep the market open? Is it Crowdstrike who insisted that their product should include such drivers, but who clearly didn't fuzz test the data input? Is it the customer who chose MS and CS as their "joint" system vendors?

                  I'd have thought that's obvious. It's the customer, for not carrying out the due diligence to determine that their mission-critical system relied on a pile of amateur-level crock supplied by MS and CS. All this after-the-fact whining is just infantile.

                  And MS's attempt to pass the blame on to the EU is equally infantile. All the APIs in, for example, the Linux kernel are published. This doesn't make anything insecure. Security is an end-to-end process, and that process failed spectacularly here.

              3. Anonymous Coward

                > Whoever allowed it to be so labelled could and should have made such judgements.

                That would be the IT department, just as they clicked "Install".

                It is published info that AV works at this deep level - and it is Microsoft who published that, in the APIs publicly released as asked for by the EU* and that it therefore has every opportunity to screw up.

                * and used without being documented before that point: want to bet it was really only MS code that called publicly undocumented APIs?

          4. Zibob Silver badge

            That's funny because...

            ... I remember with Win10 (I am currently using 11 and have not had to see this yet), when you tried to boot and it failed, it auto restarted and tried again, 3 times, then it knew it was failing somewhere and instead shut off completely or booted into startup options.

            So there is precedent for this kind of behaviour in windows already, even if not exactly as you described.

            So it should be possible, just need a low enough running watchdog that can catch boot errors at, well, boot time.

          5. mtrantalainen

            CrowdStrike chose "fail to locked" state for failure case, which is the only safe option if you don't want your security solution to randomly turn off completely if attacker can cause it to crash.

            However, when you distribute kernel mode drivers which are designed to fail to locked state, you MUST have data fuzzing and mutation testing going. From everything I can see this far, it's totally clear that CrowdStrike has neither!

            1. that one in the corner Silver badge

              > you MUST have data fuzzing and mutation testing going. From everything I can see this far, it's totally clear that CrowdStrike has neither!

              Good grief, CrowdStrike clearly didn't even have basic, trivial, content validation on their files, like magic numbers and a checksum.

              Yes, they should fuzz etc, but for pity's sake, call them out on missing the truly basic stuff first!

              1. CrazyOldCatMan Silver badge

                > and a checksum

                Trouble is that, if your input file is all zeroes, then the checksum is also zero and so it would pass..

                1. James Hughes 1

                  If (checksum == 0)

                  error = CheckIfFileIsAllZero();

                2. ExampleOne

                  Depends on the checksum you use, but the md5sum of zero is NOT zero.

                  $ md5sum <(echo 0)

                  897316929176464ebc9ad085f31e7284 /dev/fd/63
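Worth noting that the command above hashes the text character "0" plus a newline, not a run of zero bytes. A quick Python sketch that checks the actual all-zero-bytes case against three kinds of checksum; the 4096-byte size and the function name `additive_checksum` are arbitrary choices for the example.

```python
import hashlib
import zlib


def additive_checksum(data):
    """Naive checksum: sum of all bytes, modulo 256."""
    return sum(data) % 256


zeros = bytes(4096)  # a "file" consisting only of 0x00 bytes

additive = additive_checksum(zeros)       # sum of zeroes
crc = zlib.crc32(zeros)                   # CRC-32 as implemented by zlib
digest = hashlib.md5(zeros).hexdigest()   # MD5, as in the post above

print("additive:", additive)
print("crc32:   ", format(crc, "08x"))
print("md5:     ", digest)
```

A plain sum of bytes does come out as zero for an all-zero file, so a zero-filled checksum field would match it. CRC-32 and MD5 do not come out as zero, so with either of those a file that is zeroes all the way through, checksum field included, would be rejected.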

      3. CrazyOldCatMan Silver badge

        > Also nothing the EU said obliged Microsoft to put or keep ropey software architecture in Windows

        And Apple are bound by similar terms - so they developed a mechanism so that AV software could get the access it needed without having to run at the kernel level.

    2. Anonymous Coward

      And I can bet everyone is up in arms because Microsoft leaves things insecure by default...!

      There, FTFY

    3. Doctor Syntax Silver badge

      "Question remains the same whether you're dealing with Kernel mode or not"

      I'd say the question remains "Was Friday's event acceptable or not?"

      1. Anonymous Coward

        Acceptable?

        To whom? Kaspersky? Mcafee? Sophos? Avast? I'm sure it was.

    4. /dev/null++

      Meanwhile at Microsoft HQ...

      "Alright team, let's say it together: Anything can be null... or a complete dumpster fire! Be defensive in your coding. Let's review our code."

      Elsewhere in the room, two developers talk...

      "I don't care, I'm just here for the money. If I have to check everything, it looks like I'm slow, and the new 20-year-old manager will give me a bad performance review."

      "Any questions? Yes, you in the back."

      "Can't we just use Linux?"

      "Exactly the kind of productive questions we are looking for. Considering how many important businesses don’t run Linux, I’m sure it wouldn’t take many hours to migrate OS, software, and applications. Education of staff, excellent question. You may go home now. Starting tomorrow, you’ll be assisting our facilities service. I’m sure they will appreciate your ideas too."

      Just to mention, my first IT job was as an admin on FreeBSD and Linux back in 1997. Since 2001, I’ve worked primarily with Python and C#. I do try to code defensively: for every one of me, there are so, so many users and inputs that I can't even imagine what sort of crap is thrown at the systems. Personally I use Windows because... I don’t get anything in Linux that I can't with Windows. But there are several programs that only run on Windows. And honestly, I don't care what I run. It just needs to work. I can break Linux just as quickly as Windows, and I'm also not worried whether my car has Goodyear or Continental tires. Maybe it has to do with age (55+). Do your best, learn from mistakes, do better tomorrow.

      1. Anonymous Coward

        No, that twenty-something manager and his buddy had extremely large short positions in Crowdstrike that they purchased on Wednesday.

    5. maffski

      The answer is test what you're shipping and roll it out slowly.

      '...The answer is test what you're shipping and roll it out slowly....'

      The answer is to sue CrowdStrike into the ground so that every other security company decides hardening their drivers is a good investment.

  5. alain williams Silver badge

    Wrong question

    We should be asking why something like ClownStrike was considered necessary in the first place. It does not seem to be to just address the weaknesses in MS Windows as there are versions available for Linux and macOS. I have been hearing things about it being required by insurance and compliance.

    It would be interesting to see a cost benefit analysis: what does it cost to install vs what does it prevent - especially since it does have a history of causing outages.

    I suppose from the PHB point of view this ticks a box as otherwise getting security done right takes time and expertise that a business often does not want to pay for.

    1. Khaptain Silver badge

      Re: Wrong question

      "We should be asking why something like ClownStrike was considered necessary in the first place. "

      There is no such thing as a fully protected system. We all know this, and there are a bunch of bastards who make their living from encrypting all your files; it is not an easy job to continually protect yourself against them. Crowdstrike, like all of the EDRs, gives you at least some hope, even though it is only a part of a larger solution, and yes, user training is part of the solution... but nothing is perfect, so we make do with what we can.

    2. Cruachan Bronze badge

      Re: Wrong question

      It depends on the company, but a lot of them that use MS products and also Microsoft 365 are already getting AV as part of that license, whilst also having policies not to use a single vendor for everything or having a distrust of Microsoft from a security POV (where, let's be honest, they don't have the most sterling record).

      I made this point at my current contract as they were complaining about costs whilst double-paying for AV (and other things), the rationale was that CrowdStrike was "better" than the MS products they were already paying for. I suspect that particular argument won't wash any more though.

      1. Khaptain Silver badge

        Re: Wrong question

        "It depends on the company, but a lot of them that use MS products and also Microsoft 365 are already getting AV as part of that license,"

        Agreed, but one has to be extremely careful about exactly which licence you have .. E1, E3 and E5 licences are very different beasts and their offerings are very, very different..

        1. Cruachan Bronze badge

          Re: Wrong question

          This particular company is actually paying three times. CrowdStrike, E5 licenses and they have SCCM plus CALs for every device too. :-D

    3. Anonymous Coward

      Re: Wrong question

      Box ticking, mainly.

      Keep your shit up to date and don't expose anything to the internet that isn't absolutely necessary. Seriously, it's been 10 years since I've seen a virus get past that basic stuff.

      Clownstrike play on fears as if viruses were some kind of intelligent malevolent force, rather than bits of code exploiting bugs - almost always when you hear of a company getting attacked it's something that was updated by the vendor months ago that they haven't got around to updating.

    4. Anonymous Coward

      Re: Wrong question

      A better question is why was it even installed on critical systems?

      From Crowdstrike's own T&Cs, willingly accepted by every customer:

      https://www.crowdstrike.com/terms-conditions/

      "8.6 [...]TO THE MAXIMUM EXTENT PERMITTED UNDER APPLICABLE LAW, CROWDSTRIKE AND ITS AFFILIATES AND SUPPLIERS SPECIFICALLY DISCLAIM ALL IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND NON-INFRINGEMENT WITH RESPECT TO THE OFFERINGS AND CROWDSTRIKE TOOLS. THERE IS NO WARRANTY THAT THE OFFERINGS OR CROWDSTRIKE TOOLS WILL BE ERROR FREE, OR THAT THEY WILL OPERATE WITHOUT INTERRUPTION OR WILL FULFILL ANY OF CUSTOMER’S PARTICULAR PURPOSES OR NEEDS. THE OFFERINGS AND CROWDSTRIKE TOOLS ARE NOT FAULT-TOLERANT AND ARE NOT DESIGNED OR INTENDED FOR USE IN ANY HAZARDOUS ENVIRONMENT REQUIRING FAIL-SAFE PERFORMANCE OR OPERATION. NEITHER THE OFFERINGS NOR CROWDSTRIKE TOOLS ARE FOR USE IN THE OPERATION OF AIRCRAFT NAVIGATION, NUCLEAR FACILITIES, COMMUNICATION SYSTEMS, WEAPONS SYSTEMS, DIRECT OR INDIRECT LIFE-SUPPORT SYSTEMS, AIR TRAFFIC CONTROL, OR ANY APPLICATION OR INSTALLATION WHERE FAILURE COULD RESULT IN DEATH, SEVERE PHYSICAL INJURY, OR PROPERTY DAMAGE. Customer agrees that it is Customer’s responsibility to ensure safe use of an Offering and the CrowdStrike Tools in such applications and installations.[...]"

      The real question is why this software seemingly found its way onto so many critical systems in airports, hospitals, trains and others. This is a massive failure of IT decision makers on a global scale.

      1. Doctor Syntax Silver badge

        Re: Wrong question

        There's going to be a lot of examination to determine what laws can be applied and exactly what they allow.

      2. Claptrap314 Silver badge

        Re: Wrong question

        I wonder if the Windows T&Cs don't have the same language. If they don't, they should.

        1. that one in the corner Silver badge

          Re: Wrong question

          > I wonder if the Windows T&Cs don't have the same language. If they don't, they should.

          What, you didn't read it? :-)

          These days, MS just go with the standard "it isn't fit for any purpose" wording:

          >> Microsoft and the device manufacturer and installer exclude all implied warranties and conditions, including those of merchantability, fitness for a particular purpose

          In days gone by, just about every bit of COTS software made you explicitly agree to such things as not running nuclear power plants or even using it to control an aircraft, heavy machinery or medical equipment[1].

          Nowadays, those explicit warnings seem to have gone Tubby bye-byes.

          However, the "we never promised it could do anything at all" language has the same effect, it is just trying harder to hide in the safety of apparently innocuous words.

          [1] wish I could give URLs for this as well, but searching for the no nukes policy took digging through so, so many irrelevant hits. Was trying to get, e.g. the T&C's for Windows 2.0, they may be old enough to have the more explicit language, but so far, no dice.

      3. BenDwire Silver badge
        Facepalm

        Re: Wrong question

        FAILURE COULD RESULT IN DEATH, SEVERE PHYSICAL INJURY

        So why does the NHS use it then? There will be many people who have had operations cancelled, drugs go missing or treatment denied.

        Is no-one in charge of the IT systems actually aware of what can go wrong ?? Maybe they need an above inflation pay rise too ...

      4. Anonymous Coward

        Re: Wrong question

        > A better question is why was it even installed on critical systems? ... From Crowdstrike's own T&Cs, willingly accepted by every customer:

        Because those are the same disclaimers put into every single piece of Commercial-off-the-shelf software in existence. Including the OS you are running.

        If you have reached the stage of installing CrowdStrike, or PaintShop Pro or Pooh's Hunny Adventure, you have already accepted T&C's with the same level of disclaimer - in your OS!

        So CrowdStrike just repeating the same thing makes no difference, one way or the other.

        1. Joe W Silver badge

          Re: Wrong question

          And people complained when the discussion about IT supply chain safety and responsibility and accountability came up.

          Yup. That's a great example, isn't it?

      5. MrBanana

        Re: Wrong question

        "willingly accepted by every customer"?

        I think you mean "blindly accepted by every customer".

        I see that there are some companies, notably in the travel industry, who are passing on this disclaimer to their screwed over customers. Wasn't us, it was them, they don't have any responsibility so neither do we. Travel insurance policies don't want to cover this either.

      6. Andrew Scott Bronze badge

        Re: Wrong question

        They're bypassing the customer's ability to ensure safe use of an offering.

    5. Optimaximal

      Re: Wrong question

      So are you suggesting all businesses write their own AV/Anti-Malware/EDR software now?

      As catastrophic as this was globally, this was a fairly simple problem - Ultimately, it's on Crowdstrike to look at their processes and software and fix them so the problem can never happen again.

  6. heyrick Silver badge

    Seems like some anti EU horseshit

    Given that the request was for third parties to have access to the same APIs that Microsoft use, not to have whoever running amok within the kernel.

    The alternative interpretation to their statement is even worse, that Microsoft is handing out the keys to the kingdom.

    1. Roland6 Silver badge

      Re: Seems like some anti EU horseshit

      The statement also has an implicit assumption that MS don’t send out updates that bork systems…

      Yes, it was a CrowdStrike update that caused the problem, but MS also sends out kernel level updates….

      1. Optimaximal

        Re: Seems like some anti EU horseshit

        I think the implicit assumption is most credible businesses have some form of layered testing/production process for delivering Windows Updates in a staged manner. In addition to this, said updates have already been through Canary/Beta/Preview rings, so issues with general compatibility and performance have been identified.

        EDR/AV updates run on such a short timeframe from creation to deployment that this isn't feasible.

      2. heyrick Silver badge

        Re: Seems like some anti EU horseshit

        "an implicit assumption that MS don’t send out updates that bork systems…"

        That is a celestial body sized assumption.

        Less than a year ago: https://www.theregister.com/2023/08/24/windows_11_update_bsod/

  7. AMBxx Silver badge
    Boffin

    Am I missing something?

    I work with many companies of all shapes and sizes. Mostly UK & EU.

    I spend a lot of time connecting to servers over RDP. If I'm installing software, I always complain about AV as it slows stuff down.

    I have never seen Crowdstrike on a customer's server. None of my customers have been affected.

    How is this possible? Where is it?

    1. Khaptain Silver badge

      Re: Am I missing something?

      On a daily basis Crowdstrike is extremely unobtrusive and I have never heard any of our users complain about it slowing down their systems, we have been using it for around 8 months so YMMV.

      If you pull up the context menu, ie Right Click a file, you will see the option to analyze the file with Crowdstrike Falcon... It may depend on the configuration...

      1. Anonymous Coward

        Re: Am I missing something?

        We have a migration where the team taking over are insisting that crowdstrike MUST be on all systems and can not be removed.

        This turned a business unit that rarely had any IT issues into somewhere with crap computers that barely work.

        The IT team who mandated that crap said this is normal behaviour... The users told them it's fucking not!

        1. gnasher729 Silver badge

          Re: Am I missing something?

          “ The IT team who mandated that crap said this is normal behaviour... The users told them it's fucking not!”

          I can see that IT team getting their a*** whooped.

        2. Khaptain Silver badge

          Re: Am I missing something?

          "The IT team who mandated that crap said this is normal behaviour... "

          Then the IT are not fit for the purpose. Something else is going on here that someone doesn't want to admit...

          So you are saying that all of the machines are behaving badly and no-one is doing anything about it. That's not an IT team, that's a bunch of cowboys.

    2. Anonymous Coward

      Re: Am I missing something?

      It seems to be a large fortune 5000 company thing.. but never heard of it either.

      In the forums I'm in, which are full of admins, some of whom admin stuff for pretty big companies, those who'd heard of it had never used it.

      Now of course everyone knows it's sh*t :p

      I'm the same, AV doesn't go on build servers for example as it can cause a massive slowdown. It's extremely rare for AV to fire anyway.. viruses, despite the AV vendors pleading, just aren't that common inside networks. Outside, sure, but you have defences against that.

    3. Dagg Silver badge

      Re: Am I missing something?

      I have never seen Crowdstrike on a customer's server.

      Similar, here in Australia someone is claiming that the Crowdstrike crash stopped them from opening their fridge?! I call this BS

      1/ Why the hell would you install Crowdstrike on a fridge?

      2/ Why install windows as the fridge OS?

      3/ Can Crowdstrike even run on the fridge OS?

      1. Optimaximal

        Re: Am I missing something?

        For some reason, Smart Fridges are a thing. Given how shoddily programmed most IOT products are, it's definitely possible that if some random internet-based service failed due to the Crowdstrike failure then the device also wouldn't handle the failure correctly.

        Of course, you'd likely still be able to open the fridge.

    4. Optimaximal

      Re: Am I missing something?

      We've always used Sophos, McAfee/Trellix and now Defender for Business. As a result, I've never seen Kaspersky, Panda or Crowdstrike.

      Doesn't mean I don't think it or other products exist! :D

  8. Paul Crawford Silver badge

    TL;DR - don't blame our crappy system for getting in a boot loop, it was a big boy who did it and ran away!

  9. Anonymous Coward

    Sorry, MS, but you're wrong...

    Watched Dave's video above....

    A good analogy for this would be an Uber driver renting out their license and account. Yes, the Uber driver with the account has been verified, but their mate, Trevor, is an absolute tool who can't drive.

    Microsoft could have very easily created a tonne of APIs that both itself and other companies could use, which would have avoided any anti competitiveness issues

    But, they didn't.

    1. TonyHoyle

      Re: Sorry, MS, but you're wrong...

      What Microsoft need to do in the future I think is to just forbid ring 0 stuff from doing this.. no WHQL if you try.

      Writing an architecture that hands off the complex stuff to a userspace service is harder, but the consequence of failure is only that your service doesn't run, not that the entire kernel gets borked.

      Also, the rules for a driver claiming it's essential for booting need to be tightened up.
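
      The hand-off idea above can be illustrated, very loosely, with ordinary processes: risky parsing runs in a child process, and if the child dies the caller carries on. This Python sketch is an analogy for the kernel/userspace split, not a model of how a real driver works; the `WORKER` script and `parse_isolated` helper are hypothetical, and the worker's "crash" on an all-zero input is simulated.

```python
import subprocess
import sys

# Hypothetical worker: parses untrusted content in its own process.
WORKER = r"""
import sys
data = sys.stdin.buffer.read()
if not data.strip(b"\x00"):
    raise SystemExit(3)   # simulate a crash on an all-zero file
sys.stdout.write(str(len(data)))
"""


def parse_isolated(data: bytes, timeout: float = 10.0):
    """Run the parser in a child process; return its result or None on failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER],
            input=data, capture_output=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None  # the worker died, but the caller is still running
    return int(proc.stdout)
```

      A bad input costs you one failed parse rather than the whole host process, which is the trade being described: more plumbing, smaller blast radius.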

  10. Detective Emil
    Facepalm

    The First Law of Holes

    May I politely suggest, Microsoft, that you stop digging?

  11. Will Godfrey Silver badge

    Crap!

    I've never done any kernel work (and never want to), but it seems to me both parties are equally to blame on this.

    Crowdstrike were gaming the system, effectively sneaking in uncertified code. The certified code that this patch ran in had no mechanism to test for even an obvious defect, and it seems accepted a patch without any kind of embedded checksum. Finally, they hadn't performed comprehensive testing and they sent the patch out in bulk to everyone on the worst possible day of the week.

    Microsoft kernel code also seems to allow external modules to be changed without raising any red flags. At boot, do they not check for changes in third-party modules? I would. The EU demanded that they allow third-party code, not that they allow it without validation, so that's a pretty obvious finger-pointing excuse. They also appear to have no way of checking validity of calls from third parties, or ring-fencing anything that causes a serious problem.
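
    For what an "embedded checksum" might look like, here is a minimal Python sketch. The file layout (a SHA-256 digest followed by the payload) and the names `pack` and `unpack` are assumptions made up for illustration; nothing here reflects the real channel file format.

```python
import hashlib

DIGEST_LEN = 32  # SHA-256 digest size in bytes


def pack(payload: bytes) -> bytes:
    """Prefix the payload with its SHA-256 digest (hypothetical format)."""
    return hashlib.sha256(payload).digest() + payload


def unpack(blob: bytes) -> bytes:
    """Return the payload, or raise ValueError if the blob is damaged."""
    if len(blob) <= DIGEST_LEN:
        raise ValueError("blob too short to contain digest and payload")
    expected, payload = blob[:DIGEST_LEN], blob[DIGEST_LEN:]
    if hashlib.sha256(payload).digest() != expected:
        raise ValueError("checksum mismatch")
    return payload
```

    A file of nothing but zeros fails this check, because the digest of the zero payload is not itself all zeros. Note that a bare checksum only catches corruption; stopping deliberate tampering would need a signature.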

    1. blackcat Silver badge

      Re: Crap!

      Add to this that MS certified a driver that not only pulled in code from user space but was also a required driver for the boot process. If there was a chance it could get borked from an external influence then the core of windows should have been able to disable it.

      1. Optimaximal

        Re: Crap!

        All EDR is like this. If any AV was not flagged as essential for boot then malware would simply work to get itself loaded before the AV and shut it down.

        The fix here is at the production stage with vendors not putting out junk updates and handling them correctly when accidents happen.

        1. blackcat Silver badge

          Re: Crap!

          It should have some sort of 'degraded mode' where the machine at least boots but all it can do is retrieve new updates. This appears to be the third time this year that there has been an issue with the crowdstrike client on various platforms.

    2. Doctor Syntax Silver badge

      Re: Crap!

      "Crowdstrike were gaming the system, effectively sneaking in uncertified code. The certified code that this patch ran in had no mechanism to test for even an obvious defect, and it seems accepted a patch without any kind of embedded checksum."

      Now that it's known this happens I'm sure every malware author out there is looking at the possibility of getting the Crowdstrike driver to accept their code.

  12. lockt-in

    Marketshare issue is the root cause of scale of problem, and solution.

    Microsoft remaining a monopoly in the business sector is the root cause of the scale of this problem, it will happen again and again. I read that this is the biggest outage ever, with close to 100 million computers taken out, tragic.

    Creating competition and some diversity is the solution to the root cause of the scale of this problem, and it is simple for governments to achieve.

    Steps for soft start for this:

    1) Mandate use of genuinely open "Open Standards".

    ...1a) Open Standards not controlled or possible to be vetoed by Microsoft or related company.

    ...1b) Not to be confused with Open Source Software.

    2) Unequivocal commitment by Governments and regional districts.

    3) Unequivocal commitment of mandated adoption across all sectors of society, with an industry-only adoption not allowed.

    4) An Open Standards Board adoption is established with a broad charter and with joined-up government.

    Australia went FULLY metric in a short period, by 1981; it is well documented and shows that this is not insurmountable given full government backing with teeth.

    1. Doctor Syntax Silver badge

      Re: Marketshare issue is the root cause of scale of problem, and solution.

      Getting governments together to do this is going to take forever.

      What would be more effective would be for a number of large customers get together and devise their own T&Cs including things such as no weasel clauses to evade liability, effective testing before release, no telemetry of customer data, acceptance of unannounced spot checks to ensure that the conditions are being adhered to etc. Failure to accept them would shut a vendor out of a large segment of the market. Tag it with the "infrastructure security" label and that segment would quickly include a lot of the government market as governments are starting to realise that infrastructure security matters.

      Vendors would react in different ways but those doing so prioritising their engineering departments over PR, marketing, legal, lobbying and beancounting would gain a first-mover advantage.

      1. Boris the Cockroach Silver badge
        Unhappy

        Re: Marketshare issue is the root cause of scale of problem, and solution.

        Quote

        "Vendors would react in different ways but those doing so prioritising their engineering departments over PR, marketing, legal, lobbying and beancounting would gain a first-mover advantage."

        and then be swiftly overtaken by those companies that use PR, marketing, and lobbying, because the people making the purchase decisions are neither engineers nor IT experts, and got some flashy and glossy publications saying 100% effective, cheaper than our rivals and definitely won't bork your systems... not until after you've cashed the cheque in the brown envelope and left the company.

  13. Mark #255

    A very measured "this wouldn't have happened on Linux"...

    Matthew Garrett on Mastodon:

    "Linux would have prevented this!" literally true because my former colleague KP Singh wrote a kernel security module that lets EDR implementations load ebpf into the kernel to monitor and act on security hooks and Crowdstrike now uses that rather than requiring its own kernel module that would otherwise absolutely have allowed this to happen, so everyone please say thank you to him

    1. Justthefacts Silver badge

      Re: A very measured "this wouldn't have happened on Linux"...

      Linux has had its own kernel bugs in precisely this area. Same thread:

      “That sounds great! We had several production Linux servers crashing just last year because of silent kernel memory corruption by the CS Falcon kernel module, so it's good to know this will cease to be an issue going forward.”

      1. Dan 55 Silver badge

        Re: A very measured "this wouldn't have happened on Linux"...

        I don't see any Linux kernel bug there, just Crowdstrike trashing kernel memory structures then the kernel crashing as a result.

        Now this doesn't happen on Linux because of eBPF. One of the eBPF checks is "programs dereferencing pointers without safety checks" which is exactly what happened on Windows.

        Windows really needs its own version of eBPF.

        1. James Turner

          Re: A very measured "this wouldn't have happened on Linux"...

          Which has existed for the last two years, as you'll see from the link you provided.

          Getting people to use it may be another thing.

          The Crowdstrike Linux kernel panic was from a version that _didn't_ use eBPF...

  14. Martin Howe

    Silly question, but why doesn't CrowdStrike create a system restore point every time it updates itself? Many programs do this in their installer. There's your bootable "last known good" right there. Presumably there's a reason why this wasn't done or wouldn't have worked?

    1. Anonymous Coward

      Well, enough of the OS has to be allowed to boot before it has the ability to know what a "Restore Point" is and recover back to it.

      But CrowdStrike inserts itself very early on in the boot process, so when it dies there really isn't much of an OS in existence to cope with it.

      So even if you had created a Restore Point yourself on Thursday, good luck rolling back when you got hit on Friday.

      > Many programs do this in their installer.

      CrowdStrike isn't running an installer, it just grabs its updates directly. It possibly isn't even updating any of the files that its installer installed when the IT guy clicked "Install" - it just installed, quit the installer, ran, spotted there were new files on the server and grabbed them, well after the installer proper had exited.

      1. Anonymous Coward

        Our group doesn't rely on Windows restore points. We run Windows under a hypervisor with daily snapshots so we can always rollback to a previous full system state.

  15. Alien Doctor 1.1

    Back in the early naughties...

    I had an extremely thick, useful and easy to use book (I cannot remember from who, but I had a library of O'Reilly titles) called something like "Using the Windows API". It was an absolute godsend for creating software to match native Microsoft applications.

    1. Anonymous Coward

      Re: Back in the early naughties...

      If it's the same one my dad used to write applications for Windows, I can guarantee that it wasn't a complete set of API docs because you could never match Microsoft applications using their 'published' APIs.

      1. Alien Doctor 1.1

        Re: Back in the early naughties...

        Good point, just like any manufacturer only giving you the info to just fix the basics.

    2. that one in the corner Silver badge

      Re: Back in the early naughties...

      > . It was an absolute godsend for creating software to match native microsoft applications.

      Only if it came a good while after the book Undocumented Windows (or you had a copy of that as well) or you would not have known how to write text out and get the TABs interpreted correctly.

      And good luck matching Microsoft Office's use of MDI if you decide to use the *documented* MDI WndProc...

  16. Anonymous Coward

    Rule bending on a scale that exceeds my imagination !!!

    1.) The WHQL Driver reads its data from a non-WHQL source ... mainly to work around the slowness of certification vs update frequency desired.

    2.) The 'data' input is not vetted & filtered appropriately ... probably for speed & the old adage "We know what we are doing, there is NO risk"

    3.) Related to 2.) there are no default/fallthrough 'data' values that allow the software to work without crashing the kernel.

    4.) No-one had crashed the system to see what happens ... worst-case scenario [Should be tested ... ALWAYS].

    5.) Arrogance ... Pure and Simple. Doesn't matter how good you are someone/thing will always knock you down a peg ... just to prove a point !!!

    :)
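
    Points 2.) and 3.) can be sketched together: vet the input before using it, and fall back to the last known good values when it fails. This is a hedged Python illustration only. The file layout (`MAGIC`, an entry count, fixed-size entries) and the helpers `parse_definitions` and `load_or_fallback` are invented for the example and say nothing about the real driver or its data files.

```python
import struct

MAGIC = b"DEF1"                 # hypothetical file signature
HEADER = struct.Struct("<4sI")  # magic, entry count (little-endian)
ENTRY_SIZE = 8                  # hypothetical fixed-size entry


def parse_definitions(blob: bytes) -> list[bytes]:
    """Parse a definitions blob, raising ValueError on any inconsistency."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    magic, count = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic")
    body = blob[HEADER.size:]
    if count == 0 or len(body) != count * ENTRY_SIZE:
        raise ValueError("entry count does not match body length")
    return [body[i:i + ENTRY_SIZE] for i in range(0, len(body), ENTRY_SIZE)]


def load_or_fallback(blob: bytes, last_known_good: list[bytes]) -> list[bytes]:
    """Use the new definitions only if they parse; otherwise keep the old set."""
    try:
        return parse_definitions(blob)
    except ValueError:
        return last_known_good
```

    With this shape, a file of zeros is rejected at the magic check and the previous definitions stay in force, so the worst outcome is stale protection rather than a crash.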

  17. Anonymous Coward
    Boffin

    Was a 2009 agreement on interoperability to blame?

    NO, it was the underlying defective Operating System. That requires third party security solutions.

    Why is software like CrowdStrike permitted to run at such a low level, where a failure could spell disaster for the operating system?

    Else there would be a performance hit. Like Microsoft moving GDI into kernel space to speed-up Windows. The trade-offs being:

    • Stability: Running the graphics subsystem in kernel space meant that a crash in the graphics driver could potentially bring down the entire system. This was a trade-off between performance and stability.

    • Security: Kernel space operations have higher privileges, and bugs or vulnerabilities in the graphics subsystem could pose greater security risks.

    1. Anonymous Coward

      Re: Was a 2009 agreement on interoperability to blame?

      >> “Why is software like CrowdStrike permitted to run at such a low level, where a failure could spell disaster for the operating system?”

      > Else there would be a performance hit. Like Microsoft moving GDI into kernel space

      No.

      Else the antimalware would not have the opportunity to spot *or* to contain malware that had inveigled itself into the system. Well, only malware that was (still) running as an unprivileged process and was hoping to get the User to agree to the escalation message box.

      CS is intended to stop the OS booting, if it thinks that is what'll stop the malware. You agree to that when you buy and install it.

      What you *DON'T* agree to is that they run such shoddy practices as executing based on the contents of a file without taking even the most rudimentary steps to sanity check its contents.

  18. wander

    Scapegoating EU

    It is not surprising that Microsoft would stoop to the low standards of blaming the European Union for the Crowdstrike Windows update debacle.

    What is disappointing and most disturbing is that a significant proportion of the American citizenry, including in companies and US government departments, will automatically accept this nefarious charge from Microsoft without the company providing one scintilla of credible, independently verifiable evidence to support such an arrogant claim.

    In the past, Microsoft has blamed every imaginable reason or entity as a scapegoat for the many dozens of critical Windows update or bug-fix failures. Those were much less catastrophic than the bricking caused by the Crowdstrike update boondoggle reported in the Register just yesterday, which has very negative global repercussions for the company's business and its badly floundering reputation.

    Ethical policies and practices do not exist for much of the US tech industry and corporate environment in 2024.

    1. Anonymous Coward

      Re: Scapegoating EU

      The truly stupid thing is that the whole debacle is 100% on the shoulders of CrowdStrike[1] and MS could just say that.

      But, no, strike out against the EU - because why not leverage this into a way to convince people that they should get a monopoly back? After all, they can promise that MS are incapable of releasing an update that can stop your PC booting.

      [1] no matter how many commentards like blaming Windows for needing something like CrowdStrike - or any other antimalware, including MS's own - presumably in the ludicrous belief that an OS, any OS, can ever be otherwise invulnerable to malware "by design" - and still allow you to actually run your arbitrary choice of application software.

      1. Roland6 Silver badge

        Re: Scapegoating EU

        >” The truly stupid thing is that the whole debacle is 100% on the shoulders of CrowdStrike[1] and MS could just say that.”

        The trouble is that Windows ie. Microsoft, generates the blue screen rather than gracefully handling the exception; like booting into safe mode with networking ( to permit RDS / support mode access).

    2. naive

      Re: Scapegoating EU

      Scapegoating the EU is insult on insult; it proves IQ in the USA is avalanching down a steep slope, and the next generation of them will probably think a fire is something sent by the gods.

      It is 2024 now, the days that visiting a website could result in the website author changing the Windows kernel are still fresh in memory.

      Even receiving emails is kind of Russian roulette like adventure on windows in 2024. Ohhh wait, I get coffee since windows is updating.

      Those who wanted less Swiss cheese on their hard drive were forced to turn to companies like Crowd Strike, since the security department was empty at MS.

      Windows is what cars would be in 2024 if the elder Henry Ford had never gotten any competition: A black T-Ford.

  19. Peter 39

    it all started years ago

    For those too young to remember, there was NT. It was a cheap-and-cheerful implementation and not built with modular architecture. For efficiency (i.e. to avoid lots of expensive context switches), lots of stuff was built to run in the kernel.

    Fast forward a couple of years and hardware is a *lot* faster. Context switches are still costly but the hardware is so much faster that it doesn't matter so much.

    Unix-style systems had modularity from the start. Admittedly, sometimes those modules were placed in kernel-space for speed. But as hardware became faster they were moved out to userland - because they were architecturally separate. Sure - this needed some API tweaks etc but the original design supported it.

    Unfortunately the legacies of NT mean that can't be done with Windows, absent an entire redesign of the OS and everything that goes with it. That would break essentially every piece of Windows software so Microsoft has decided to not do it.

    "Cheap and cheerful". You get what you pay for.

  20. Kiers

    MICROKERNELS?

    Would a MicroKernel OS have been more robust?

    1. Teal Bee

      Re: MICROKERNELS?

      No. First because whatever supervisor runs in ring-0 also needs to grant the antivirus ring-0 access no matter what, or else a virus can cause the antivirus to crash and have unfettered OS access.

      That's the whole point of marking that driver as boot critical – you absolutely don't want the machine to keep functioning without it. The OS is irrelevant in this equation, the customer has already decided that protection is critical by installing an EDR, and the OS has no say in this matter.

      (And second, for the simple reason that microkernels are only useful for academic research and can't run any practical workload, be it benign software or malware.)

  21. b1k3rdude

    Er, the issue affected computers outside the EU as well, so this is one of the weakest excuses from Micro$haft I have seen to date...

    1. Optimaximal

      The location of the computer has nothing to do with it. The EU is a big enough regulatory body that they can exert influence on multi-nationals. They basically told Microsoft 'you need to allow third parties to create security software with the same low-level system access as your first-party software does, otherwise you won't be able to sell Windows in the EU'. Microsoft responded by granting this access. Ultimately, this is just normal behaviour - the same behaviour that forced Microsoft to disconnect Internet Explorer from Windows and allowed Firefox & Chrome to thrive.

      The problem was that *Crowdstrike* had a significant process failure and their software driver doesn't conduct a sanity check on what it's running. These are the problems that need fixing, but it's likely that Microsoft has an agreement with Crowdstrike that prevents them calling them out for this, so the only response available is to diss the regulatory body...

      1. Dan 55 Silver badge

        Had Microsoft responded by creating a documented API for security software instead of saying "have at the kernel like we do, we'll sign your 'driver'" we wouldn't be in this position today.

  22. JulieM Silver badge

    Only one fair solution

    The only fair solution is to require all Source Code of software to be fully accessible to end users and independent security researchers.

    I'm not saying allow everyone to make unlimited numbers of copies of software -- although withholding the Source Code never prevented that anyway.

    Any plagiarism would be obvious, because the plagiarist would also be required to publish their Source Code and so out themself.

  23. FIA Silver badge

    According to a report in the Wall Street Journal, a Microsoft spokesperson pointed to a 2009 undertaking by the IT giant with the European Commission as a reason why the Windows kernel was not as protected as that of the current Apple Mac operating system, for example.

    Basically "It's fine for us to use our undocumented kernel interfaces, as none of our developers ever make mistakes."

    Nice Try.

    :)

  24. Charlie Clark Silver badge

    Smoke bomb from Microsoft

    The EU forced us to give others the same ability to crash the kernel that we have. Boohoo!

  25. tiago.pelicari

    The update has not been fully tested. Period. If Crowdstrike possessed such a privilege then both companies should jointly test any updates before a rollout.

  26. sitta_europea Silver badge

    It's always seemed a bit strange to me to want to run protection software on the systems that you're trying to protect.

    Twenty years ago, for consumer stuff, where basically you only had the one system to make money from (sorry - to work with), I could sort of understand it.

    But for serious users (agriculture, banking, construction, defence, energy, food, government, health, ... you name it) nowadays it makes no sense to me.

  27. Zippy´s Sausage Factory

    "However, nothing in that undertaking would have prevented Microsoft from creating an out-of-kernel API for it and other security vendors to use."

    In other words, Micros~1 couldn't be bothered putting the work in and is now blaming the EU. Seems about par for the course for them these days, nothing's ever their fault any more.

    1. Optimaximal

      That's hardly fair - Microsoft have tried many times to completely tear apart the Windows kernel to improve it, but it ultimately ends up breaking compatibility with all the old software that glues together many of the world's big corporations, so they're effectively forced to keep selling the same leaky sieve with a refreshed user interface.

      There's often no reason why Windows 10 & 11 should need to run old Windows 95/98/NT/XP software, but Microsoft made it so that we can.

  28. Anonymous Coward

    Never mind the OS Buy a BYD

    Even though that pesky megalomaniac Musk decided to use it as an excuse to fulminate some hate on his hate platform (does he want to sell cars? Do people that hate a lot have money for big ticket items? Probably, but not many), it is interesting to ponder how some pretty disastrous ICJ news was due to be made public the same day, and which group could benefit from the distraction.

  29. nijam Silver badge

    > Did the EU force Microsoft to let third parties like CrowdStrike run riot in the Windows kernel...

    Why would they bother? It's not as if it's difficult for third parties to do without any help whatsoever.

  30. rlightbody

    If I was Microsoft, I'd be pushing their own Defender endpoint protection platform. Microsoft have more to lose if they accidentally crash millions of Windows servers!

    https://www.microsoft.com/en-gb/security/business/endpoint-security/microsoft-defender-business

    1. Optimaximal

      If they did that, they'd be accused of attempting to build their monopoly again, especially as it's intrinsically linked to 365 now.

    2. Roland6 Silver badge

      > Microsoft have more to lose if they accidentally crash millions of Windows servers!

      Too big to fail?

      > If I was Microsoft, I'd be pushing their own Defender endpoint protection platform.

      It’s part of 365, so many will probably have already moved across, as it ticks the box and removes a third-party supplier; all in line with MS’s embrace, extend, extinguish policy.

  31. MuleD

    Kernel Level Access who gets it and why

    Time to pull out the tin foil hat. If I were an intelligence agency with unlimited funding and unlimited access to large companies, and I wanted access to everything, I would demand that some of the big players in the market allow kernel-level access to seemingly above-board software producers, and then take a page from the bad actor playbook and pivot through some of the vendors who have been given low-level access.

  32. GeoffAnonymoosehead

    Microsoft have cheek don't they?

    So basically Microsoft are selling a product that, out of the bag, is already inherently insecure.

    Think about that.

    So you have to either apply their own ropey solution to make it secure or buy a 3rd party bit of software to make it secure.

    WHY ISN'T IT SECURE FROM THE GET GO?

    What level of incompetence and arrogance does Microsoft have to blame someone for its own generated problem?

    If it could write a proper secure OS then there would be no need for 3rd party software.

    It is as if Microsoft sold boats: as soon as you got one home and floated it, it would regularly develop holes and try to sink, so you would have to spend all your time plugging holes that spontaneously appear. It would also occasionally just sink with no warning anyway.

    How long would they be in business for?

    Why do we accept it? They have had 11 versions of doing this.

  33. Smartypantz

    Who needs enemies

    When you have friends like Crowdstrike? ;-)

  34. John Savard

    Not a valid complaint

    Neither Crowdstrike, Microsoft, nor the EU is to blame for the fact that Crowdstrike is an antivirus product, and, as such, needs to run at a low level in the operating system. The alternative would be for Microsoft to make the only antivirus software that can run on Windows.
