CrowdStrike's Falcon Sensor also linked to Linux kernel panics and crashes

CrowdStrike's now-infamous Falcon Sensor software, which last week led to widespread outages of Windows-powered computers, has also been linked to crashes of Linux machines. Red Hat in June warned its customers of a problem it described as a "kernel panic observed after booting 5.14.0-427.13.1.el9_4.x86_64 by falcon-sensor …

  1. ecofeco Silver badge
    Pint

    Well just great

    FUBAR all around!

  2. Gene Cash Silver badge

    Gluttons for punishment

    I would ask "I wonder how many customers they're going to lose because of this..."

    But then I remember the fact that people still use Windows and realize the answer would be "zero, or maybe one or two".

    1. Dagg Silver badge

      Re: Gluttons for punishment

      As this article pointed out, it hit Linux as well.

      The overall problem wasn't Windows, it was CrowdStrike. So if you are using Windows or Linux and you have CrowdStrike installed, then you have a problem.

      There are alternatives to CrowdStrike that will work on Windows; whether they work on Linux is another problem.

      1. Kobus Botes
        Stop

        Re: Gluttons for punishment

        @Dagg

        "So if you are using windows or linux... you have a problem".

        Wrong - from CrowdStrike's analysis: "Systems running Linux or macOS do not use Channel File 291 and were not impacted". And the Linux kernel panics that CrowdStrike had caused earlier were seemingly fairly easy (trivial?) to resolve.

        The real problem (as many commenters, myself included, have said) is the fact that using Microsoft 365 exclusively creates a single point of failure for the company.

        My question is: Will the people responsible for this state of affairs learn from this incident and stop putting all their eggs in one basket? Or will the response be "Lessons have been learned and they have promised that it will never happen again, so all's good and we'll continue as before"?

        I fear the latter will happen.

      2. Ben Tasker

        Re: Gluttons for punishment

        > As this article pointed out it also hit linux as well.

        That's not what the article says, only what the title implies.

        TFA says that the software has caused crashes in Linux previously, not that this latest issue hit Linux too.

        > So if you are using windows or linux and you have crowdstrike installed then you have a problem

        100% agreed, with Macs thrown in there for good measure. The issue wasn't the OS, it was the vendor deploying improperly tested signatures despite them being consumed by extremely privileged code.

        1. 9Rune5

          Re: Gluttons for punishment

          > TFA says that the software has caused crashes in Linux previously, not that this latest issue hit Linux too.

          If you are trying to say that Linus has since implemented kernel changes that would protect the kernel from whatever CrowdStrike did to it back then, then _maybe_ you have a point.

          Microsoft, over a decade ago, tried to limit the AV vendors so that the kernel could be protected better. Unfortunately the EU said 'njet'. That implies it would have been possible to protect against this. I.e. Linux could've been made immune.

          Many of us would love a sturdier kernel. Unfortunately, none of what you've said so far helps promote Linux over Windows.

          Random thought of the day: If you're running, say, an airport with hundreds of informational terminals, do you have to run CrowdStrike on _all_ of them? I'm thinking a 50-50 split might make sense, that way half the machines might stay up in case of a cyber attack or in case a botched update rolls your way.

          1. Richard 12 Silver badge

            Re: Gluttons for punishment

            What Microsoft tried to do back then was abuse their monopoly to create a new monopoly in Windows AV software by making it totally impossible for any other AV software to exist.

            Much like if GM decided your car could only accept GM branded oil.

            This is illegal in the USA and EU, and was therefore prevented.

            It is true that Microsoft having that monopoly would have prevented CrowdStrike from taking down so many computers, because in that world CrowdStrike wouldn't exist at all. Instead it would have been Defender, and as there's no competition whatsoever, there'd be nowt you could do about it.

            Extending a monopoly is unlawful for a reason.

          2. dr john

            Re: Gluttons for punishment

            You mention the airports...

            They probably don't have CrowdStrike - just realised how appropriate their name is - on ANY of their terminals. They are probably accessing a big cloud system of their own, which interacts with several other big systems - say each airline, and other airports, who tell them when a plane has taken off so they can get their arrival times correct. The solution could be for each of these big systems to be duplicated on a Windows cloud and a Linux cloud. But think what that would do to their costs, and to transferring data between the two systems to keep both up to date and accurate. And which is the definitive one - does it transfer from the Windows cloud to the Linux cloud, or vice versa, or what?

          3. Anonymous Coward
            Anonymous Coward

            Re: Gluttons for punishment

            Or change your system architecture to use VDI so there are only a few servers to rollback rather than physically visiting hundreds of desktops. But that would shrink the size of the IT staff, which means a smaller budget and less power for IT. Microsoft's insecurity and craptitude is actually a good thing from the viewpoint of corporate power politics.

        2. Smartypantz

          Re: Gluttons for punishment

          Let me translate the issue for you: Laziness!

    2. hoola Silver badge

      Re: Gluttons for punishment

      This is about Linux, not Windows. Something a few days ago had many on here honking on about how Linux was invulnerable, that the solution to everything is Linux.

      Perhaps, just perhaps, people need to look beyond the OS to the wider issue...

      1. Adair Silver badge

        Re: Gluttons for punishment

        There are no panacea OS solutions, hopefully we can all agree on that. In this instance it is noteworthy which OS crashed and burned, and why. In this instance the related impact on Linux systems seems to have been relatively trivial, although also noteworthy.

        Lessons all round, for those who care to learn them.

        1. Syn3rg

          Re: Gluttons for punishment

          The lesson could be, "3rd-party software shouldn't run in Ring 0"

          1. bazza Silver badge

            Re: Gluttons for punishment

            Unfortunately there isn’t an OS out there that supports security software running on a ring that’s less privileged than the kernel and more privileged than user land.

            Even Linux.

            And when you look at eBPF in Linux and see what it's doing, instead of running the submitted code at ring 1 (or one ring out from the kernel), one has to question the sanity of the designers.
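
            For illustration only, this is roughly what handing code to eBPF looks like via the BCC Python bindings - a minimal sketch, nothing to do with what CrowdStrike actually ships. The point of contention is that the submitted code ends up running inside the kernel, albeit only after the in-kernel verifier has accepted it:

                from bcc import BPF

                # Tiny illustrative eBPF program: note every execve() the kernel sees.
                prog = r"""
                int trace_exec(struct pt_regs *ctx) {
                    bpf_trace_printk("execve observed\n");
                    return 0;
                }
                """

                b = BPF(text=prog)  # compiled here; the kernel verifier must accept it before it can ever run
                b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_exec")
                b.trace_print()     # stream the trace output (needs root)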

            1. rajivdx

              Re: Gluttons for punishment

              Mmmm... Windows has a specific AV API that allows 3rd-party AV code to run at less privileged levels and yet monitor the system for malware activity. However, as Dave Plummer explains, CS explicitly worked around this by writing their code as a driver so it ran in ring 0, AND by loading external unvalidated code into that privileged execution environment. The first was bad, but the second takes the biscuit for incompetent and reckless programming.

  3. eldakka

    > But then I remember the fact people still use Windows and realize that answer would be "zero, or maybe one or two"

    My own organisation doesn't use CrowdStrike, but from many other comments I've seen, many orgs (especially local government, state government) use it for compliance/insurance reasons. And it appears that CrowdStrike is the biggest player in this particular market. Therefore many orgs have little choice in whether they use CrowdStrike or not; it seems to be about the only player in town who can tick a compliance checkbox - which in itself seems to me a significant issue.

    Of course, this may be a leg-up for other, smaller players ...

    1. Anonymous Coward
      Anonymous Coward

      There are other options but a lot of DMs drink the cloud first, cloud native kool aid and think that these types of cloud back end applications are infallible.

      Truth is that anything can get taken down whether by the elements, by malicious intent or by human error.

      Always have a contingency plan to run your business without computing resources because you never know when the feces will collide with the thermantidote.

    2. Ken Hagan Gold badge

      The need to be seen to be using software of this type, in order to get a tick in a box from a regulator or similar authority, is certainly an issue. Whether Crowdstrike will still be accepted as fit for that purpose after the dust settles on this affair, is another matter. For example, if CS actually go under then anyone running their now-unsupported software would be an immediate failure on the box-ticking front.

    3. bazza Silver badge

      If a business has been obliged by their insurer to use CrowdStrike, that could be an avenue for a business to seek redress. It could be argued that they’ve differed a loss akin to a cyber attack despite all prescribed measures being in place in accordance with the insurer’s requirements. The fact that the insurer has itself been the unwitting avenue of attack is not the company’s fault

      1. Richard 12 Silver badge

        It should also mean that the insurer cannot refuse to pay out, because it was the insurer's requirements that caused the incident.

        There will be a large amount of lawyering going on before the dust settles.

      2. bazza Silver badge

        For differed, read suffered. Damned autocorrect...

  4. Outcast!!!

    Should rebrand themselves as ClownStrike.

    1. Mage Silver badge
      Coffee/keyboard

      Re: ClownStrike.

      Or CrowShrike, or Crowdstricken

      The Shrike is also called the Butcher Bird. And a Murder of Crows is unfair. Seagulls (e.g. black-headed gull, herring gull) seem worse than rooks. Magpies, though, are unpleasant and thus solitary (unlike rooks, jackdaws and gulls) and will "take" small birds at a bird feeder.

      1. tiggity Silver badge

        Re: ClownStrike.

        Magpies are not necessarily solitary.

        Young non-breeding magpies tend to form sociable groups (sociable birds, they are young and can learn from each other) - can also get adults joining them at certain times (e.g. Winter, when no real breeding focus & especially if group near that adult's territory as can keep an eye on them, make it clear it is dominant in that area to reduce chance of breeding squabbles later. Can also have scenarios where some of the adult's offspring may be in that group.)

        Like many birds, magpies will often have communal roosts in Winter, again showing sociability.

        Magpies are good hunters, careful observation will show they pay close attention to other smaller birds that nest in their territory, and so will often successfully locate nests and raid for eggs or chicks. As Mage said they will on occasion take birds from feeders (a bit more opportunistic), though other corvids may also do that (personally seen a jackdaw do attacks on passerines at feeders). You will notice the birds at feeders will treat all corvids with caution, not just magpies.

  5. jeremya
    WTF?

    Why use Crowdstrike when you have SELinux?

    SELinux is the ultimate security product for Linux-based machines. It was developed by the National Security Agency (NSA) to secure Government computers. It is maintained by Red Hat and available on most distros.

    The SELinux philosophy is: if it's not explicitly allowed then it's blocked.

    SELinux has pretty good threat detection as well. Every blocked action is logged and available to log monitors, which can start reacting in seconds if not milliseconds.

    CrowdStrike's philosophy seems to be: let everything run, and if I hear of a problem I'll provide a fix.

    SELinux needs you to understand your systems and processes and to enable functionality only when required.

    CrowdStrike lets you treat systems as black boxes and believe the man behind the curtain will make it all good.

    The downside of SELinux is that you must know about the processes and systems you are administering and be patient while developing an optimal configuration. Funny that! I thought that was a basic requirement for systems security administrators. Never mind. With CrowdStrike, the man behind the curtain will make it all good, so you can hire cheap helpdesk staff to set up your systems.
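
    As a rough sketch of that "reacting in seconds" point (illustrative only - the log path is the auditd default and the "response" here is just a print), an AVC denial watcher can be a handful of lines:

        import time

        AUDIT_LOG = "/var/log/audit/audit.log"   # default auditd location on most distros

        def follow(path):
            """Yield new lines appended to a file, like `tail -f`."""
            with open(path) as f:
                f.seek(0, 2)                     # start at the end of the file
                while True:
                    line = f.readline()
                    if not line:
                        time.sleep(0.5)
                        continue
                    yield line

        for line in follow(AUDIT_LOG):
            if "avc:  denied" in line:           # SELinux logs every blocked action as an AVC denial
                print("Blocked action:", line.strip())
                # a real monitor would alert or kick off a response playbook here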

    1. ghp

      Re: Why use Crowdstrike when you have SELinux?

      Is there a Whinedoze version?

      1. jeremya
        Boffin

        Re: Why use Crowdstrike when you have SELinux?

        Windows NT (and hence all current Windows systems) has something similar built in, but it's mostly not used and nowhere near as strict.

        It comes in two flavours:

        1. Discretionary Access Control (DAC) is the primary security model. It allows the owner of an object (like a file or a directory) to control access to it. The owner can decide who can access the object and what operations they can perform on it.

        2. Role-Based Access Control (RBAC), which restricts system access based on the roles of individual users within an enterprise.

        These are only really used in enterprise-managed systems, where policies can apply them to files and other resources.

        A fully locked down Windows system is possible but you'll probably only find it in Government high security machines.

        In contrast, SELinux is quite common on internet-facing systems.
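
        As a rough illustration of the DAC side (the file path and account name below are made-up placeholders), the owner of a file can inspect and change its discretionary ACL with the built-in icacls tool:

            import subprocess

            path = r"C:\data\report.txt"      # hypothetical file owned by the current user

            # Show the file's current DACL entries.
            subprocess.run(["icacls", path], check=True)

            # Grant a (hypothetical) user read-only access. Only the owner or an
            # administrator can change the DACL - that's the "discretionary" part.
            subprocess.run(["icacls", path, "/grant", "alice:(R)"], check=True)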

        1. Snapper

          Re: Why use Crowdstrike when you have SELinux?

          Ah, the famous 'fully locked down Windows system'!

          That's the one in the sealed underground room under the ocean floor, turned off and unplugged?

          Thought so.

    2. Dagg Silver badge

      Re: Why use Crowdstrike when you have SELinux?

      > It was developed by the National Security Agency (NSA) to secure Government computers. It is maintained by Red Hat and available on most distros.

      AND SERIOUSLY YOU would trust ANYTHING from the NSA! Just one word BACKDOOR!

      1. FuzzyTheBear
        Mushroom

        Re: Why use Crowdstrike when you have SELinux?

        That is a legend. If there were a backdoor, it would have been found a hell of a long time ago.

        1. Anonymous Coward
          Anonymous Coward

          Re: Why use Crowdstrike when you have SELinux?

          "If there were a backdoor , it would have been found a hell of a long time ago."

          Absent open source code, I can't agree with that. If I were designing a software backdoor, that is a VERY high value asset, and as a result I'd make sure it was undetectable.

          1. Anonymous Coward
            Anonymous Coward

            Re: Why use Crowdstrike when you have SELinux?

            there is a chip on your board for that

            1. Androgynous Cow Herd
              Pint

              Re: Why use Crowdstrike when you have SELinux?

              Not many folks understand what "Trusted Platform" actually means...

              Have a beer!

      2. Dagg Silver badge
        Big Brother

        Re: Why use Crowdstrike when you have SELinux?

        I just love these downvotes - looks like the church of Linux actually trusts the government... Doh!

    3. FunkyChicken

      Re: Why use Crowdstrike when you have SELinux?

      Why use Devicie when InTune exists? Why use Splunk when syslog exists?

      It’s all the same reason - because they offer an easy to deploy, easy to tune, easy to integrate product. If you have a very small or very simple monoculture environment then maybe SELinux or whatever free tool works just fine for you. For those of us who have to deal with complex, multi-vendor environments and systems, products like Falcon are often worth the trade off.

  6. sanmigueelbeer
    Coat

    And this, ladies & gentlemen, is how you DDoS the entire world.

    Several hacking groups would also like to thank CrowdStrike for the file (that caused the BSoD) -- It will be very handy.

    This (demonstration) makes Petya & WannaCry(pt) look "pedestrian".

    1. doublelayer Silver badge

      Re: And this, ladies & gentlemen, is how you DDoS the entire world.

      And all you have to do is get it to run with kernel-level permissions. If you have the kind of access needed to install this file to break a computer, you don't need it. With that access, you could achieve a similar, if not more severe, effect just by deleting files at random until you are no longer able to delete files. That computer is not booting without a reinstall; no booting to recovery and deleting a file will fix it. The benefit to the hacking community, any section of it, is zero.

      1. Richard 12 Silver badge

        Re: And this, ladies & gentlemen, is how you DDoS the entire world.

        A rather large number of bitlocker keys are currently traversing the world.

        That's the real gift to the miscreants.

  7. Dostoevsky

    Yeah, this was fun...

    Just drove from San Francisco to Dallas in two days flat to make it home in time for work. There were no flights to be had once mine canceled, and the drive was faster. Yuck.

    1. David 132 Silver badge

      Re: Yeah, this was fun...

      Eeeeek. That’s not a “drive home”, that’s a road trip (albeit an involuntary one).

      I drove the 1100 miles from Palm Springs CA up to Portland OR once, stopping only for gas and bladder; not an experience I’m in a hurry to repeat.

      1. Dostoevsky

        Re: Yeah, this was fun...

        LOL, yes. I've been referring to it as my "impromptu Southwestern road trip."

      2. Anonymous Coward
        Anonymous Coward

        Re: Yeah, this was fun...

        "I drove the 1100 miles from Palm Springs CA up to Portland OR once, stopping only for gas and bladder; not an experience I’m in a hurry to repeat"

        Piss in the empty Gatorade bottles, and throw them from the window, then you'll only have gas stops.

    2. Anonymous Coward
      Anonymous Coward

      Re: Yeah, this was fun...

      Could have been worse - you could have been responsible for actually fixing the Windows systems that were down and thus part of the reason you couldn't fly!

    3. Androgynous Cow Herd

      Re: Yeah, this was fun...

      That sucks... Two days on the road and at the end of it you have to live in Dallas.

      1. herman Silver badge

        Re: Yeah, this was fun...

        Well, you know, Debbie Does Dallas - which kinda makes up for it.

      2. Dostoevsky

        Re: Yeah, this was fun...

        Oof! Actually, I'm fortunate to live a couple hours east of Dallas in the "Piney Woods." My car was parked at DFW, though, so I had to stop there first.

  8. Atomic Duetto

    “Kurtz therefore has the possibly unique and almost-certainly-unwanted distinction of having presided over two major global outage events caused by bad software updates.”

    “Once is happenstance. Twice is coincidence. Three times is enemy action”

    At the very least he should be on Oprah to explain himself.

    1. TReko Silver badge

      Kurtz is famous for his "fire fast" mentality. In February this year CrowdStrike had layoffs and moved most tech jobs to India.

      1. Anonymous Coward
        Anonymous Coward

        What a press release:

        CrowdStrike Significantly Invests in India Operations to Continue Protecting Businesses from Modern Cyber Attacks

        We’ll continue to invest in key regions like India to make the Falcon platform, the gold-standard of protection, available to every customer around the world.

        1. Denarius Silver badge
          Unhappy

          Move to India? Haven't we heard that script before? Now waiting for AIX to crash and burn. Some of you commentards seem to miss the critical issue in the CrowdStrike software: no input validation. That used to be taught early in software design and programming classes.

          1. Richard 12 Silver badge

            No testing either.

            The failure rate appears to have been 100% of machines that downloaded it, so CrowdStrike clearly didn't even attempt to use it on a single machine internally before pushing it to the world.

            1. Will Godfrey Silver badge
              Facepalm

              On the other hand

              It's possible that the upgrade was corrupted in transmission... especially as there seems to have been no validity check at the receiving end.

  9. Dagg Silver badge
    Facepalm

    In some ways it is difficult for me to feel sorry for any of these companies that have lost money. I'm now retired but have spent decades in architect / consultant roles, and I've lost count of the number of times I have discussed DR (disaster recovery) planning with my clients and had them turn around and say things like "oh that is going to cost" followed by "Ah, it will never happen".

    Ha Ha

    1. Anonymous Coward
      Anonymous Coward

      Good point

      It takes an event such as this to focus the minds of the Beancounters who control the company spending.

      I was doing a gig at a London company and we were discussing the costs of proper DR. Their main DC was in Canary Wharf and their backup in that bit barn next to the immigrant detention centre on the A4 near Heathrow. The date was 11th Sept 2001.

      The Beancounters baulked at the cost of moving the main DC to somewhere without a huge target on its back - until the first plane flew into the WTC.

      There were a lot of 'Oh Shit' remarks, and that day plans were put in place to move both DCs to less risky locations. I left the gig before the locations were finalised so I have no idea where they ended up, but they were both going to be outside the M25.

      I'd expect that a lot of the same anal exams are going on right now all over the world. A lot of [cough][cough] consultants will make shed loads of money telling the 'C-level' suits what they want to hear, but nothing will really change. If [insert product name here] is needed for compliance purposes then they are stuck with it for the foreseeable future, and the next CrowdStrike disaster will still be there, waiting to rise like a phoenix in flames.

      1. Like a badger

        Re: Good point

        "It takes an event such as this to focus the minds of the Beancounters who control the company spending."

        Will it? The experience of almost every Bad Thing is that people describe it as an unpredictable black swan event, and therefore they couldn't have prepared, and needn't change the way they do things, other than issue the usual "your business is important to us, lessons have been learned" bilge.

        Now, if I had an airline business (or indeed any other business), and it was taken down and disrupted for several days by crappy software, then as soon as the dust had settled I'd instruct my CIO to come up with a range of options to excise Shitbag Software Corp's products from the software stack. But do you think any CEO anywhere is going to do that? I don't.

        1. theOtherJT Silver badge

          Re: Good point

          It seems to be a tragic reality that, despite the fact that we are, as employees, often asked to write up disaster plans, I have never known one to actually be fully implemented after the fact.

          I'm sure a lot of us will have done the "What if..." threat planning game.

          "What if I'm locked out of this device?" -> We have backups, taken nightly.

          "What if we can't retrieve the nightly backup?" -> We have off-site backups that are taken weekly.

          "What if we can't boot the device to restore the backup?" -> We have a backup-restore boot option available on the network.

          "What if the network is down?" -> Our configuration management will repair it.

          "What if the configuration management is down?" -> We have a backup of the config management master, taken weekly, that can be restored from a physical disk.

          So, now everything has gone completely to shit - what's the procedure?

          Step 1: Send an engineer to the server room with a copy of the config management master server backup disk. It will restore a working config management system.

          Step 2: Order the config management to start repairing the network based on known physical states such as MAC addresses and port numbers. If that fails - and the only way it can is hardware failure, and we keep spares - then we'll know where the physical failures took place, because we know the physical location of every asset. It's recorded in the config management system.

          Step 3: Go replace the thing and go back to step 2.

          Step 4: We now have a working network and can reach all the physical hosts, including the backup host. If the backup host is not responding, order the config management system to rebuild the backup host and then reconnect it to the storage targets.

          Step 5: Now we have the backups back, attempt to reach the downed host (or potentially hosts). If you can't, then order the config management system to rebuild those hosts. Either everything is good now or we can't reach it, so...

          Step 6: If the host is somewhere off network, ask the employee who's actually got hands on it to boot it into recovery mode, where it will reconnect to the VPN using the recovery certificate and now we're back to Step 5.

          See, none of that is hard to understand. This sort of thing ought to be standard procedure absolutely everywhere because hundreds of people like myself have played this exact "What if?" scenario through and thought out how to deal with every step in it hundreds of times over the last... pick a fucking number... years. We all know what we ought to do, and by this point there should be thousands of examples of it actually being done. This should not be hard at this point.

          How many people actually do it, though? Not. Fucking. Many. I've worked at precisely one place where we ran through that little thought experiment once a year; the entire team tried to think of ways the process could fail, and if we came up with any, opened work tickets to try and fix them. Everywhere else I've worked has done the "What if?" at some point, although usually only once and only because a new manager rocked up and wanted to see it done, but then none of them ever actually implemented any of the recommendations.

          Actual disaster recovery procedure always seems to end up as some ultra low priority that people love to talk about, but not actually do anything about.

          1. Anonymous Coward
            Anonymous Coward

            Re: Good point

            Part of the problem is also that most companies treat DR and BCM as something you just keep in a cabinet until you need it, instead of exercising it (although I think NIS2 and DORA now sensibly mandate this). They get some consultancy in that writes them a fat BCM guide (translated: takes a fat bible they created elsewhere and does a search & replace on the company name) that nobody so much as reads, and only when the brown stuff starts to be unevenly distributed is it discovered that that effort was a total waste of time - DR and BCM require short, pragmatic processes that are maintained.

            Exercising the scenarios first of all allows them to be debugged, and getting familiar with them reduces stress - and God knows you'll already have plenty of stress when you have to activate such processes.

            I used to write test scenarios, and as it happens I wrote and trained a scenario that pretty much matches this about a decade ago..

            1. theOtherJT Silver badge

              Re: Good point

              Absolutely this. It helps that a proper DR process should be hooking things you already use all the time.

              We knew that our network-boot-to-repair process worked, because it was the same process, with a few different settings passed to it, that we used to deploy machines in the first place. We knew that the restore-machine-from-backup process worked, because it involved shipping disk snapshot diffs around and it was the same process we used to migrate VMs from one host to another (or was, until we got live migration working): shut guest down on host A. Perform backup. Restore guest from backup on host B. Start guest on host B. Delete guest on host A. We knew the "recover broken network" stuff worked, because we used it to build the network in the first place when we last took over a new office.

              All that stuff worked because we actually used it basically every day - but try convincing management at most places that it's worth building an entire automated deployment system for your whole infrastructure because one day you might have to move buildings and "Wouldn't it be nice if we just plugged everything in, updated a few config files, and then it all deploys itself?", when writing that system will take months, but doing everything by hand takes days and you only move buildings once a decade.

              This is the difference between a company that is well managed and one that isn't. People in senior positions who understand that investing in systems before you need them means you can sit around being smug when things do go wrong, rather than running around like someone just set fire to your shirt. It also means that you can take staff time away from tedious tasks like deploying new desktops - because that's now a completely automated process - and put them on useful tasks like actually testing the DR service and working out how to continuously improve it.

              It is something of a personal bugbear of mine that so few places are prepared to invest in this sort of thing, because having experienced both, the difference between working at a place that has it and a bunch of places that don't is night and day. Not only was my life so much less stressful, my day-to-day work was more interesting, and overall the whole thing was cheaper for the business: because all these processes actually worked, we were able to run a 1000+ employee company on five members of support staff, since everything was so highly automated.

      2. Denarius Silver badge

        Re: Good point

        Right to be skeptical about manglement, but wasn't planning for "Black Swan" events a thing a decade ago?

        1. Anonymous Coward
          Anonymous Coward

          Re: Good point

          Aren't they all owned in the UK by the Royal Family?

          No, wait, those are the white ones, my bad.

          :)

    2. hoola Silver badge

      With that expertise you will understand that before any DR recovery you have to understand what is causing the issue and stop it from wiping out your recovered systems.

      In this case, until there was absolutely no chance of CrowdStrike promptly updating the recovered systems (once it was identified as the cause), you could do nothing.

      The scale of the debacle also makes it very difficult to leave affected stuff in place.

      There is so much reliance on tech it is almost impossible to implement a manual fallback that can operate at the same capacity as the broken system.

      Airline baggage being a perfect example.

      1. Anonymous Coward
        Anonymous Coward

        No chance

        – What happen ?

        — Somebody set up us the bomb.

        - How are you gentlemen !!

        - All your base are belong to us.

        - You are on the way to destruction!

        – What you say !!

        - You have no chance to survive make your time

        – Take off every 'Reboot'.!!

        – You know what you doing.

        – Move 'Reboot'.

        – For great justice.

        [never gets old…!]

      2. Dagg Silver badge

        > you will understand that before any DR recovery you have to understand what is causing the issue and stop it from wiping out your recovered systems

        You actually need to stop thinking only in the IT space. DR MUST also cover the manual processes that will need to be used while the IT systems are down.

        Little things like:

        * Manual sales

        * Manual stock xfers

        * etc

        These would be captured on paper (remember what that is) and then can be keyed into the systems once they return. Part of DR planning is doing things like printing out the nightly inventory reports so that these can be used while the system is toast.

  10. Groo The Wanderer Silver badge

    In this case, I blame management's continued incompetence, no lessons having been learned the first time. In fact, I'd have to wonder if the second occurrence might not be malicious...

  11. HeIsNoOne

    "We're in the process of operationalizing an opt-in to this technique"

    "operationalizing" is a word now? My manager-speak must be rusty.

    1. Phil O'Sophical Silver badge

      Re: "We're in the process of operationalizing an opt-in to this technique"

      If they write their code in the same convoluted way no wonder it's crap!

    2. Joe W Silver badge

      Re: "We're in the process of operationalizing an opt-in to this technique"

      Remember folks: "verbing weirds language".

      (B Watterson)

      1. CRConrad Bronze badge

        Re: B Watterson

        Aha, so it was from C & H. I'd forgotten that.

  12. This post has been deleted by its author

  13. glennsills@gmail.com

    The problem is operational

    The issue is not the operating system. If you have software that must be continuously available, you cannot blindly trust updates to the OS, any OS, from the vendor. Regardless of whether the OS comes from Red Hat, Microsoft, or whoever, updates need to be tested outside production before you roll them out. This can be expensive, since security updates can be frequent, but if you want your software to be continuously available without fail there is no way around it. After all, even if you could trust the vendor not to break the OS (clearly you can't), you certainly cannot trust the vendor not to break any application software that the vendor doesn't know about.

    1. Adair Silver badge

      Re: The problem is operational

      Having said that, wouldn't it be useful/sensible, essential even, to have a frontline OS that doesn't throw itself down on its back, feet waving helplessly in the air, before melting into a gruesome irrecoverable puddle, just because some third party code isn't up to snuff (or even first party code, if it comes to that)?

      IOW, an OS that, having fallen over, actually has the means to pick itself back up via its digital bootstraps and fallback into the most recent 'usable' version of itself.

      1. doublelayer Silver badge

        Re: The problem is operational

        Often, it is considered the OS's job to execute the software provided, and if you've chosen to let that software run at kernel level because you want it to have access to everything, that means it can mess up the kernel. An operating system that allows you to install software at that level is not compatible with one that can prevent errors executed at that level from having deleterious effects.

        So we move on to your next suggestion, which is more plausible, of automatic recovery. That one can work. Have a versioned filesystem, and whenever you have a kernel panic, rewind to an older version and boot that. Of course, if the panic happened because some hardware failure triggered a kernel bug, then you'll end up rewinding yourself to the earliest version available as it panics every time, and it might provide a method for an attacker to remove recent updates in order to reactivate a vulnerability, but in principle the idea would work and those additional dangers could be mitigated by other protections. We would have to figure out what those protections should be and design them, but your second suggestion is possible.
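
        For what the rewind idea might look like in practice, here's a toy sketch (assumptions: the root filesystem keeps read-only btrfs snapshots, something run early in boot executes this, a separate "boot succeeded" job resets the counter, and the subvolume ID is a placeholder):

            import json, subprocess
            from pathlib import Path

            STATE = Path("/var/lib/boot-rewind/state.json")
            MAX_FAILURES = 3                     # after this many failed boots, fall back

            # Record this boot attempt; a "boot succeeded" unit would reset the counter.
            state = json.loads(STATE.read_text()) if STATE.exists() else {"failures": 0}
            state["failures"] += 1
            STATE.parent.mkdir(parents=True, exist_ok=True)
            STATE.write_text(json.dumps(state))

            if state["failures"] > MAX_FAILURES:
                # Placeholder subvolume ID 256: make the previous known-good snapshot
                # the default root for the next boot, then reboot into it.
                subprocess.run(["btrfs", "subvolume", "set-default", "256", "/"], check=True)
                subprocess.run(["systemctl", "reboot"], check=True)

        A hardware fault or an attacker deliberately forcing rollbacks would defeat this naive version, which is exactly why the extra protections mentioned above would be needed.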

      2. Dagg Silver badge

        Re: The problem is operational

        Again, stop considering just the OS or even other parts of the software. There are problem areas like long-term power loss such that the UPSes run out, physical damage to the data centres, etc. These are operational issues that will cause problems.

        1. Anonymous Coward
          Anonymous Coward

          Re: The problem is operational

          Basically very basic BCM..

  14. abend0c4 Silver badge

    CrowdStrike on Sunday teased a rapid recovery tool

    Automatic closers are of little use on stable doors.

    1. This post has been deleted by its author

  15. that one in the corner Silver badge

    The occurrence of kernel panics mere weeks

    > before CrowdStrike broke many Windows implementations therefore hints at wider issues at the security vendor.

    That they accidentally released the Linux borkage weeks ahead of schedule, before it was ready to really screw things up. After that, they continued as planned to really knacker a load of Windows boxes, but had to hold back the completed Linux version "because the sysops were already on their toes and wouldn't be so easily caught out".

    The CrowdStrike C-suite were annoyed at the resulting partial success, but the CFO pointed out it gave them a bit more time to practice holding pinkies to mouths and getting the correct "eee" in "beeelions".

  16. Anonymous Coward
    Anonymous Coward

    Outsourced to India Feb 2024

    I'm not saying it is related, but a lot of companies are doing this; at the very least it can increase risk whilst transitioning.

    If all IT is offshore or in Public Cloud then you lose control to some extent.

    https://www.crowdstrike.com/press-releases/crowdstrike-invests-in-india-operations-to-continue-protecting-businesses-from-modern-cyberattacks/

  17. MSArm

    Hilarious

    Now how do all the Muppets who smugly told people to use Linux feel?

    Linux is good, but Linux fanboys who deride and scoff at Windows or Mac users really are pillocks of the first degree.

    1. Ken Hagan Gold badge

      Re: Hilarious

      Looking at the wipeout of a few Windows-based enterprises and contrasting with a few Linux boxes that went down but were easy to bring back up again, I expect the muppets are still feeling pretty fucking smug.

    2. Anonymous Coward
      Anonymous Coward

      Re: Hilarious

      I'd say a difference of about 1000:1 in failure rates is still plenty of reason to actually remain smug...

      :)

  18. shawn.grinter

    Bad procedures

    Who in their right mind lets a third party directly patch a production/customer-facing system?

    1. Anonymous Coward
      Anonymous Coward

      Re: Bad procedures

      But but cloud! And AI! And blockchain! And the next buzzword du jour!

      How can I possibly justify my remuneration being a thousand times that of the peons if I don't force pointless, destructive changes on the business to chase the latest fad?

  19. SunnyS

    The nine most terrifying words in IT are: I'm from the CrowdStrike, and I'm here to help.

  20. sitta_europea Silver badge

    "Former Microsoft operating system developer David Plummer has shared his dissection of the flawed CrowdStrike update HERE."

    [My emphasis.]

    "Please update your browser."

    [Emphasis hardly necessary.]

  21. t245t Silver badge

    CrowdStrike panics Linux kernel :o

    Why would anyone in their right mind even run anti-virus software on their Linux box?
