'Beyond stupid': Linus Torvalds trashes 5.8 Linux kernel patch over opt-in Intel CPU bug mitigation

Linus Torvalds has removed a patch in the next release of the Linux kernel intended to provide additional opt-in mitigation of attacks against the L1 data (L1D) CPU cache. The patch from AWS engineer Balbir Singh was to provide "an opt-in (prctl driven) mechanism to flush the L1D cache on context switch. The goal is to allow …
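For readers wondering what a "prctl driven" opt-in looks like from the program's side, here is a minimal sketch. It assumes the speculation-control constants (`PR_SET_SPECULATION_CTRL`, `PR_SPEC_L1D_FLUSH`) found in later Linux kernel headers; the patch discussed in the article may have exposed a different interface. The helper names `_prctl`, `l1d_flush_status` and `enable_l1d_flush` are made up for illustration.

```python
import ctypes
import sys

# Values as defined in later Linux uapi headers (linux/prctl.h); assumed here.
PR_GET_SPECULATION_CTRL = 52
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_L1D_FLUSH = 2
PR_SPEC_ENABLE = 1 << 1


def _prctl(option, arg2, arg3):
    """Call prctl(2) through libc; return its result, or None if unavailable or failed."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        result = libc.prctl(
            ctypes.c_int(option),
            ctypes.c_ulong(arg2),
            ctypes.c_ulong(arg3),
            ctypes.c_ulong(0),
            ctypes.c_ulong(0),
        )
    except (OSError, AttributeError):
        return None
    return None if result < 0 else result


def l1d_flush_status():
    """Query (without changing) the calling task's L1D flush setting."""
    return _prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0)


def enable_l1d_flush():
    """Ask the kernel to flush L1D when this task is switched out; True on success."""
    return _prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, PR_SPEC_ENABLE) is not None


if __name__ == "__main__":
    # Only query here; opting in is left to the caller.
    print("L1D flush status:", l1d_flush_status())
```

The point for the discussion below is that the request comes from the process itself, not from an administrator setting; whether the kernel honours it depends on the kernel version and configuration.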

  1. devTrail

    What kind of opt-in was it?

    I didn't get the description of the opt-in. From the article it seems that it is not a choice of the system administrator, but set by some software. I hope I misunderstood it, otherwise it would be a terrible solution.

    1. Anonymous Coward
      Anonymous Coward

      @devTrail - Re: What kind of opt-in was it?

      You're close! It can be set by any brilliant or dumb developer who chooses to. I can hardly wait to see what malware creators can do with it.

      1. Yet Another Anonymous coward Silver badge

        Re: @devTrail - What kind of opt-in was it?

        And it will become a requirement for all software. After all you can't compromise 'security', so all corporate standards will require a flush after every function call

        1. sev.monster Silver badge

          Re: @devTrail - What kind of opt-in was it?

          Why not just turn off all L* caching and memory completely? Then no one can even know what instructions you're running, not even the CPU.

          1. Anonymous Coward
            Anonymous Coward

            Re: @devTrail - What kind of opt-in was it?

            Then what runs the actual instructions, which by default requires knowledge of what they are? The Chopped Pickles Unit?

          2. Steve Knox
            Trollface

            Re: @devTrail - What kind of opt-in was it?

            Here -- you forgot this:

          3. fobobob

            Re: @devTrail - What kind of opt-in was it?

            Need a multi-GHz 64-bit analog to the i8080 - no out-of-order anything, no pipelining, no nothing.

            1. Charles 9

              Re: @devTrail - What kind of opt-in was it?

              The tri-core POWER-based CPU in the Xbox 360 was relatively simple, and it was noted to be somewhat of a lightweight compared to the Cell CPU of the PS3 (both ran at 3.2GHz IIRC).

              Seems to me simple isn't going to cut it with modern workloads; its versatility will be too limited.

        2. J27

          Re: @devTrail - What kind of opt-in was it?

          Probably not, it probably means that almost no one will "opt-in". "opt-in" options are a great way to ensure no one uses a feature.

    2. Anonymous Coward
      Anonymous Coward

      Re: What kind of opt-in was it?

      Yeah. Opt-in at the level of a distribution that flushes caches as a mitigation *all the time* seems OK if the user/admin can switch it on for especially sensitive systems (if you're air-gapped or trust your software, you may not care for the mitigation to be there and lose performance).

      But if the program flushes it in a not so helpful way, then I can agree with the experts this is less than ideal.

      I think there is an option to flush some cache or use special types of execution in Windows for some of the Spectre mitigations, so that passwords etc. are never cached/susceptible. But I don't know if you could use the same code to try and slow down a Windows OS that way too?

  2. Anonymous Coward
    Anonymous Coward

    funny security

    Funny as every sec. engineer, back in the 90s, didn't trust VLANs to be secure enough to be able to separate different security zones ...

    It turned out, they were, and no compromise was ever shown ... Now, no-one would even require physical LAN security ...

    Now, in the realm of CPUs, it turns out everyone is fighting side-channel CPU attacks on consolidated workloads, because Intel decided to compromise on security.

    Interesting times.

    1. Bitsminer Silver badge

      Re: funny security

      Unpull this comment, please!

      VLANs do leak: multicast data, vlan mgmt frames, etc. Especially from vlan #0. Yes, that brand.

      And I know of at least one TLA that insists on separate physical switches not VLAN-encumbered "efficient" solutions.

    2. Tomato42
      Boffin

      Re: funny security

      oh you haven't heard about attacks that exploit the limited size of MAC table?

    3. moronatwork

      Re: funny security

      Someone has never heard of VLAN hopping and other VLAN shenanigans? You know, an easy Google search would have educated you.

    4. Paul 33

      Re: funny security

      Not just TLAs requiring airgapping between physical networks.

      1. Anonymous Coward
        Anonymous Coward

        Re: funny security

        I doont wanna hav airgapp wenn beeynk fysical.

    5. LeoP
      Stop

      Re: funny security

      Interesting. While we are not a TLA, we do insist on physically separated management segments. I seem to remember the double-tagging attack working quite well on a lot of colo switches.

  3. DemeterLast
    Stop

    git broke English

    "for now I'm unpulling it"

    I like git a lot, but if 'unpulling' escapes into regular use I'm going to track down the git developers and huck rocks at their houses.

    1. S4qFBxkFFg

      Re: git broke English

      Don't you mean "unretain rocks at their houses"?

      1. Anonymous Coward
        Anonymous Coward

        It's 'uncatch', surely.

        I'd get my coat, but -->

      2. Jim Mitchell

        Re: git broke English

        Is "huck" equivalent to "push" in Gitish? (not an expert git speaker, myself)

        1. Tom 7

          Re: git broke English

          "Huck" sounds like the noise you make when you want to swear but your stupidity means you would spit teeth and blood doing so.

        2. John Gamble
          Headmaster

          Re: git broke English

          "Is "huck" equivalent to "push" in Gitish? (not an expert git speaker, myself)"

          I suspect it's a typo, and they meant to type "chuck".

          (And I nearly typed "chick", so the typo demon is clearly about.)

          1. Youngone Silver badge

            Re: git broke English

            "Huck" is a term we used when I was a kid growing up in the Antipodes doing stupid things with fireworks.

            As in "I'm going to huck this double happy on Mr. Waldmann's roof".

            Good times.

            1. Barry Rueger

              Re: git broke English

              "Huck" is a term we used when I was a kid growing up in the Antipodes doing stupid things with fireworks.

              "Huck" was common parlance in Canada as well, which suggests it may have had British roots?

              1. eldakka
                Boffin

                Re: git broke English

                Both "huck" and "chuck" were used in Australia, though I think 'chuck' is more common.

                Interestingly, WikiDiff's article on the differences lists a more extensive set of meanings for "chuck" vs "huck", and notes that one of the meanings of "huck" is:

                Verb

                (informal) to throw or chuck

                Which implies to me that since chuck has many more potential meanings, and that one of the meanings of huck is chuck, that "huck" derived from chuck by just dropping the 'c' to have a word that has a more specific subset of "chuck".

                e.g. chuck steak (steak from the shoulder), chuck steak (throw/toss some steak)

                Whereas "huck steak" really has only one meaning (I think, IANALinguist), to throw some steak.

                1. Anonymous Coward
                  Anonymous Coward

                  Re: git broke English

                  hurl it in my tinfoil blackhat

                2. sev.monster Silver badge
                  Windows

                  Re: git broke English

                  In the deep wood of the US, hucking can mean carrying, usually over the shoulder. By deep wood I mean this usage is chiefly yeehaw. See also ruck.

                  There is a lack of a down-home dumb-as-rocks countryboy icon so I chose the closest approximation.

    2. Saruman the White Silver badge
      Facepalm

      Re: git broke English

      I really, really hate to break this to you, but Linus is the original developer of Git (although I suspect that other people have contributed to it). So whatever you do, please ensure that whatever rocks you throw at Linus' house are very small.

      1. Someone Else Silver badge

        Re: git broke English

        ...cuz after all, houses of glass don't play nice with rocks of any size....

        1. Anonymous Coward
          Anonymous Coward

          Re: git broke English

          Depends on the glass. A glass house made of bank window glass or designed for an area known for frequent hailstorms is likely built to a higher standard.

    3. Jamie Jones Silver badge
      Happy

      Re: git broke English

      Yes. unpulled means an absence of pulling.

      every fool knows it should be de-pulling!!!!

      we're unfriends -> "We've never been friends"

      he defriended me - we were friends but he's canceled our friendship!

      Can't people even make up words correctly?

      1. sev.monster Silver badge

        Re: git broke English

        You intended to say that unpull is ungood, yes?

        1. Anonymous Coward
          Angel

          Re: git broke English

          I think he would (should) consider it de-good.

          1. Claptrap314 Silver badge

            Re: git broke English

            No, it was never good in the first place. That's the point!

            1. Jamie Jones Silver badge

              Re: git broke English

              Yes! Exactly! un-good!

      2. Michael Hoffmann Silver badge
        Big Brother

        Re: git broke English

        So, do we also get double-plus-pull and double-plus-unpull?

        (obvious icon)

      3. heyrick Silver badge

        Re: git broke English

        Strange. It would make more sense to have "he unfriended me" meaning that he was once a friend and now he isn't, and "he defriended me" meaning "thanks to him, I have fewer friends".

        1. Jamie Jones Silver badge
          Happy

          Re: git broke English

          Hmmm, I see you are using "de" as "to reduce" rather than "to remove", as in "devalue"

          In that case, I'd use "defriend" in both cases.

          "Un" is the absence of something.

          "De" is to remove something (and also, I concede, to reduce something)

          Unknown, unavailable, undetermined etc.

          deescalate, derobe, deacidify, deactivate, ...

          Now the fact that you'll be able to find just as many examples that contradict what I wrote above, will be conveniently ignored! :-)

          I note FreeBSD "deinstalls" packages rather than "uninstalling" them. So... erm QED. Checkmate, scientists!

          1. Charles 9

            Re: git broke English

            What about undo?

            Seems to me "un" can be a bit broader: not just the absence of something (as part of an adjective) but also to create that absence (as part of a verb). Meaning uninstalling, undressing, etc. make sense as you're removing (creating the absence of) the installation, clothes, etc.

            1. Jamie Jones Silver badge
              Happy

              Re: git broke English

              As I said, "Now the fact that you'll be able to find just as many examples that contradict what I wrote above, will be conveniently ignored! :-)"

              So.. um... *silence*

      4. Androgynous Cupboard Silver badge

        Re: git broke English

        No no no, unpull clearly means "push". "git push" coming up Linus, thanks for the endorsement!

        1. Charles 9

          Re: git broke English

          Does it? Or does it simply mean to stop pulling (create the absence of pulling, IOW)? Then you have dispull, depull, and so on...

    4. TrumpSlurp the Troll
      Unhappy

      Re: git broke English

      Painful to recall the times a partner has informed me that I have unpulled.

  4. Drew Scriver

    El Reg faux pas

    The stock photo El Reg picked for this article is a rather poor choice, methinks.

    To borrow some verbiage from the article's headline: "Beyond stupid".

    1. maffski

      Re: El Reg faux pas

      You're right. Entirely inappropriate.

      Finland doesn't have school buses so why would one be on the blackboard?

      1. John Brown (no body) Silver badge

        Re: El Reg faux pas

        "Finland doesn't have school buses so why would one be on the blackboard?"

        Because it's a Google Captcha and everyone knows that Google thinks the whole world can solve US-based street scene Captchas 'cos we all use the term "crosswalk" and school buses are ALWAYS yellow.

        1. Claverhouse Silver badge

          Re: El Reg faux pas

          And everyone in the world is familiar with 'Fire Hydrants' * and has a deep fascination with traffic lights.

          .

          .

          * So delightfully 1920s, with winsome dead-end kids frolicking in the spray. **

          .

          * * During the hot summer months rather than a NY January presumably.

          1. Jamie Jones Silver badge

            Re: El Reg faux pas

            And everyone in the world is familiar with 'Fire Hydrants' * and has a deep fascination with traffic lights

            That reminds me of a non-USA centric moan:

            Traffic lights... Do they mean just the bulbs/globes? Or also the box the lights are in? And what about the poles the lights are mounted on?

    2. Totally not a Cylon
      Headmaster

      Re: El Reg faux pas

      Maybe she's telling him off for answering all the questions first.

      He's got to give the other kids a chance......

      And this is the Register.........

    3. Anonymous Coward
      Anonymous Coward

      Re: El Reg faux pas

      Well it's just been changed, but what was inappropriate about the original?

      1. Drew Scriver

        Re: El Reg faux pas

        Original page:

        https://web.archive.org/web/20200602123220/https://www.theregister.com/2020/06/02/linus_torvalds_unpulls_kernel_58/

        1. Anonymous Coward
          Anonymous Coward

          Re: El Reg faux pas

          Thanks, though I had actually seen the original, but I still don't see what was inappropriate about it!

  5. a_yank_lurker

    Real Fix

    The real fix would be for Chipzilla to get their act together as this issue was caused by their blunders. Kernel level fixes are at best a kludge and are likely to be a source of some nasty unintentional bugs.

    1. sev.monster Silver badge
      Mushroom

      Re: Real Fix

      Bugs and terrible, terrible performance.

      Can we just nuke microcode and go all-in on RISC?

      1. bazza Silver badge

        Re: Real Fix

        Hooray, that sounds like a return to PowerPC! (The risc bit)

      2. Charles 9

        Re: Real Fix

        NO, because we still need high performance. We can BS around a wrong answer, but we can't BS around a missed deadline.

      3. Anonymous Coward
        Anonymous Coward

        Re: Real Fix

        So by the same logic would you ban all high-level languages and insist that everything is done in machine code?

        1. Anonymous Coward
          Anonymous Coward

          Re: Real Fix

          Everything is done in machine code. We just load it into the machine.

          1. Anonymous Coward
            Anonymous Coward

            Re: Real Fix

            The tape needs a patch, Marv.

        2. sev.monster Silver badge

          Re: Real Fix

          I don't really feel the scope of the two situations are close enough for that to make sense.

    2. Roo
      Windows

      Re: Real Fix

      Kinda sad seeing a key Intel customer, AWS, flounder around trying to fix Intel's bugs instead of leaning on Intel to actually fix them. AWS are between a rock and a hard place - they either replace the hardware - with the unavoidable chance that the new hardware will also be broken, or they implement performance killing hacks on their heavily utilized shared boxes... Gee, maybe putting all your eggs in one basket was a dumb idea after all...

      1. Gordon 10
        FAIL

        Re: Real Fix

        Hmmm - AWS may be primarily Intel (along with everyone else) but they have AMD and ARM and weird FPGAs as well. So fail icon.

        All our stuff moves to Graviton instances in 2 weekends....

        1. Roo
          Windows

          Re: Real Fix

          The basket is a bit ambiguous, I was thinking of mixing customers on the same physical box - rather than AWS being an Intel only shop (which I know they are not).

  6. Anonymous Coward
    Childcatcher

    There must be a simpler fix...

    Why not have a key generated at the start of a program so that, as the context switches, the cache is encrypted and can't be read without the key?

    Then as the context is switched in, the memory is decrypted. The OS would manage the single-use keys for the applications. There are some additional layers to this but the idea is to make it more difficult and expensive to get at a program.

    Yes, there will be a performance hit but at least your system is secure. (Until someone breaks your key management system within the OS)

    1. Claptrap314 Silver badge

      Re: There must be a simpler fix...

      Cheaper, faster, easier, and more secure to just turn the cache off.

      Encrypting is THAT slow.

    2. theblackhand

      Re: There must be a simpler fix...

      Your fix is changing the target of the attack from application space (i.e. browser or SSH session keys) to the kernel - deduce the code encryption keys at the kernel via a timing attack and you're back to the original issue.

    3. Anonymous Coward Silver badge
      Facepalm

      Re: There must be a simpler fix...

      So instead of writing a series of zeroes to the cache, you intend to read the cache, run it through an encryption algorithm and write it back.

      Which do you think would be quicker? And even the quick solution is too slow for context switching.

    4. FeepingCreature Bronze badge

      Re: There must be a simpler fix...

      Xor cache address input with a per-process key? Xor should be fast...

      This would somewhat degrade cache performance as the switched-to process overwrites the addresses of the switched-from process, but two processes shouldn't share all that much cache anyways. And if we switch back quickly, a lot of the cache should still be intact.

      This seems equivalent to per-address layout randomization (with 64 bits).

      Though if we're screwing with the silicon anyway, might as well tag cache entries per process.
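A toy sketch of the XOR idea above, purely to show the mechanics. The cache geometry (64-byte lines, 64 sets) and the function name `set_index` are assumptions for illustration, not a description of any real CPU.

```python
LINE_BITS = 6   # assumed 64-byte cache lines
SET_BITS = 6    # assumed 64 sets, a toy L1D-sized cache
SET_MASK = (1 << SET_BITS) - 1


def set_index(address: int, process_key: int = 0) -> int:
    """Return the cache set an address maps to, XORed with a per-process key."""
    plain_index = (address >> LINE_BITS) & SET_MASK
    return plain_index ^ (process_key & SET_MASK)


if __name__ == "__main__":
    addr = 0x1000
    # The same address lands in a different set for a different key.
    print(set_index(addr, 0), set_index(addr, 0b101010))
    # Within one process the mapping is still a permutation of the sets,
    # so the key by itself adds no collisions.
    lines = [n << LINE_BITS for n in range(1 << SET_BITS)]
    print(len({set_index(a, 0b101010) for a in lines}))
```

Since XOR with a fixed key only permutes the sets, an attacker who learns or guesses the key is back where they started, which is presumably why tagging entries per process comes up in the same breath.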

    5. Anonymous Coward
      Anonymous Coward

      Re: There must be a simpler fix...

      AFAIK AMD is working on full encryption across the CPU. The problem is that if the software has access at all, it's just a waiting game before you can decode the "key". I also don't think 256-bit AES encryption is going to be blazingly fast on the CPU for every bit of execution. But some simple encryption in memory does mitigate it in part.

  7. Will Godfrey Silver badge
    Alien

    Reminds me of...

    If I be goin' there, I be-n't start from here.

  8. Claptrap314 Silver badge

    And here I thought Amazon would do the right thing

    Silly me.

    To review:

    1) Spectre-class bugs CANNOT be mitigated in current hardware.

    2) The entire point of caches is to speed process execution. Therefore any process with fine-grained access to the clock is going to be able to derive information about the addresses of data held in the cache. With Spectre-class attacks, one can derive information about the contents of the data.

    The only way around this is to ensure that all code running inside the same cache has the same security context.

    So, for Amazon, you are sharing your data with everyone else on the box. Dedicated boxes are required for anything handling PII.

  9. Henry Wertz 1 Gold badge

    Timing?

    I wonder about a different approach... a vast majority of these attacks rely on access to a high-accuracy timer to measure the time between cache misses and hits. I wonder if it would cause any major issues to simply limit user space access to timers (and I guess the Linux jiffy counter?) to something like 1/10,000 of a second accuracy, instead of the nanosecond accuracy it has now. That's what the web browsers did recently: JavaScript in Chrome and Firefox is JIT (Just In Time) compiled and runs fast enough for some of these attacks to work, so they simply made the JavaScript time functions round off their results a bit.
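A rough sketch of the rounding idea, assuming the 1/10,000 of a second (100 microsecond) granularity suggested above. The names `coarsen_ns` and `coarse_clock_ns` are hypothetical, and this shows only the rounding step, not whatever else a browser or kernel might do.

```python
import time

RESOLUTION_NS = 100_000  # 1/10,000 of a second, as suggested in the comment


def coarsen_ns(timestamp_ns: int, resolution_ns: int = RESOLUTION_NS) -> int:
    """Round a nanosecond timestamp down to a multiple of resolution_ns."""
    return timestamp_ns - (timestamp_ns % resolution_ns)


def coarse_clock_ns(resolution_ns: int = RESOLUTION_NS) -> int:
    """Read the high-resolution clock, then discard the fine-grained part."""
    return coarsen_ns(time.perf_counter_ns(), resolution_ns)


if __name__ == "__main__":
    # Two events 80 ns apart (think cache hit versus miss) within the same
    # 100 microsecond bucket become indistinguishable after rounding.
    hit, miss = 1_000_000_040, 1_000_000_120
    print(coarsen_ns(hit) == coarsen_ns(miss))
```

Anything that legitimately needs fine-grained timing loses it too, which is the trade-off to weigh.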

    1. martinusher Silver badge

      Re: Timing?

      > I wonder if it would cause any major issues to simply limit user space access to timers (and I guess Linux jiffy counter?) to like 1/10,000 of a second accuracy or so, instead of the nanosecond accuracy it is now.

      There's a whole world out there of high-frequency stock trading that seems to require high-precision timing for everything, including network transit times and clock coordination accuracy in the sub-microsecond range.

      1. GrumpenKraut

        Re: Timing?

        Also benchmarking/profiling code. Also measurements including timestamp. Fscking up time info means trouble for a whole lot of things.

        Linus is right, no surprise here.

  10. eldakka

    I'm probably misunderstanding this:

    Singh replied: "I am not so sure. A user can host multiple tasks and if one of them was compromised, it would be bad to let it allow the leak to happen. For example if the plugin in a browser could leak a security key of a secure session, that would be bad."

    But as a user, I can run a debugger or dtrace or something and read the memory of any process running under my userid.

    Therefore, couldn't one process running under my ID, if it was being deliberately malicious, just exec a debugger or dtrace (or include that functionality within its codebase) and hook into and read the memory of any other process I own anyway?

    1. Anonymous Coward
      Anonymous Coward

      The process to be infiltrated may not necessarily be "owned" by you, per se. A higher-security context may not be directly accessible to you but only available through syscalls or other means. Cache leaks like this mean you can still read their "private" contents even if it's not a process under your direct control.

  11. _LC_
    WTF?

    Next up: disable the cpu clock

    After nuking (flushing) L1, there's really not much room for "more security". Next up: disable the cpu clock!

    Let's face it, Intel CPUs are broken by design (cheating on correctness & security for more performance). They want to get home. Let's send them there and demand the money back. ;-)

    1. Anonymous Coward
      Anonymous Coward

      Re: Next up: disable the cpu clock

      "broken by design"

      The relationships between USA, Israel, NSA, ISNU, Intel corp. in Israel and USA are somewhat opaque.

      Misdirection of public to have concern about security at higher protocols when backdoors are already built in at chip level.

      Do AWS have an account type that can specify non-Intel hardware? Or perhaps just certain "regions" for particularly sensitive operations...

      Just assume everything can be compromised.

  12. T. F. M. Reader
    Joke

    Obligatory Covid-19 analogy

    1. There is a high risk population (of processes).

    2. They become paranoid about sanitizing (the cache).

    3. As a result, everybody is driven into lockdown (poor performance) and frequent (cache) flushing, etc., regardless of risk.

    4. In the end, Linus (who is Swedish even as he comes from Finland) applies "the Swedish model" and rejects said lockdown, citing the lack of compelling data in the process.

    Am I overthinking it?

    1. Anonymous Coward
      Anonymous Coward

      Re: Obligatory Covid-19 analogy

      Yes.

      Sweden, unlike the rest of the Nordic countries, has just caught up with Spain on death rate and may even aspire to the heights of the UK - currently holder of the highest death rate of the G7.

      But it isn't relevant in the slightest. People can choose how they host. They can, if they want, walk away from containerisation and all that stuff and have their own, dedicated real servers in a data centre. If you live in a developed country, there really is no way to opt out of possible exposure to coronavirus. And CPUs will change to mitigate the threats anyway. In a few years that mod would be like some of the junk DNA we possess that may date back to dealing with ancient viruses.

      1. _LC_
        Alert

        Re: Obligatory Covid-19 analogy

        That's a myth. Besides, the average age of those - allegedly - deceased by Covid in Sweden is 85.

        1. Anonymous Coward
          1. _LC_

            Re: Obligatory Covid-19 analogy

            ... and becoming immune.

            As Corona viruses tend to mutate rather quickly, it is noteworthy that the level of immunity achieved this way is MUCH higher than that of a potential vaccine. These people are, henceforward, immune to this disease; they can't spread it either.

            1. Anonymous Coward
              Anonymous Coward

              Re: Obligatory Covid-19 analogy

              Immunity via exposure is just as prone to failure by mutation as immunity via vaccine. All a vaccine does is trigger immune responses in the body without necessarily making you sick first.

              That said, the structure of COVID-19 does not lend itself well to significant rearranging, much as how measles (a notoriously stubborn virus otherwise) can't seem to find a way around our current level of vaccine tech. Influenza is much easier to rearrange, which is why we can't seem to peg down a universal vaccine for it just yet.

              The catch right now is how long the immunity effect lasts. There are hints that, as with common cold coronaviruses, the effect may not last as long as we'd like (months versus a year or two).

            2. Gordon 10
              FAIL

              Re: Obligatory Covid-19 analogy

              @_LC_

              Immunity means you do not get sick. It doesn't mean you cannot carry and spread it.

              Think of it as picking up poo with gloves on. You don't get it on you but you still stink.

            3. slimshady76
              Facepalm

              Re: Obligatory Covid-19 analogy

              You do realize a lot of people got re-infected with the same strain they caught the first time, and all they got from the first contagion was the inability to spread the disease as fast as the first time they got infected, right?

              1. Gordon 10

                Re: Obligatory Covid-19 analogy

                @slim. I think you may be confused. Can you provide a link to research to substantiate?

                It’s possible, but I don’t believe it’s been proven that it happens at scale for Covid.

                1. Charles 9

                  Re: Obligatory Covid-19 analogy

                  AIUI, relapses have occurred, but the jury's still out on just HOW they're happening. In addition to mutated strains, other possibilities include a dormant virus waking up again, immune system limitations (like in the common cold), and non-viral syndromes.

  13. petef

    The devil you know

    Leaving aside the performance hit for a moment, what security analysis was done on the proposed feature? I'm not saying that it is obviously flawed, but existing side-channel attacks have taken a long time for white hats to identify.

  14. anonymous boring coward Silver badge

    Was any swearing censored, or is Linus moderating himself now?

  15. Grumpy Rob

    Linus is right!

    Linus is right, in that it's not up to the Linux kernel to solve security problems of the type that AWS is trying to mitigate. If the data you're processing is so sensitive, what the hell are you doing running on something in the cloud? It's not just when you're processing the data that you're vulnerable - the input and output data have to be stored in the cloud too. But cloud providers have never been hacked - right?

    I'm just amazed at the number of companies and government/semi-government organisations that are using cloud-based email and data processing/storage. Has no-one done a risk assessment of what could go wrong? But it seems the beancounters have taken over and cheaper beats secure every time - until an excavator or a fire takes out Internet connectivity and the business is left without any access to their corporate data for hours or even days. Yeah - much cheaper!

  16. Anonymous Coward
    Anonymous Coward

    Thanks Linus!

    Well, Linus's answer could not be more clear and logical. Why accept a patch with scary code that can be useful for only a narrow range of CPUs and which, btw, will slow down the whole machine? I'm impressed to see a guy like Linus being incorruptible today; other guys would certainly accept some $$ under the table.
