Do you want speed or security as expected? Spectre CPU defenses can cripple performance on Linux in tests

The mitigations applied to exorcise Spectre, the family of data-leaking processor vulnerabilities, from computers hinder performance enough that disabling protection for the sake of speed may be preferable for some. Disclosed in 2018 and affecting designs by Intel, Arm, AMD and others to varying degrees, these speculative …

  1. chuckufarley
    Coat

    The Foundation of Computational Trust...

    ...Is defined by four words. They are: Security, Transparency, Stability, and Speed. Think of them as Maslow's Hierarchy of Needs in a digital format. Without Speed there is no reason to use computers instead of pen and paper. Without Stability there is no way to trust the Speed. Without Transparency there is no way to trust the Stability. Without Security there is no trust at all.

    I have spent a very long time thinking about this and I do not post it lightly. Please do not respond lightly.

    1. Anonymous Coward

      Re: The Foundation of Computational Trust...

      No complex subject can be broken down into "four words". Except if you work in marketing.

    2. sreynolds

      Re: The Foundation of Computational Trust...

      Why is there no compensation for users? I mean, they are being sold something that doesn't live up to the hype. Surely someone is going to start a class action for, say, 25 bucks per physical core.

      1. IGotOut Silver badge

        Re: The Foundation of Computational Trust...

        They are working as advertised. If you put in code that slows it down, then that is your issue as far as they are concerned.

      2. HildyJ Silver badge
        Facepalm

        Re: The Foundation of Computational Trust...

        There is never any compensation for users, just lawyers.

    3. Blazde Silver badge
      Happy

      Re: The Foundation of Computational Trust...

      I feel the need… The need to trust speed.

      "Please do not respond lightly."

      Oops.

    4. DuncanLarge

      Re: The Foundation of Computational Trust...

      > Without Speed there is no reason to use computers instead of pen and paper

      I think you vastly overestimate the speed of human computation using such methods.

      Try reading up on the creation of Colossus and of the Bombe during WW2 and you may find out that humans are shit at doing computation fast. Even a 486 (running appropriate software) will knock the socks off pen and paper.

      1. Michael Wojcik Silver badge

        Re: The Foundation of Computational Trust...

        A '486? Even a simple adder circuit at any reasonable clock speed is much faster than pen & paper.

        Humans (and other animals) can do a lot of unconscious processing very quickly, but anything that involves conscious thought and can be reduced to a feasibly-computable algorithm is going to be far, far slower than what a machine can do. Brains are good for other stuff.

    5. BHetrick

      Re: The Foundation of Computational Trust...

      “Well, gee, if you don’t need the answer to be right, we can get it for you really fast.” It seems to me that “my data is mine” is part of “right.”

  2. Anonymous Coward

    The trouble is...

    ...timing attacks are hard to solve without screwing performance.

    If the problem is that some operations take longer than others and by observing how long things take you can attack a system, then the only real option is to slow everything down. Either by doing unnecessary work, or by artificially delaying results.

    To be fair though, not *everything* that runs on a computer needs these mitigations all the time. It's long been known that you shouldn't execute programs with different security contexts at the same time. For example, don't schedule your private key signing operations and your web browser's JavaScript engine on different cores of the same processor at the same time.

    Maybe the few programs that actually need high performance should be able to request a "non-multitasking full screen mode" that temporarily disables mitigations, albeit causing everything else to run like garbage?
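As a small illustration of the "doing unnecessary work" option described above, here is a minimal sketch of a classic timing side channel in a byte comparison and its padded counterpart. It is not a Spectre mitigation; the function names `leaky_equals` and `padded_equals` are made up for this example, and pure Python gives no hard constant-time guarantee, which is why the standard library's `hmac.compare_digest` is the thing to use in practice.

```python
import hmac


def leaky_equals(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: run time depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner when the mismatch is near the start
    return True


def padded_equals(a: bytes, b: bytes) -> bool:
    """Inspects every byte even after a mismatch, trading speed for less leakage."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences without branching on them
    return diff == 0


if __name__ == "__main__":
    secret = b"correct horse"
    guess = b"correct h0rse"
    print(leaky_equals(secret, guess), padded_equals(secret, guess))
    # The standard library helper intended for this job:
    print(hmac.compare_digest(secret, guess))
```

The padded version always pays for the full loop, which is the performance cost the comment is talking about, in miniature.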

    1. Michael Wojcik Silver badge

      Re: The trouble is...

      More generally:

      1. Information thermodynamics says you can't discard intermediate results without leaking information.

      2. Discarding intermediate results is what computers do. All of computation is essentially compression: given an additional state in the domain of the physical machine, compress the space of the range down to a state which contains the desired result, then compress that down to the portion that's of interest.

      It's possible to perform computation without discarding intermediate results, using fully-reversible computation, but in the general case that means doubling your circuitry, with all the associated costs in space, power consumption, heat dissipation, and so on. (You actually don't double the latter two, because discarding results is a major cause of heat dissipation; an MIT experiment years ago showed a fully-reversible ALU used significantly less power than its conventional counterpart. But we can't eliminate inefficiency completely, of course.)

      There will always be side-channel information leakage. Some of it can be made inaccessible; the rest can be hidden by whitening. It's not an easy problem, however.
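To make the "discarding intermediate results" point concrete, here is a toy sketch comparing an ordinary AND gate with a Toffoli gate, the textbook reversible gate. The function names are chosen for this example only, and it says nothing about real circuit cost; it just shows that AND maps several inputs to the same output, while the Toffoli gate carries its inputs along on extra lines and can be undone.

```python
from itertools import product


def and_gate(a: int, b: int) -> int:
    """Ordinary AND: two input bits in, one bit out, so inputs cannot be recovered."""
    return a & b


def toffoli(a: int, b: int, c: int) -> tuple[int, int, int]:
    """Toffoli gate: keeps a and b, flips c when both are 1. With c = 0 it computes AND."""
    return a, b, c ^ (a & b)


if __name__ == "__main__":
    and_outputs = {and_gate(a, b) for a, b in product((0, 1), repeat=2)}
    toffoli_outputs = {toffoli(*s) for s in product((0, 1), repeat=3)}
    print("AND: 4 input states ->", len(and_outputs), "output states")
    print("Toffoli: 8 input states ->", len(toffoli_outputs), "output states")
    # Applying the Toffoli gate twice restores the original state.
    print(all(toffoli(*toffoli(*s)) == s for s in product((0, 1), repeat=3)))
```

The extra output lines are where the additional circuitry mentioned above comes from: nothing is thrown away, so more has to be carried.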

  3. hammarbtyp

    Interesting article

    The problem with most security articles is that they offer a black and white assessment of security, i.e. there is a vulnerability and therefore you are exposed. Generally, however, there are a lot of greys. For example, if an exploit requires local access, then there are far fewer possibilities for exploitation, and other mitigations can be put in place.

    So for a bare metal embedded system running on the x86 platform (yes, there are such things), there may be little benefit from the security features as long as code is signed and booted via a TPM etc. Also, if you are trying to wring out every last clock cycle, you may have to turn off any CPU-draining features (although generally jitter is more important than raw speed).

    So in all cases potential exploits should not be taken at face value, and risk assessments should be done to define the mitigation strategy.

  4. Binraider Bronze badge

    Can one obtain a single threaded CPU today? And one deliberately without speculative execution? Is there space in the market for a deliberately simple processor to reduce attack vectors?

    Yes to all of the above? I think so. So where's the niche supplier to fill that need?

    1. Warm Braw Silver badge

      There are a lot of them lurking in embedded systems.

      The trouble is that the complexity of modern software and the scale of its user base means there is a requirement for performance that only probabilistic optimisations can deliver.

      A lot of CPUs give you at least some control over which performance features are enabled - probably a lot easier/cheaper to make that more granular than come up with an entirely new niche product that would have very limited use cases.

    2. Starace
      Devil

      Aerospace

      For aerospace you tend to run a scheduler per core without any bells or whistles like speculative execution - or sometimes even caching - so you can maintain determinism.

      But you don't need a special CPU for that. You just turn off the bits you don't need and run a bare minimum initialisation rather than using the OEM code.

      If you want a simple CPU you just use it in a simple way.

      1. Claptrap314 Silver badge

        Re: Aerospace

        The STI Cell microprocessor, for instance, had a TLB-like cache that could not be turned off. It was also broken in first hardware.

        Yes, Cell was a beast, but it is not at all safe to claim that BIOS settings alone are enough to make a microprocessor predictable.

        Source: I spent a decade at IBM & AMD doing microprocessor validation.

  5. sitta_europea Silver badge

    Quoting the article:

    "This is an extreme example to illustrate the point, we note."

    You haven't been paying attention. This is nothing like an extreme example.

    See my post here:

    https://forums.theregister.com/forum/all/2020/11/20/ibm_power9_specex_flaw/#c_4151005

  6. l8gravely

    Can I just slow down everyone on my EC2 instance?

    So my question is whether my 2x slowdown, because I'm system call heavy, also impacts unrelated processes on the system. I.e. can I do a DoS on a system by just running syscall-heavy apps, which don't actually use a lot of CPU (to make it cheap) but also slow down everyone else on that core/socket?

    That's the question that isn't answered here.
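A rough way to see how much a process pays per system call is to time a cheap syscall against a no-op function call, once with mitigations on and once with them off. The sketch below does only that: it measures the calling process, not the effect on neighbours, so it does not answer the cross-tenant question. The helper name `per_call_ns` and the choice of `os.getppid` as the cheap syscall are assumptions made for this example, and Python's interpreter overhead will blur the numbers.

```python
import os
import time


def per_call_ns(fn, iterations: int = 200_000) -> float:
    """Average wall-clock cost of calling fn(), in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


def noop() -> None:
    """Baseline: a Python call that never enters the kernel."""
    pass


if __name__ == "__main__":
    baseline = per_call_ns(noop)
    syscall = per_call_ns(os.getppid)  # assumed to enter the kernel on each call
    print(f"no-op call:   {baseline:8.1f} ns")
    print(f"os.getppid(): {syscall:8.1f} ns")
    print(f"extra cost per syscall: {syscall - baseline:8.1f} ns (approximate)")
```

Comparing the last figure across two boots of the same machine gives a crude per-syscall mitigation cost for that machine only.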

    1. Anonymous Coward

      Re: Can I just slow down everyone on my EC2 instance?

      "That's the question that isn't answered here."

      I think the answer is, like most things:

      It depends...

      IOW: your question doesn't define enough specific conditions/parameters, and even with those, only someone who's deep in the guts of AWS would be able to answer, if they were so inclined (they probably aren't).

  7. Nate Amsden

    disable mitigations in OS - probably doesn't override firmware mitigations?

    Most of my systems predate the Spectre stuff, so I haven't installed the firmware updates that contain those fixes (I've read some nasty stories about them, the worst and most recent of them here: https://redd.it/nvy8ls). I have seen tons of firmware updates for HPE servers that are just updated microcode, or fixes to other microcode, implying some serious stability issues with the microcode.

    I have assumed the Linux commands to disable mitigations operate only at the kernel level and are unable to "undo" microcode-level mitigations.

    On top of that, on my VMware systems (ESXi 6.5) I have kept the VIB (package) for microcode on the older version since this started. The risk associated with this vulnerability is so low in my use cases (and in pretty much every use case I've dealt with in the past 25 years) that it's just not worth the downsides at this time. I can certainly understand it being different if you are a service provider with no control over what your customers are doing.

    I can only hope CPU/BIOS/EFI vendors offer an option to disable the mitigations at that level, so you can get the latest firmware with its other fixes and just disable that functionality. That probably won't happen, which is too bad, but at least I've avoided a lot of pain for myself and my org in the meantime (pain as in having VM hosts randomly crash as a result of buggy microcode).

    I do have one VM host that crashes randomly: three times in the past year so far. The only log entry indicates that it loses power, sometimes 2-3 times in short succession (and there is zero chance of an actual power failure). No other failure is indicated, and it's not workload related. HPE wants me to upgrade the firmware, but I don't think it's a firmware issue if dozens of other identical hosts aren't suffering the same fate. They say the behavior is similar to what they see in the buggy microcode, but that buggy microcode is not on the system. So in the meantime I just tell VMware DRS not to put more critical VMs on that host, as I don't want to replace random hardware until I have some idea of what is failing, or at least can reliably reproduce the behavior. I ran a 72-hour full burn-in after the first crash, plus full hardware diagnostics, and everything passed. I'm sort of assuming the circuit board between the power supplies and the rest of the system is flaking out, but I'm not sure. The first time, it crashed so hard that the iLO itself hung (I could not log in) and I had to completely power cycle the server from the PDUs (that had never happened to me before); the iLO did not hang on the other two crashes. The server is probably 5 years old now.

    Another question is whether, if version "X" of microcode is installed at the firmware/BIOS/EFI level and the OS tries to install microcode "V" (older), that works, or whether the CPU ignores it (perhaps silently). I haven't looked into it but have been wondering about that for some time now. I'm not even sure how to check the version of microcode that is in use (haven't looked into it either). Seems like something that should be tracked though, especially given microcode can come from either a system BIOS/firmware update and/or the OS itself.
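On the narrower question of checking what is in use: on x86 Linux, `/proc/cpuinfo` normally carries a `microcode` field per logical CPU, and the kernel reports its view of each vulnerability under `/sys/devices/system/cpu/vulnerabilities/`. The sketch below reads both; the function names are invented for this example, the paths are assumed to exist only on reasonably recent Linux kernels, and it says nothing about ESXi hosts or about whether an older revision can be loaded over a newer one.

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[str]:
    """Collect the distinct 'microcode' values from /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


def mitigation_status(
    base: Path = Path("/sys/devices/system/cpu/vulnerabilities"),
) -> dict[str, str]:
    """Map each vulnerability file name to the kernel's one-line status, if present."""
    if not base.is_dir():
        return {}  # not Linux, or a kernel too old to expose this directory
    return {p.name: p.read_text().strip() for p in sorted(base.iterdir()) if p.is_file()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("microcode revisions in use:", microcode_revisions(cpuinfo.read_text()))
    for name, status in mitigation_status().items():
        print(f"{name}: {status}")
```

More than one revision in the first line of output would suggest the cores are not all running the same microcode, which is worth knowing before blaming any particular update.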

    1. yetanotheraoc

      Re: disable mitigations in OS - probably doesn't override firmware mitigations?

      You wrote:

      "They say the behavior is similar to what they see in the buggy microcode, but that buggy microcode is not on the system."

      And then you wrote:

      "I'm not even sure how to check the version of microcode that is in use (haven't looked into it either). Seems like something that should be tracked though, especially given microcode can come from either a system BIOS/firmware update and/or the OS itself."

      It doesn't compute. Do you know what microcode is running, or do you not know?

  8. Paul Hovnanian Silver badge

    Can one ...

    ... turn on/off the anti-Spectre mitigation per thread? Per core?

    I can run my sensitive stuff (logins, encrypt, decrypt) on a slow processor. Or turn protection off when I don't care if malicious actors steal my GTA high score.

    1. Claptrap314 Silver badge

      Re: Can one ...

      A big chunk of those mitigations are made by the compiler. For them? No. As for others, threads are realized as little more than a single bit in the register files, so no. Per core? Probably, but given the nature of Spectre, it makes no sense.

  9. Claptrap314 Silver badge
    Megaphone

    Where is the news here?

    I, and others here with a similar level of expertise, have demonstrated that slowdowns much bigger than this were expected since El Reg broke this story. Those claims were quickly validated. I also predicted that there would be a market for dedicated machines sans mitigations.

    I had an interview a year and a half ago, wherein I observed that given the company's workloads, it would likely be cheaper to rent full boxes & turn the mitigations off. "Yes, we are doing that." I reported this fact in a Spectre-related story shortly afterwards.

    So again--where is the news here?

    1. Michael Wojcik Silver badge

      Re: Where is the news here?

      All true, but additional evidence is always useful to those trying to make the business case.

  10. Michael Wojcik Silver badge

    We don't know about Spectre-class exploits

    "no miscreants are abusing the weaknesses in the real world to steal information, to the best of our knowledge"

    This statement (which we've seen from many people, in many places) is disingenuous. We're not likely to ever know about the vast majority of exploits of microarchitectural side channels. Such attacks are difficult to detect, will likely be carried out in environments where no one is scanning for them, and will result in a wide variety of attacks which will simply remain unexplained or undetected (such as information exfiltration which is never attributed).

    Not everything is ransomware or website defacement. A great many IT security breaches simply go unremarked.

  11. Anonymous Coward

    What's described in the article is spot-on: software that executes a lot of system calls suffers dearly from the Spectre mitigations. I painfully remember that I suddenly had to throw 10 times as much CPU power at one specific production workload.

    It's clear that AWS marketing would weasel their way out of this with "most of our customers" talk, but that doesn't save the few unlucky ones from sinking a lot more money into them to make up for the mitigations...


Biting the hand that feeds IT © 1998–2021