Retpoline support patches are coming soon for Linux
https://www.phoronix.com/scan.php?page=article&item=clear-kpti-retpoline&num=1
Intel's boss has finally admitted software fixes to address the Meltdown and Spectre vulnerabilities in most modern CPUs will incur a performance hit. At the Consumer Electronics Show in Las Vegas on Monday, Brian Krzanich stuck to the line the design weaknesses represented an "industry-wide issue across several different …
No, first of all the problems are different, it's a class of problems nobody bothered looking for, so obviously you will find them at virtually every processor manufacturer using out of order execution as well as some sort of MMU.
Just like in early cryptography many different people were making the same mistakes.
It's what happens when the architecture functions as specified, but the timing with which
the functionally correct results are produced can be used to infer information.
Side band attacks are outright cruel from a designer's perspective. Every tool they have for functional
correctness passes and gives green lights, but the timing with which it produces the correct result gives the game away.
The point being that all the architecturally invisible state, caches, branch history tables, speculative pipeline state etc... that is shared between processes and threads has been assumed to be benign since there is no instruction that returns that state to a visible register.
But... the state is shared and affects timing, and can be used to propagate information from one
context to another.
The Meltdown case looks particularly poor in how much trust it placed in the assumption that speculation and cancelling are benign.
From another perspective, TID, PCID or whatever fields in TLBs are standard but have not been wholeheartedly adopted by Linux. I wrote my own (mini)kernel on a PowerPC 440, which had software TLB fill.
It's all a big mess... but one hardly needs "tinfoil hats" to see how this class of attack slipped past many
chip makers.
"I sense NSA or US Gov fingerprints on this one boys and girls"
Don't underestimate the ability of big tech companies to really stuff up, to copy each other's mistakes when they reverse engineer their competitors' new pieces of kit and to conveniently ignore what they think are only theoretical microvulnerabilities in their rush to get out shiny new pieces of kit to unsuspecting buyers.
FOUR different hardware manufacturers have EXACTLY the same flaw
The phrase "convergent evolution" springs to mind. They set out to solve a (mostly) identical set of functionalities and ended up with a (mostly) identical set of solutions.
The (mostly) bit is why AMD are not vulnerable to one of the exploits..
No paranoia or conspiracy theories required. Although I'm sure that the existing TLAs are more than happy to take advantage..
You are a bit too quick there.
Yes, making RDTSC, or any equivalent high-resolution clock, only "tick" at the 10s granularity instead of the nanosecond granularity would make it a lot harder. However, all of these "fixes" are susceptible to the attacker running the test 10^n times longer and still being able to read at 1/10^n the rate.
Making ALL forms of time information non-functional would leave a non-functional computer. No
TCP/IP retransmits, no correct date stamps, nothing.
Making the computer completely, absolutely secure is possible if you give up functionality.
Take out the plug, burn the battery and fry the hard drive. The tricky thing is to not give up functionality and still keep the security.
Here, KPTI is giving up a bit of the functionality (performance) and that isn't great, but is the best
compromise currently.
To those that down voted the OP.
Why?
It's not trolling, it's not obnoxious, and to me looked like a genuine request for information from someone who had taken what knowledge they had, and drawn a sensible conclusion. The fact that their initial assumption was incorrect does not detract from the fact that some thought had gone into trying to come up with a valid solution to the problem.
To be honest I was just being silly but yes sometimes that can be missed or maybe it wasn't the case at all so you are right about down votes. I did try to use terminology like "invariably" to try and allude to what I was doing and also the sheer dumbness of the idea considering the issue. Maybe it's a case of "too soon".
If there was some way to insert a random fuzz on the RDTSC instruction (which I imagine is the only timer with sufficient resolution to measure a cache miss) then that might work. Alternatively, is it possible to block access to RDTSC from user-space processes? If so, that might cut off one line of attack (though presumably still leave open the "attack VM host from guest kernel" vulnerability, which frankly ought to be scaring the cloud computing industry shitless).
If there was some way to insert a random fuzz on the RDTSC instruction (which I imagine is the only timer with sufficient resolution to measure a cache miss)
Except for the fact that RDTSC is a very useful profiling/debugging tool; cripple it and you open a whole different can of worms.
which I imagine is the only timer with sufficient resolution to measure a cache miss
You imagine wrong, I'm afraid. Read the Spectre paper; their PoC in Javascript on Chrome didn't even use a proper timer, just a WebWorker thread continually decrementing a value in a shared array.
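For anyone wondering what "a thread continually counting" looks like as a clock, here is a minimal sketch of the idea in Python. It is not the PoC from the paper: `CounterClock` and `measure` are names made up for illustration, and a Python thread under the GIL is far coarser than a WebWorker spinning on a shared array. It only shows that a shared counter plus a busy loop gives you a usable "how long did that take" reading with no timer API involved.

```python
import threading
import time


class CounterClock:
    """A crude clock: a background thread that does nothing but count."""

    def __init__(self):
        self.ticks = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._spin, daemon=True)

    def _spin(self):
        # Busy loop: the counter value itself is the "time".
        while not self._stop.is_set():
            self.ticks += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def measure(clock, work):
    """Return how many ticks went by while work() ran."""
    before = clock.ticks
    work()
    return clock.ticks - before


clock = CounterClock()
clock.start()
short = measure(clock, lambda: time.sleep(0.02))   # stand-in for a fast operation
long = measure(clock, lambda: time.sleep(0.20))    # stand-in for a slow operation
clock.stop()
print(short, long)   # the slow operation shows a clearly larger tick count
```

The tick counts vary from run to run and machine to machine; what matters is only that the slower operation reliably produces the larger number.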
Side-channel information leakage is a probabilistic game, not all-or-nothing. Degrade the accuracy of the timer and the attacker just needs more iterations to derive a result with sufficient probability to be useful. (Consider, for example, that an attacker who gets the bits in a password or private key or session cookie with 90% accuracy has now reduced the brute-force work factor by an order of magnitude.)
Degrading the timer increases the attack work factor, which is useful in some situations; degrading it a lot may even make the attack infeasible. But just degrading it a little doesn't do the job.
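A quick simulation of the "more iterations" point, as a sketch under made-up numbers: the 50 ns hit, 300 ns miss and 1 microsecond tick are assumptions for illustration, not measurements, and `coarse_elapsed` / `estimate_duration` are hypothetical helpers. An event that starts at a random phase crosses a tick boundary with probability proportional to its duration, so averaging enough coarse readings recovers a duration far below the tick size.

```python
import random


def coarse_elapsed(duration_ns, granularity_ns, rng):
    """Elapsed time as seen through a clock that only ticks every granularity_ns."""
    start = rng.uniform(0, granularity_ns)   # the event begins at a random phase
    end = start + duration_ns
    ticks_before = int(start // granularity_ns)
    ticks_after = int(end // granularity_ns)
    return (ticks_after - ticks_before) * granularity_ns


def estimate_duration(duration_ns, granularity_ns, samples, rng):
    """Average many coarse readings; the mean converges on the true duration."""
    total = sum(coarse_elapsed(duration_ns, granularity_ns, rng)
                for _ in range(samples))
    return total / samples


rng = random.Random(1234)
HIT_NS, MISS_NS = 50, 300      # assumed cache-hit and cache-miss latencies
GRANULARITY_NS = 1000          # timer degraded to 1 microsecond ticks

for samples in (1, 10, 100, 5000):
    hit = estimate_duration(HIT_NS, GRANULARITY_NS, samples, rng)
    miss = estimate_duration(MISS_NS, GRANULARITY_NS, samples, rng)
    print(samples, round(hit), round(miss))
```

With a single sample each reading is either 0 or 1000 and tells you next to nothing; with a few thousand the two averages settle near 50 and 300 and are easy to tell apart. A coarser tick pushes the required sample count up, which is exactly the work-factor trade-off described above.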
There are no known exploits for the new design flaws - therefore you should update your systems.
The performance impact will be minimal - presumably less than 50 %, so your PC will still be way too fast (when the 1981 model with 640 KB of RAM is ideal and fast enough for all desktop computing).
I wonder if people who run both Linux and Windows will have to apply two different microcode updates.
I am curious to see the impact on sales of X86. Intel especially.
People who refresh gear on a cycle might delay to wait for silicon that is unaffected - which could take a couple years! People who were racing to get off Power or SPARC might do the same or might refresh in place on those platforms instead of paying for a migration.
I have customers who were thinking about migrating custom mainframe apps for their core business and manufacturing operations to SAP on X86/Linux which they might halt or slow down for the same reasons.
Even more so because of the slowdown caused by the patches, the exploit might not get them to move, but having to buy 20-40% more gear? That might blow up an already tight business case. The effect on cloud will also be interesting depending on how the providers react from a price/performance perspective.
We've carried out some initial benchmarks on a production cluster running Windows Server 2016 Hyper-V with IOMeter and other various tools, and it's looking pretty horrendous.
We are seeing disk io performance halve in VMs. Numbers like 60,000 IOPS and 2GB/s dropping to 30,000 IOPS and 1GB/s.
We're also seeing CPU utilisation related to disk IO almost triple on the hosts.
Dell PowerEdge R730s with dual Intel Xeon E5-2697 v4, lots of 10Gb NICs. Same VM just moved between unpatched and patched (plus new BIOS) hosts.
Storage is all network based which is the killer for us
Virtualization takes the biggest hit from Meltdown, which Intel conveniently leaves out as they try to downplay this. With all the push towards 'cloud' the last few years, that hit is FAR larger than the 5% Intel has been claiming for such loads.
Whether that translates into some cloud wins for AMD will be interesting. Usually those don't get much publicity because cloud providers like to keep what they are doing under wraps, but it may show up in AMD's financial results by the end of the year.
Even if they already knew about it, Meltdown and Spectre are nearly useless to the TLAs for any Windows, Linux or OS X system - because they already have a pile of 0 days for those which are far worse. The noteworthy thing about Meltdown and Spectre is that they are hardware-level bugs, so they would work on ANY operating system - so if they had a target running some adversary's formally verified OS that was otherwise hack-proof, they might be able to use this hardware flaw to gain access.
because they already have a pile of 0 days for those which are far worse
I agree, mostly, but it's important to note that Spectre in particular is nicely general (doesn't depend on the OS) and pretty much undetectable, which makes it potentially useful even for the widely-used OSes.
But you're right that for the common OS cases they have better attacks readily available. Well-resourced attackers would probably use CPU-targeting attacks like Meltdown and Spectre mostly against niche systems.
The technique is just rather neat and interesting, it's more likely that people implemented it for those reasons and never thought about security implications.
The 17 US security agencies who we PAY to look for these types of vulnerabilities and protect us have failed us due to incompetence or because it suited them: so I'd follow them.
I mean surely as the NSA I would try to "nudge" the processor designers into not looking into the ramifications of having speculative branching and caching.
I could very well understand that they simply didn't care about that in 1995, as most computers running Intel CPUs were simple single-user machines with protected mode only seen as a crash containment technology, not as a security feature.
hmm.....
Sounds like telling a rock climber to not look down. What happens next ?
Nudging someone into not thinking about something is never a viable route.
It's far more likely they thought that a hard barrier preventing reading data into registers via an architected instruction was sufficient to guarantee security. Sideband is very sneaky; someone had a lot of fun figuring this out.
This isn't a case of "everyone making the same mistake" it is a case of no one having any idea this type of attack was even theoretically possible. Why do extra work if it isn't necessary? AMD missed out on Meltdown mostly due to dumb luck, it wasn't because they foresaw this and designed around it.
Trying to claim there's some relationship or conspiracy with everyone vulnerable is ridiculous. You might as well claim everyone copied the guy who first made a right shoe and left shoe shaped differently, instead of noticing the fact it is based on how human feet are shaped.
This isn't a case of "everyone making the same mistake" it is a case of no one having any idea this type of attack was even theoretically possible.
That's not really accurate. After Paul Kocher published his first timing-attack papers in the mid-90s, we knew it was theoretically possible. And indeed we've had several years of very successful cache timing attack demonstrations.
Also, if you look at the history of the Meltdown/Spectre disclosures, you'll find Kocher and Horn and Gruss, and probably others, talking about what prompted them to look for these issues. It was prior work or comments by other researchers, or personal projects that had been cooking for a while. It's more a case that, prior to last summer, no one who had put two and two together decided to publish. [1]
That's why we had four teams independently discovering Meltdown and Spectre (combined) over a few months. The necessary ideas have been in the air for years. Hell, Spectre isn't even the first family of Javascript-in-browser drive-by cache-timing attacks. And using side channels to recover data across privilege boundaries is old hat.
[1] Maybe there isn't anyone who'd developed these attacks earlier, but that seems pretty unlikely. I'm considering this the first public discovery, not the first actual discovery.
As Intel's Meltdown problem has been around since 1995 it should be possible to track from the earliest CPU the designers involved and how they have influenced other chip designers to have it appear in Apple's and Qualcomm's ARM chips.
As Meltdown doesn't apply to Apple's and Qualcomm's existing ARM chips - only to a core family which isn't in production yet - it shouldn't be necessary.
In any case, there's nothing particularly subtle about Meltdown. Here's how it works:
The first three are pretty much universal for modern general-purpose CPUs. The last is the one that makes a difference. When you're designing your spec-ex circuitry, do you have it make that privilege-boundary check? It's not immediately obvious you should, because you know it's going to throw a fault and discard the results, so they're not directly visible anyway.
The whole idea of spec-ex is "program didn't see it, I didn't do it". Unfortunately side channels make a mockery of that idea.
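To make "program didn't see it, I didn't do it" concrete, here is a toy model in Python. It is a simulation only, not an exploit and not a description of any real CPU: `ToyCPU`, its latencies and the "kernel memory" string are all invented for illustration. The one assumption it encodes is the one discussed above: the dependent access happens before the privilege fault is raised, so the architectural result is thrown away but the cache state is not.

```python
class Fault(Exception):
    """Raised when user code touches memory it is not allowed to read."""


class ToyCPU:
    HIT, MISS = 1, 100   # made-up latencies in arbitrary units

    def __init__(self, kernel_memory):
        self.kernel_memory = kernel_memory
        self.cached_lines = set()   # microarchitectural state, invisible to programs

    def flush(self):
        self.cached_lines.clear()

    def timed_probe(self, line):
        """Access a user-owned probe line and report how long it took."""
        latency = self.HIT if line in self.cached_lines else self.MISS
        self.cached_lines.add(line)
        return latency

    def user_load(self, address):
        """Modelled flaw: the dependent access runs before the fault is delivered."""
        secret = self.kernel_memory[address]   # speculative read, never returned
        self.cached_lines.add(secret)          # probe line number `secret` gets cached
        raise Fault("user-mode read of kernel address")   # result discarded, cache kept


def leak_byte(cpu, address):
    cpu.flush()
    try:
        cpu.user_load(address)
    except Fault:
        pass   # the attacker simply handles the fault and carries on
    timings = {line: cpu.timed_probe(line) for line in range(256)}
    return min(timings, key=timings.get)   # the one fast line reveals the byte


cpu = ToyCPU(kernel_memory=b"hunter2")
recovered = bytes(leak_byte(cpu, i) for i in range(7))
print(recovered)   # b'hunter2'
```

No call ever returns the secret to the "user" code; it comes out purely from which probe line is fast. Move the privilege check ahead of the dependent access in `user_load` and the model leaks nothing.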
Does it ... disable speculative execution? Change how kernel code is cached? Does it do F- all? What kind of performance penalties might I see if I install it?
It clearly doesn't eliminate the problem, or OS patches wouldn't be necessary. Does it improve the performance of the OS workarounds? The microcode is listed as being applicable to a huge range of processors. Is there a breakdown of what it tweaks for each processor type/family?
Does it ... disable speculative execution? Change how kernel code is cached? Does it do F- all?
It may be the update to enable IBRS.
I haven't seen a good description of IBRS yet, but presumably it does what it says on the tin - restricts when the CPU will speculatively execute an indirect branch. Unfortunately that's only one Spectre variant.
As for performance hit: Anyone's guess, but note where indirect branches are most commonly used: things like bytecode interpreters, state machines, and other logic constructs that are often implemented with some sort of jump table. Will it hit C++ virtual-method (vtable) invocation? I dunno. Meltdown fixes are easier to guess about, because we know when a process will context-switch from kernel to user, but estimates for those are still all over the place.
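For readers who haven't met the construct, this is the kind of "jump table" code meant here, sketched as a tiny stack-machine interpreter in Python. The opcodes and the `run` function are invented for illustration. Python itself obviously isn't compiled down to a single indirect jump; the point is only the shape: the next handler is picked by data at run time, which in a compiled interpreter or a vtable call becomes an indirect branch.

```python
def run(program):
    """Tiny stack-machine interpreter driven by a dispatch table."""
    stack = []

    def push(arg):
        stack.append(arg)

    def add(_):
        stack.append(stack.pop() + stack.pop())

    def mul(_):
        stack.append(stack.pop() * stack.pop())

    dispatch = {"push": push, "add": add, "mul": mul}

    for opcode, arg in program:
        handler = dispatch[opcode]   # target chosen by data at run time
        handler(arg)                 # in compiled code: the indirect branch
    return stack.pop()


result = run([("push", 2), ("push", 3), ("add", None),
              ("push", 4), ("mul", None)])
print(result)   # (2 + 3) * 4 = 20
```

An interpreter like this takes one such data-dependent branch per instruction it executes, which is why this sort of workload is where any cost on indirect branches would be expected to show up first.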