OY! 'Nuff o' this shite!
Hypersupermultiubergigathreading this and teraultrahypersmeggingthreading that -- fek it all fer a lark. Bin 'em all an' go back to a trusty Acorn.
Linux kernel developers are fixing up a trio of weaknesses in the open-source project – after a Google engineer reported that defenses implemented to stop speculative-execution snooping do not work as intended. In three posts marked urgent to the Linux kernel mailing list on Tuesday, Anthony Steinhauser points out problems …
Is it possible to just remove nanosecond-accuracy timing sources from Linux?
Is there a technical problem with this, like an instruction counter or something that's hard to trap? Otherwise, this'd let most mitigations be left off with no ill side effects.
The short answer is no.
Nor can you simply enforce any sort of useful gatekeeping around it. You can't say that "only trusted code" gets access to the timers, because malware doesn't respect your rules as it is, and there's no effective way to sign all the possible valid apps at this late stage. If it had been baked in from day one, maybe -- but even then it would probably be more hassle than it's worth, as high-accuracy timers are not themselves the problem; they're just a means of exploiting the underlying flaws.
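For a concrete illustration of why, here's a rough sketch -- entirely mine, names and all -- of the counting-thread trick: lock away every OS clock you like, and anything that can spawn a thread on a spare core can still build its own timer.

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Improvised clock: a thread that does nothing but count. */
static atomic_uint_fast64_t ticks;

static void *counter_thread(void *arg)
{
    (void)arg;
    for (;;)
        atomic_fetch_add_explicit(&ticks, 1, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, counter_thread, NULL);

    /* "Time" an operation by sampling the counter around it. */
    uint_fast64_t start = atomic_load_explicit(&ticks, memory_order_relaxed);
    /* ... operation under measurement goes here ... */
    uint_fast64_t end = atomic_load_explicit(&ticks, memory_order_relaxed);

    printf("elapsed: %llu ticks (uncalibrated)\n",
           (unsigned long long)(end - start));
    return 0;
}

Calibrate that once against anything coarse -- even a one-second clock -- and you've got a perfectly serviceable high-resolution timer that no amount of gatekeeping revoked.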
I'll bite. Specifically, what apps use nanosecond timers? Exploitation of these 'CPU' flaws is esoteric lab research at best. It's easier just to phish passwords. I'm still not seeing how one could realistically exploit this stuff. You gotta know the workload, know the process, and sniff around petabytes of data until you match something. All the PoCs involved running two processes in a controlled environment, or sniffing around with a known workload passing through at a known time. I mean, if I've got all that, I'm pretty sure I already have you pwned.
There has not been a single piece of ransomware tied back to Meltdown or Spectre. If restricting access to precision timers is feasible, and takes the bulk of the risk away, it beats crippling performance for day-to-day workloads. I know it's *possible* someone could inject a stored procedure on my SQL server that daisy-chains something to exploit Meltdown. My only question would be: why? I'm already f*cked at that point. If it makes you sleep better, by all means have fun.
Years ago some AMD hardware had defective High Precision Event Timer (HPET) hardware which had to be disabled to keep operating systems happy. This often became company policy which remained in place long after all the defective hardware was retired. Clearly plenty of people were unaware of any problems caused by disabling perfectly good HPETs.
There is a wide variety of timer hardware, and well-made software should ask the OS what is available and what each is capable of, then use the most appropriate one for the task. There are piles of web pages about the effect of re-enabling HPETs on various games. Occasionally someone noticed a clear improvement. Lots noticed about one extra frame per second and some did not spot any measurable change. To me this looks like many games are able to cope with whatever hardware they find, with just the odd combination of hardware and software benefiting from a specific extra time source.
"less /proc/timer_list" shows that lots of timers are being used but it is not obvious what they are being used for and how much precision they really need. An off switch could cause noticeable problems but a reduced precision knob might be harmless and useful.
Now that I could go for -- a setting you can dial in per timer, so if you bork something in your environment -- kill a critical app by mistake -- you can dial it back to the required precision without having to revert the OS to a previous version.
Probably a kernel compile switch. Dynamic settings can potentially be buggered with.
Pff, like I know *specifically* what apps would be affected. I was pointing out the generic class of problem and, as I noted, it's impossible to know with certainty what *would* be affected; all we can be certain of is that a lot of widely divergent code paths *could* be affected because people (rightly or wrongly) use high-precision timers all over the place, for a wide variety of reasons.
As far as "my sql server" goes: you're not wrong, but you're not thinking about it in the right terms if that's your stance. If your server is physically distinct, on your own network, you control all access, etc, then sure, fine, you're right -- this sort of attack isn't really a problem for you and if it becomes one you're already fucked because they already have access to your hardware to pull it off -- and as we all know, if they have access to the hardware, you're fucked.
But that's not the target profile.
This sort of attack is the kind that opens you up to loss because some idiot running alongside your instance in the datacenter wasn't careful, the attacker escaped *their* sandbox (likely using some other exploit), and is now running their own process at the hypervisor layer. What they're after is the encryption keys (for example), so they can peer into anybody's process space at will. You're probably not being targeted at all. It's a shotgun approach.
So datacenter operators are understandably worried about it. People who operate in the cloud *should* be worried about it, though I don't recommend losing sleep at the moment. Chip manufacturers are sweating bullets about it because they know just how bad it could really be, even if your (bold!) statement about "no single piece of ransomware tied back to Meltdown or Spectre" holds true. But dedicated-instance operators? Meh. As you note, there are better ways to get you, if that's your setup.
And to be clear, even in the datacenter, if you are renting whole machines, you are essentially in the clear. In fact, I interviewed with one company that is doing just that -- and running without Spectre mitigations and their eye-watering performance costs.
And, by "in the clear", I mean that anyone hitting you with Spectre has already compromised you to the point they don't need to.
Profilers, for one; accurate timing is quite useful there.
The basic attacks work on the principle of measuring the latency of access to cache lines; it's not really about the precision of the timer.
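To illustrate (my own toy code, x86-64 only, not lifted from any PoC): the whole game is telling a cache hit from a miss, and the gap between the two is big enough that even a so-so timer can see it.

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

/* Time a single load with the TSC and report the cycle count.
 * Hits land around tens of cycles, misses up in the hundreds. */
static uint64_t time_load(volatile uint8_t *p)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;                        /* the load being measured */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void)
{
    static uint8_t probe[64];

    time_load(probe);                /* warm the line: next read is a hit */
    printf("hit:  ~%llu cycles\n", (unsigned long long)time_load(probe));

    _mm_clflush(probe);              /* evict the line */
    printf("miss: ~%llu cycles\n", (unsigned long long)time_load(probe));
    return 0;
}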
Side note: crap timers can be made more accurate with more iterations.
As an example, I don't have a <30ns timer on this machine, and other stuff is happening, like me waffling on El Reg, so if I run my loop eleventy-billion times, averaging out the measurements, I will converge on a very precise measurement.
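Something like this, purely as a sketch (the workload and iteration count are invented): one pass sits well below the clock's resolution, but divide a long run by N and the per-op figure converges anyway.

#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
    volatile uint64_t sink = 0;          /* keeps the loop honest */
    const uint64_t N = 100000000;        /* eleventy-billion-ish */

    uint64_t start = now_ns();
    for (uint64_t i = 0; i < N; i++)
        sink += i;                       /* stand-in for the probed op */
    uint64_t elapsed = now_ns() - start;

    /* Per-op average: far finer than the clock that measured it. */
    printf("%.3f ns/op over %llu ops\n",
           (double)elapsed / (double)N, (unsigned long long)N);
    return 0;
}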