Old is new again?
Theo from OpenBSD had a rant about Intel and a similar problem in 2007. https://www.theregister.co.uk/2007/06/28/core_2_duo_errata/
And people say I'm crazy for using SPARC.
In the wake of The Register's report on Tuesday about the vulnerabilities affecting Intel chips, Chipzilla on Wednesday issued a press release to address the problems disclosed by Google's security researchers that afternoon. To help put Intel's claims into context, we've annotated the text. Bold is Intel's spin. Intel and …
From BBC News: Rush to fix 'serious' computer chip flaws
Typically, both parties agree not to publicise the problem until a fix has been implemented, so that criminals cannot take advantage of the issue.
This time it looks like somebody jumped the gun and information was leaked before a software fix was ready for distribution.
I haven't heard about anyone "this time" mysteriously jumping any gun, what do they mean by that?
"This time it looks like somebody jumped the gun and information was leaked before a software fix was ready for distribution."
Google have got form for disclosing security flaws before fixes are ready.
I can't help wondering if they have some new AMD-powered Chromebook waiting in the wings to hit the market in a couple of weeks time...
"Google have got form for disclosing security flaws before fixes are ready."
Giving the other party 90 days to fix, followed by them going into radio silence up to and beyond the 90 days, isn't quite the same as what happened here.
Seems more likely that these bugs affected so many systems that many more folk needed to be told. When so many people have a need to know, a leak becomes almost inevitable.
> I haven't heard about anyone "this time" mysteriously jumping any gun, what do they mean by that?
As the article reads like a cut 'n' paste job, it's not worthwhile trying to analyse it. The layers of obfuscation are geological in complexity: suggestions it's just a software issue, poor Intel with some fearful leaker being nasty to them, suggestions of it all being sorted if only they'd had time etc etc. Even El Reg has had to conflate two separate issues, and each is complex. Expect colleagues, family and friends to start conversations with you today with "I don't understand what they mean..."
Positive, he's the only person running it :-)
Nah. We have a couple of old[1] SPARC-based E450's in the computer room doing old[2] Oracley stuff. Fortunately, one of them just got supplanted by a system upgrade so I'll be able to switch it off soon.
[1] Can anyone else remember when the E450's were new and oooh-shiny? I can..
[2] As in 'written last century' old. Or as we like to call it "johnny-come-lately new fangled stuff"
Actually, because of the design of the SPARC processor, it is immune to Meltdown. To be technical, unlike x86, SPARC processors have a separate TLB (Translation Lookaside Buffer) for kernel pages only. That's the source of the slowdown on x86. With the kernel fully removed from the process address space, a full context switch is needed because now you are fully switching address spaces, and the TLB contents are dumped. For a TLB miss, it takes 2-5 memory accesses to read in a page table entry, and at roughly 20ns access time compared to sub 1.0ns access time for a cache hit, you are looking at a performance hit that is two orders of magnitude slower than a cache hit. In case you are wondering, the TLB is the cache that is used by the memory management unit for translating virtual addresses into physical addresses.
The Sparc may or may not be vulnerable, but I don't think this explanation covers it.
The Intel issue comes from a speculative code path loading data into the L1 cache from mapped but protected memory. This has nothing to do with TLBs -- the real question is whether the kernel address space is accessible to speculatively executed instructions.
Intel's error came from allowing a speculative load to proceed even when the privilege ring was wrong for the access to complete.
SPARC systems based on SPARC V9 (since 1993) have kernel page table isolation, with separate kernel and process address spaces, so they should not be affected by Meltdown. Here's the documentation: https://books.google.se/books?id=3Ys27a_I1tEC&pg=PT552&lpg=PT552&dq=address+spaces+on+SPARC+systems&source=bl&ots=5mwktQtlHt&sig=mIu8LXvWq1LbN3kqS6PWP-5ldvU&hl=sv&sa=X&ved=0ahUKEwin_9OD2r7YAhXQJ1AKHUcIA38Q6AEIRjAD#v=onepage&q=address%20spaces%20on%20SPARC%20systems&f=false
If you patch your hypervisor and guest OS, do you take a double hit on performance?
Also, according to Red Hat's security note, Spectre affects all CPUs that use speculative execution, including AMD, ARM, and POWER architectures. Are they incorrect?
Finally, if you're going to label one vulnerability Spectre, why not call the other one SMERSH?
Inquiring minds want to know!
"If you patch your hypervisor and guest OS, do you take a double hit on performance?"
As the bug is all about speculative execution, which happens at the hardware level, you should only have to patch the hypervisor. Linux at least is allowing you to turn off the patch, presumably so the guest OS does not have to take the hit as well.
As for Windows, that is anyone's guess atm!
Good question. MS seem to indicate that if the hypervisor is patched, the guest is protected. So will the guest OS detect that it's running on a patched hypervisor and not "double implement" the memory protection?
https://azure.microsoft.com/en-us/blog/securing-azure-customers-from-cpu-vulnerability/
The hypervisor patch protects the hypervisor provider, not the guest OS user.
The guest isn't protected until it is patched.
Azure/AWS/Goog etc are all protecting their infrastructure from pivots between customers. If customers don't patch their own systems they are unprotected. Remember cloud providers protect the cloud, customers protect themselves.
There shouldn't be a double performance penalty though.
Not 100% sure, but I don't think that is the correct analysis: if it's a hardware bug patched in hypervisor software, you don't need to patch virtual servers built on top, unless the virtualization software has the same bug as the hardware.
I.e. if the hypervisor flushed the data that was speculatively executed, it's not there for the VM to abuse either.
If only the hypervisor needs to be patched, why are Microsoft rebooting affected Virtual Machines on Azure? Surely they can migrate them between patched hosts?
MS being lazy, or something else?
PS, I'm a software guy, not hardware or OS, so be kind...
Not being at Microsoft I couldn't say for sure, but my guess would be the scale of the problem. They might just not have enough servers to migrate people en masse that way, and have found it's quicker and less risky to have a small amount of downtime and just patch everything immediately.
[Intel CEO is invited to explain his share activities to long time investor E. S. Blofeld]
"We don't take kindly to failure, Mr Krzanich. Security, please show Mr Krzanich my collection of tropical fish. The carnivorous ones will be particularly interesting..."
"In Azure, there is no Live Migration facility. "
Azure has live migration now - you can pay extra for it.
I'd guess that capacity and the scale of the issue is the bigger thing here; they need to patch every box in the datacenter and need to do it fast. Easier to just bounce all the hosts at once than schedule a bunch of VMs cascading across the network.
I was trying to work this out... on the whole I concluded I didn't understand enough to know but if the compiler generates code that does speculative loads aren't you still screwed (although of course you can't blame the processor, just the compiler). See https://blogs.msdn.microsoft.com/oldnewthing/20150804-00/?p=91181
if the compiler generates code that does speculative loads aren't you still screwed
I'd think not, or much less so. Speculative accesses inside the processor that never get committed could perhaps access memory that the user shouldn't have permission to touch, but compiler-generated speculative loads are going to have to use standard instructions, albeit out-of-order. They will be constrained by the protections available to whatever context the complete instructions are executed in, i.e. a user process won't be able to access kernel memory.
They're also easier to fix, either by a compiler patch or perhaps some post-processing of binaries.
With the Itanium processor, virtual addresses are global: the bits of the kernel have one set of virtual addresses while all user processes have others. The cache tags and the TLB basically know who a bit of address space belongs to. Sadly it's been too many years since I played that far inside the Itanium to remember whether there is a region register which is programmable from user privilege level (like the one user-programmable space register on PA-RISC), but even if a long virtual pointer were used from assembler, the access would be blocked by the TLB's protection mechanism.
The extensive Google Blog post suggests that AMD processors are only "less" vulnerable - and I don't imagine the investigations have yet stopped. Specifically, they found they could use side-channel attacks to get memory contents from the same process on an AMD chip (hardly a big deal, but still a warning flag) and they could read the entire contents of kernel memory on an AMD chip IFF the Berkeley Packet Filter (BPF) Just-In-Time compiler is enabled in the kernel. Is that an AMD bug or a Linux bug? Can you actually assign "blame"?
Since the entire vulnerability is related to the relative execution time of cached and non-cached operations, it's difficult to believe that there are not other potential exploits to be discovered. The BPF issue is interesting because it means that the ability to inject any arbitrary code into the kernel, even code that is statically proven to be "safe" in traditional software terms, is in fact a potential vector for side-channel attacks for which there is no obvious mitigation.
That's a very big deal for a lot of Linux-based firewalls and probably many other applications.
they could read the entire contents of kernel memory on an AMD chip IFF the Berkeley Packet Filter (BPF) Just-In-Time compiler is enabled in the kernel.
The name "Berkeley Packet Filter" should be a give-away - this is part of the firewall in FreeBSD derived systems, Linux uses a different firewall, as does OpenBSD. This may affect a large number of routers which use BSD derived code - a very high risk since, in most cases, (a) this is not obvious to the owner/user, and (b) they are very unlikely to be patched.
Routers are a great target for malware - because they are Internet connected and always on.
The good news is that this should be easily patched IF the manufacturer is threatened with sufficiently serious consequences - which may or may not include "cruel and inhuman torture" - IANAL.
Linux uses a different firewall
The Linux kernel supports eBPF since version 3.18 and the exploit was demonstrated using a Debian distribution, though by default eBPF would not be a configured kernel option so it might not be so widely used.
It isn't necessarily an easy patch (for any architecture) and it applies to any JIT code (not just BPF), so I assume there may be some impact on nft for Linux. ARM recommend changing the code emitted by the JIT compiler using new conditional speculation barriers. I'm not sure what options are being proposed for other platforms, but they could well have performance implications of their own aside from those already being discussed.
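If you want to check whether the JIT is actually switched on for a given box, a minimal sketch (Linux-only, assuming the standard sysctl path; 0 means interpreter only, non-zero means JIT-compiled programs) would be:

/* Rough check of the eBPF JIT state via the bpf_jit_enable sysctl. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/sys/net/core/bpf_jit_enable", "r");
    if (!f) {
        puts("bpf_jit_enable not exposed (JIT absent or kernel too old)");
        return 0;
    }
    int val = 0;
    if (fscanf(f, "%d", &val) == 1)
        printf("net.core.bpf_jit_enable = %d (%s)\n", val, val ? "JIT on" : "JIT off");
    fclose(f);
    return 0;
}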
Bad form, I know, to reply to my own post, but there was one other thing that occurred to me.
Computer architecture has historically assumed that you controlled your computer and the workload that ran on it. The protections in place were there largely to mitigate against mistakes - bugs in your software taking down other software or the computer itself. For the most part, your computer ran software whose behaviour was predictable. The improvements in memory capacity and CPU speed largely depend on that predictability.
The reason that Spectre, Meltdown and, previously, Rowhammer are issues is mostly because the assumption of ownership is no longer valid. Either you have consciously chosen to run your software on someone else's computer (cloud) or it's possible that ownership of what you believe is your own computer has been ceded to criminals (possibly with the backing of state resources).
If you control your own computer, it doesn't really matter if you can read the kernel memory from user space. If you don't, pretty much every statistically-based optimisation (whether it's DRAM stability or branch prediction) is up for exploitation by software that's designed to skew the statistics and either gain knowledge that it shouldn't have or deny service (for example by forcing cache flushes).
Computer architecture hasn't changed a great deal in principle from the days of co-operative time sharing - perhaps it's time it was reinvented for an explicitly hostile environment.
"If you control your own computer, it doesn't really matter if you can read the kernel memory from user space."
Until you go to a website that has some dodgy javascript running in user space (natch), which can then start reading stuff in kernel memory (or presumably anywhere else).
Unless you're writing or auditing every byte of code that runs on your computer, you can only hope that you truly control your computer.
Browser vendors are already rushing to prevent these attacks being exploitable from JavaScript. The attacks require precise timing measurements, so they are reducing the precision of timers available to JavaScript. This will make them very hard to exploit from JavaScript.
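Purely as an illustration of what "reducing the precision" means (the 20 microsecond bucket below is an assumption for the example, not what any particular browser ships), the timestamp handed to untrusted code is rounded down so a single read can no longer separate a ~1 ns cache hit from a ~100 ns miss:

/* Illustrative sketch only: coarsen a high-resolution clock before exposing it. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static uint64_t coarse_now_ns(void) {
    const uint64_t bucket = 20000;            /* assumed 20 us granularity */
    return (now_ns() / bucket) * bucket;      /* round down to the bucket boundary */
}

int main(void) {
    printf("raw:    %llu ns\n", (unsigned long long)now_ns());
    printf("coarse: %llu ns\n", (unsigned long long)coarse_now_ns());
    return 0;
}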
Actually, no. Mainframes could run different workloads for very different users, and when they started to rent their "time", they had to ensure separation of workloads.
But x86 CPUs have their roots in simple chips designed with no security at all. When protection mechanisms were added, often they weren't used because the incurred overhead was deemed too high in performance terms.
In many ways, mapping kernel memory into user space but keeping it "hidden" through the paging mechanism is a kind of performance trick. From a security point of view, having to switch address space fully is much more secure - the problem is that with the current implementations that's slow.
That's the same reason why the full four rings, segments, etc. were never used: too many cycles required when crossing a ring boundary. IMHO CPU design should look at ways to reduce the security checks overhead properly, instead of just trying to bypass it.
Re: If you control your own computer, it doesn't really matter if you can read the kernel memory from user space.
Hardly.
Take Berkeley Unix, which is where Unix page-based memory management came from. The assumption here is that the computer is owned and run by the CS dept. but used by bloody students whose primary interest is in buggering things up. No OS designer has ever wanted its internal data read by unprivileged user code. If such behaviour was considered a "good thing" then /dev/kmem would be world readable. The kernel often has data which should be private to other processes running on the same system, so it damn well should ensure that it's private.
"If you control your own computer, it doesn't really matter if you can read the kernel memory from user space."
This is true only if you personally wrote and/or did a proper audit of every piece of software your computer runs, and if it doesn't communicate with other computers.
Otherwise, it matters.
Computer architecture has historically assumed that you controlled your computer and the workload that ran on it.
Unix has historically assumed it was running on the University computer, and every single intelligent student was hell-bent on hacking it.
Large machines prior to the advent of Wintel faced similar levels of attempted assaults - by people who had detailed knowledge of the architecture - including schematics, and many years of assembler experience with the knowledge that National security was at risk (or possibly CISC :-).
The combination of developments that is Intel, MS, high level languages and the concept of a personal computer means that machines developed with the security needs of an Apple ][ are now able to exceed the throughput of a Beowulf cluster of Crays.
This took place without anyone thinking there might be a need to re-examine a few assumptions and review security consequences of incremental changes (or they did, and were told to keep their mouths shut).
Sadly it doesn't seem to be "just" a microcode error that could be mended in the microcode.
As for 64bit, AMD was indeed ahead of Intel but one could also point out that 64bit processors were available years before that.
The Wiki on AMD 64bit:
"The original specification, created by AMD and released in 2000, has been implemented by AMD, Intel and VIA. The AMD K8 processor was the first to implement the architecture; this was the first significant addition to the x86 architecture designed by a company other than Intel. Intel was forced to follow suit and introduced a modified NetBurst family which was fully software-compatible with AMD's design and specification."
PS AMD never copied Intel, they had to do a "clean room" to do the microcode themselves.
"PS AMD never copied Intel, they had to do a "clean room" to do the microcode themselves."
That's flat out wrong. :)
Back in the day various entities such as the "Defence" contractors and big vendors required a chip to have a 'second source' vendor. AMD entered into a licensing agreement with Intel to be the second source for x86 parts - thereby enabling Intel to tender for those contracts. At one stage AMD were literally given a set of masks by Intel, and AMD used them to punch out identical parts - so strictly speaking they did in fact copy the Intel parts, but quite legally as per their second sourcing agreements.
As time has gone by AMD did some tweaks (eg: faster 286s, 386s, 486s which inspired Intel to unleash the lawyers at various points). Eventually they rolled their own in house designs (K5,K6,Opteron et al) - on the back of those second source agreements. Intel & AMD have continued to spend money in court wrangling over those agreements - but I think that's been settled for a good few years now.
The Intel security folks must be glad to be at arm's length and McAfee again; all this PR spin and no substance makes a mockery of the company's security posture. Someone other than a hacker somewhere must have been able to see the potential for side channel attacks and remediate them in the core design, surely?
If they need a recall what about the KAISER bill?
I noticed there was no mention of the possibility this was an intentional act to boost product benchmark scores, compared to Intel's rival(s), by purposely avoiding a known processing bottleneck.
It's certainly not tin-foil hat territory to expect corporations to cut corners to garner a competitive edge over rival products, with the understanding that we'll admit to the insecurity later, after we've made $BILLIONS in profit, and roll out the fix.
Cynically, I have no doubt of shameful actions such as the above-mentioned, because rapacious corporations, C-level officers, and board members must absolve themselves of any human decency to maintain a position of privilege, and that supplants anything of value.
I noticed there was no mention of the possibility this was an intentional act to boost product benchmark scores, compared to Intel's rival(s), by purposely avoiding a known processing bottleneck. .... red03golf
Will the future cost and markets hit on Intel eclipse the $30 billion charge on VW for their dieselgate affair, for is not the game the same albeit it being played in another arena/industry/market?
"Will the future cost and markets hit on Intel eclipse the $30 billion charge on VW for their dieselgate affair"
That depends partly on whether proof can be found that this was a known vulnerability that was ignored, or just wasn't paid attention to in the rush for speed (willful harm vs [gross?] negligence).
Also, almost half the $30bn is in government fines. VW was in deep doo-doo because they circumvented government-mandated tests. Even if Intel did this on purpose to boost benchmarks, it's not any government-mandated benchmark, so they are unlikely to face such large government fines. On the other hand, if they are subject to class action lawsuits, going over $30bn is quite likely since it involves almost every chip they made in the last 10+ years, meaning in practice almost every Intel chip currently installed.
That also makes recall / replacement unfeasible because (a) they don't have any non-vulnerable chips to replace the bad ones and (b) even when (if?) they do, the scale would be too great to handle. Say 5 years production of chips just to replace faulty ones, meaning 5 years of zero revenue.
Normally that would be enough for Intel to go bankrupt but I suspect it will somehow be saved, firstly because US gov will definitely intervene but also because for all Intel's (many, many) faults, the chip hardware industry like any other needs competition. An industry with only AMD and ARM providing chips is very closed-shop. I'd much rather see Intel survive but knocked back a few pegs, to end up with 3 big manufacturers competing on a much more level footing than they are now.
CPUs used to plug in.
Now they are BGA. So patches, firewalls on servers and script blockers in browsers are the only overall solution.
Smart TVs: Don't connect to Internet, they mostly won't get updates.
IoT: You don't want them anyway.
The mind boggles as to how this will get sorted. Though Intel is a USA company, so maybe they'll mostly ignore it.
Years ago I thought "Out of Order Execution" was a performance boost at too high a cost: debugging and timing issues. It messes up a Real Time OS, where predictable speed/timing is more important than raw performance; "Out of Order Execution" causes jitter on physical I/O timing unless you have HW timer support and/or HW buffers with their own I/O CPU etc.
So far as a class-action is concerned, where are the damages? Operating systems are fudging over the problem and there is no proof (yet) anyone was actually compromised with this vulnerability, so the likelihood of a class-action proceeding without real damages seems slim.
Not saying Intel should not suffer in some manner for producing faulty kit, just that without any harm done there is nothing to claim in court.
"So far as a class-action is concerned, where are the damages?"
Damages are easy to show for commercial users: there's the cost of remediation, and the cost of any reduced performance (which would necessitate purchasing additional hardware to make up for it).
The bigger question, in my mind, is... was there negligence? Honest mistakes and design oversights happen, and aren't necessarily something that you can successfully sue over. If, however, Intel knew, or if they should have known, there was a problem and sold the chips anyway, then we're talking actionable negligence.
"Funnily enough, no one said the security flaws could be used to directly alter data. Instead of talking about what these exploits don't do, let's focus on what they make possible."
Get access to keys stored in the trusted zone? I am sure pirates will not be patching, so they can try to wander around areas of the security system not previously accessible by tricking the kernel into loading the pages they want.
Would the litigious media industry use 'operating by design' as an admission of fault?
My initial reaction was, is this article really necessary? Of course they're going to put a spin on things. But while your first translation was a little harsh, I felt, the others were merely brutally honest.
Security flaws are hard to anticipate, and processors are designed for performance first and foremost.
Now, if there were a way to turn off the security fix for software trusted not to be trying to exploit anything, performance hits might be reduced...
"Not to mention dishonest..."
I'd go further and call it lying. Corporate lying. The person writing the PR may have been writing honestly based on what they had been told, but at least some of the people providing the information either lied outright or lied by omission. Arse covering, plausible deniability, all adding up to corporate lying.
There are 3 known CVEs related to this issue in combination with Intel, AMD, and ARM architectures. Additional exploits for other architectures are also known to exist. These include IBM System Z, POWER8 (Big Endian and Little Endian), and POWER9 (Little Endian).
https://access.redhat.com/security/vulnerabilities/speculativeexecution
https://tenfourfox.blogspot.co.uk/2018/01/is-powerpc-susceptible-to-spectre-yep.html
PowerPC too apparently, although I imagine a much smaller install base. I have an old unaffected Atom based board here somewhere that could be used to run a pfSense based firewall or something but I think it contains the Intel Management Engine so err.. maybe not. Surprised this didn't surface as part of the Shadow Brokers releases as surely nation states know about this kind of thing already? Anyway whoever first shares some easily reusable exploit shell-code for 'testing' this spectre vulnerability to grab juicy secrets might be speculatively executed anyway!
Interesting. I wonder whether SPARC, MIPS, Loongson, Sunway, etc. are vulnerable to Spectre?
We keep forgetting that 1) all scripts and executables shall be executed without modification only from read-only storage, and 2) the read-only storage shall be modified only by a trusted configuration management process.
I'm surprised at the note in the article (and elsewhere according to an admittedly brief search) that game performance is not significantly affected by KPTI. Talking to hardware such as a storage controller or network card is mediated by the kernel; this is necessary for security and stability. And applications doing this are affected.
Game code, running on the CPU, needs to continually update the "world" the graphics card draws so, like an application heavy on storage or network I/O, I'd inductively expected games to frequently need kernel transitions and be highly affected. Can anyone enlighten me as to why this is apparently wrong? Is low level access to control the GPU provided into userspace, or are operations batched up to reduce the number of transitions, or..?
"Pure guess here - it may be just one single kernel transition to update the world per frame."
Yeah, that's the sort of thing I was thinking of by "batched up". I'm sure once (or even a dozen times) per frame would make KPTI utterly irrelevant from a performance perspective.
Once the connection to the framebuffer has been established, I'm pretty certain that the game then just talks directly to the video card so likely no Kernel transitions needed at all after the initialisation. Possibly, if the game needs to tear down the connection and re-init the display (eg: if graphics mode has been reset, or if new shaders need to be compiled) .. but generally these chores are done during setup and then left alone during gameplay.
Considering that a certain game I'm having trouble wrestling with performance-wise currently is so network-hungry that the network LEDs never go off and the (beyond miserable) current framerate is actually acknowledged to be server-code-limited (!!!), and also due to its minimum RAM requirements of 16GB it typically thrashes the page file so incessantly on anything with less RAM that the main suggested fix on forums is to move that to an SSD (until it melts right through it...) - I'm, erm, SLIGHTLY skeptical that "gamers" need not be concerned.
@dropBear:
Your description brings the game my middle child is all over right now to mind, and being that the game itself is a) Alpha phase b) crowdfunded and c) being built by an agile devops hipster high on matcha most of the time, I suspect that the issues will get worse before they get better. And the problems have zilch to do with this pair of bugs, although the bugs will make the game far worse, extend the development time, and increase the crowdfunding demand.
Well then, everyone seems to have a perfect grasp on which game it is - yeah, that one... :) And yes, I do know the mentioned issues are of course not caused by these bugs or their fixes, but I do believe the fixes would impact the already piss-poor performance which is why I mentioned it at all. Fingers crossed performance will get finally addressed at some point of course, but at this rate that's effectively the same thing as "never", and until I see effortless high performance in that game my concerns that disk/net access penalties _will_ have an impact remain...
Ditto the rather daft comment about joe casual user.
While he will not notice things slowing down (the CPU has grunt to spare), he will notice a 30%+ battery hit. Some of it from cost of IO (cpu burning more), some of it from cost of idling (going into idle and out now costs more too).
Those twats are responsible for this late night patch frenzy that my company and countless thousands more, are dealing with. Their excuse for releasing the details is thin at best. So now, instead of a small number of nefarious types speculating about these exploits, the whole world + dog knows and vendors were forced to release their patches out of band. Good job ass holes.
Those twats are responsible for this late night patch frenzy that my company and countless thousands more, are dealing with. Their excuse for releasing the details is thin at best.
Messenger, blamed, much?
You seem to be upset. Your company may exhibit panicpants syndrome. Maybe IT is not for you.
We find that "The team said in a blog post that it discovered the vulnerability in May 2017, and quickly notified Intel, AMD, and ARM."
Sounds like more than 90 days. And what is this "out of band release forcing" you are talking about?
You didn't have to have a patch frenzy last night. You could have carried on and hoped that it wasn't exploited - that's what you've been doing for the last 10 years anyway?
Security researchers already knew about the issue for at least 6 months, people were writing code into Linux that was visible in the source. There were murmurings from many people that something big was about to hit, and The Register had already reported the issue yesterday.
So what most decent sysadmins wanted after that was detail so they could mitigate as much as possible and manage the risk (would you have preferred that the only people who knew about it were security researchers of the "white hat", "grey hat", "government hat" and "black hat" brigades or would you prefer to have a patch that allowed you to get on with fixing it?). Therefore it looks like Google decided to provide that detail. If the patch is available then the vulnerability is known and you would need to have a patch frenzy anyway - if you're on about timing (having to do it at night) then lots of people somewhere will always have the patches landing in the evening or night - something to do with timezones.
If you live your world of security vulnerabilities by "in band" patch Tuesdays then I suggest your company has more problems than the details of a potentially game changing flaw being released.
The problem has been known, to some, for months. It's been known long enough that _Apple_, not known for rapid patching, already quietly patched macOS! Apparently it's only a partial patch, but it was out _before_ the news broke! And an alleged full patch is due with the next 'in band' update, currently in beta! If you boys are so slow that _Apple_ beats your ass to getting updates out, you've got other problems, mate!
Who on earth stores passwords and sensitive info in clear text, even in the kernel? There is always the chance that the kernel memory can be hacked or even just written out to the page file. .NET even has a type called SecureString that constantly decrypts and encrypts its contents and prevents the data from landing up in the page file. The data is kept in unencrypted form for the least amount of time possible. Performance is apparently utterly rubbish but that is the price you have to pay.
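Not the same API, but the goal SecureString is chasing can be sketched in C as well: pin the buffer so it never reaches the page file, and wipe it as soon as you're done. (A sketch of the idea, not hardened code; explicit_bzero() is a glibc 2.25+/BSD call, so on other libcs you'd need your own non-optimisable memset.)

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    char secret[64];

    /* Pin the pages holding the secret so they cannot be swapped to disk. */
    if (mlock(secret, sizeof secret) != 0)
        perror("mlock");

    snprintf(secret, sizeof secret, "%s", "hunter2");   /* ...use the secret... */

    /* Wipe it; explicit_bzero() won't be optimised away like plain memset() can be. */
    explicit_bzero(secret, sizeof secret);
    munlock(secret, sizeof secret);
    return 0;
}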
You mean what should be a sysadmin's normal, everyday job - keeping systems safe and running - is an exception?
I'm afraid too many look for a sysadmin job just to watch porn at work without being caught by firewall rules and proxy logs... that's probably why they accepted a contract without clear provisions for overtime and off-hours availability...
Just get it right? Something like "It's like a tiny leak in a cars radiator" or "It's like an MI6 office worker going to the pub and ordering drinks for James Bond Super Spy, and forgetting to keep it secret."
The mechanics aren't the necessary explanation; just the fact that data is in the wrong place at the wrong time. That tends to be a simpler explanation?
How about "The maker of your car's engine included a design flaw for the last ten years. They've been caught & admitted it. The fix will rob your engine of up to 30% of it's total power. In other words it won't go as fast, it will get worse mileage, & will have to work harder to do nearly everything. Other professionals in the industry have called it (insert the full length version of the acronym here), a total mouthfull, but it'll be better known by it's nickname of Fartwit."
It breaks it down to sound bites the public can absorb, gives it a descriptive & catchy nickname, & gives anyone whom wants it the ability to research for more info.
I just wish American airwaves weren't so strangled by prissy nannies so we could actualy say aloud on the telly "They call it Fuckwit due to the level of utter stupidity involved in the design flaw. Intel knew it was wrong, knew it would bite them on the ass once it was brought to light, & decided to do it anyway because it made their numbers look better than their competition. Only now they're actually *worse* than the competition once the fix is applied, because any speed gains the flaw gave them are entirely destroyed by the fix required to stop it from fucking the rest of us over. Fuckwittery indeed!"
*Cough*
Fuck it. my next computer chip is gonna be a Dorito!
Rob Enderle (remember him from SCO Days...????) was on Radio 5's 'Wake up to Money'.
What he said was IMHO, meaningless gobbledegook (aka Marketing Triplespeak).
From what I've read, only certain ARM CPUs are affected, but he implied that all ARM-powered phones were vulnerable.
I must remember to switch off entirely when his name is mentioned.
Anybody at all "on the BBC" is likely to display utter ignorance, otherwise they wouldn't be working for the BBC or invited to speak.
If, by some freak of chance, someone who does know what he is talking about gets on air, you can trust the BBC drones to cut them off before they get fairly started.
How very odd that some who be down voters here do not realise the perilous state of parlous play which occupies and governs the BBC.
Just try to find anywhere on their virtual systems where you can voice an opinion on what they are pushing ....... you know, something as simple and effective as a comments facility which as El Regers will know provides hidden gems of information which prevent crass arrogant and ignorant brainwashing of the masses for a crooked see on worlds their programs would be presenting.
Anybody here old enough to remember the BBC good old days/better beta times whenever fab forums were aplenty and The Great Debate ruled the news airwaves and discovered all manner of shenanigans and offered virtually anonymous remedies for next day/next week presentations ....... which is probably why the BBC www platform is now practically neutered ...... for one can't have anonymous beings leading the news with their undeniable views whenever corporations are so reliant upon secrets being hidden and tall tales being trailed?
The lament is still valid, nkuk, for service provided today is but a sad shadow of its phormer self, and let's not get started on the non state of political comment during the fact.
Anyone would think there was an active conspiracy to mislead and/or hide the real state of affairs from that and those who would tuning in to turn on.
Anyone would think there was an active conspiracy to mislead and/or hide the real state of affairs
Whatever you do, don't listen to French radio news then. Anything which might show France in a bad light never gets a mention. It's worse than US TV.
"The BBCs article on the subject does have a comments section"
Whatever you do don't go there! It's filled with brexit remainers of the far right and left persuasions and is generally devoid of all intelligent life.
The BBCs article on the subject does have a comments section:
The BBC have done several articles on the subject; not all have comments sections:
Jan 3: Major flaw in millions of Intel chips revealed - No comments section
Jan 4: Rush to fix 'serious' computer chip flaws - Has comments section
Jan 4: Intel, ARM and AMD chip scare: What you need to know - No comments section
Jan 4 - 4 hours ago: Meltdown and Spectre: How chip hacks work - No comments section
I think you will find that the majority of articles on the BBC don't have comments sections.
"Anybody at all "on the BBC" is likely to display utter ignorance, otherwise they wouldn't be working for the BBC or invited to speak."
There was a Reg guy interviewed on BBC R4 at lunchtime. He seemed to be struggling with explaining it in a way Joe Public would understand. He didn't sound too experienced at being interviewed, but then I suppose he's paid to be the interviewer, not the interviewee :-)
displayed utter ignorance
I've yet to see a mass-media[1] 'technology' journalist that actually knew anything about actual technology. As far as I can see, most of them are just breathlessly regurgitating manufacturer press-releases..
[1] El-Reg doesn't count as 'mass media'..
It goes a long way back. If the CPU has:
* a Memory Management Unit
* a memory cache
* a branch predictor
* Supervisor & User modes
It's highly likely to be affected. (and the last one might not even be required)
All of those features have been in CPUs for many, many years.
There are mixed signals because there are two different exploits. One affects Intel and not AMD, allows reading arbitrary kernel memory, is not difficult to exploit, and the fix has a performance hit which is larger the more often your CPU enters/leaves kernel mode (i.e. syscalls, driver interrupts, context switches, etc.) This is the one that hit the media in the last couple days.
The second one affects both Intel and AMD, allows a process to read its own memory and while it is very difficult to exploit the prospective fixes have almost no performance impact. Normally you wouldn't care if a process can read its own memory, except in cases where it is trying to run a sandbox - like a browser running Javascript. Rogue Javascript code on a web site could access the browser's memory, though with modern browsers using multiple processes that's less of a concern than it would have been a few years ago. Given how many browser exploits there are that can do much worse, this isn't bad for home users.
The problem with the second exploit is they aren't sure if it can actually be completely fixed, so it may be something we just have to live with. If so, we can always hope it will be the death knell for Javascript like Steve Jobs was the death knell for Flash!
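For anyone wondering what "a process reading its own memory" via speculation looks like, this is the bounds-check-bypass shape described in the published research, sketched in C with made-up array names, plus the fence-style fix now being added to compilers (the fence shown is the x86 one; ARM uses different barriers):

#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>            /* _mm_lfence (x86) */

uint8_t array1[16];
size_t  array1_size = 16;
uint8_t array2[256 * 4096];       /* probe array: one page per possible byte value */

/* Vulnerable shape: with a mistrained branch predictor and an out-of-bounds x,
   array1[x] is read speculatively and the dependent load leaves a
   value-dependent footprint in the cache via array2. */
uint8_t victim(size_t x) {
    if (x < array1_size)
        return array2[array1[x] * 4096];
    return 0;
}

/* Mitigated shape: the serialising fence stops the loads running ahead of the
   bounds check. */
uint8_t victim_fenced(size_t x) {
    if (x < array1_size) {
        _mm_lfence();
        return array2[array1[x] * 4096];
    }
    return 0;
}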
My memory may be a bit weak in the management areas due to lack of coffee, but AFAICR:
* a Memory Management Unit - everything after the 8086
* a memory cache - everything after the 80386
* a branch predictor - Probably Pentium 1 and up
* Supervisor & User modes - everything after the 8086
I think there is slightly more to the story than what you said. Specifically, the issue depends on how the MMU works, and how it is used.
I have not been involved in CPU design for over 30 years BUT:
I would not expect user mode code to have any way to be aware of the MMU's internal operation.
* The MMU should disallow access to all virtual pages not in use by the current task.
* Addresses in the current user address space not mapping to physical memory should map to either a virtual address saying "illegal access" or to one saying "You will need to swap me in before you can read me"
* there should be NO way to access physical memory that does not go via the MMU - not even for speculative instruction or data fetch.
The bug reports seem to describe noticing that a speculative fetch that goes unused causes a delay which can be used to identify the value of data FROM THE DELAY. I don't understand this. If the address named speculatively is not in cache, then how is fetching it speculatively justified?
Conclusion - This is not MMU - this is cache management - which SHOULD do a similar thing to what the MMU does BUT ISN'T DOING IT. The bug is (partly) that you can read data in the cache that is not yours. This is not really a risk UNLESS: There is some way to find out whose it is.
While MMU pages are normally 4k bytes, cache lines are more like 16 bytes. Fetching 16 bytes from "somewhere", with no way to find (or control) which page of whose address space they belong to is not a significant risk, although obviously undesirable. In normal circumstances, your next attempt to do this would probably fetch from a completely different page in a different task.
Clearly we are not being told the whole truth here.
It seems more like there is a way to FORCE the caching of other people's address spaces and make that visible to you. That gives you security on a level with a Commodore Pet. If so, then yes, Intel may have to replace every CPU since on-chip caching arrived (probably Pentium 1).
The bug reports seem to describe noticing that a speculative fetch that goes unused causes a delay which can be used to identify the value of data FROM THE DELAY. I don't understand this
The results of a speculative fetch of a bit can be used to decide which of two other addresses, both accessible to the normal user, is read from while still in the speculative execution branch. Whichever one is chosen will end up in the cache, even if the actual results are discarded.
Then, after the incorrect speculative branch work is discarded and the program control returns to the user, it reads from both those addresses and times the operations. The one that returns quickly is the one already in the cache, which is the value that was read during speculative execution. That, in turn, allows you to determine what the value of the speculatively-read bit was, even if you can't actually read that bit from the user program. Over time you cycle through the whole kernel address space, bit by bit.
That's my reading of the PoC, anyway.
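The user-space half of that, the flush-then-time step, is simple enough to sketch (x86-only; the speculative gadget itself is left as a placeholder here, since it is the part that differs between Meltdown and Spectre):

#include <stdint.h>
#include <x86intrin.h>            /* _mm_clflush, _mm_mfence, __rdtscp */

/* Two probe locations on separate cache lines. */
static uint8_t probe[2 * 64] __attribute__((aligned(64)));

static uint64_t time_read(volatile uint8_t *p) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;                     /* the access being timed */
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

/* 'gadget' stands in for the speculative code path described above; it is
   assumed to touch probe[secret_bit * 64] while executing speculatively. */
int recover_bit(void (*gadget)(uint8_t *probe_lines)) {
    _mm_clflush(&probe[0]);       /* evict both probe lines */
    _mm_clflush(&probe[64]);
    _mm_mfence();

    gadget(probe);                /* leaves one of the two lines in the cache */

    uint64_t t0 = time_read(&probe[0]);
    uint64_t t1 = time_read(&probe[64]);
    return t1 < t0;               /* the faster read is the cached line, i.e. the leaked bit */
}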
I realise that El Reg has skin in the game on this one, but there's so much stuff out there just waiting to be mauled. Uber's humblebrags offer a prime target; AirBnB emitted something that made my stomach heave yesterday … Disruption? Hah! All these outfits disrupt is my digestion.
"Was it written by AManfromMars under an alias?"
A single line comment by AMFM today on the previous article for this topic was very lucid.
Ve said "Just a coincidence, A Non e-mouse ........ not."
We made it go mainstream. Our Tuesday report was the basis of Bloomberg, Reuters, NYT, CNBC and BBC coverage - we were even cited and linked.
I dunno how many people saw your speculation pre-Tuesday but our articles this week are seven-figures in terms of page views.
C.
@Diodesign
Sorry if this is just me being an idiot, but from what I've seen, there was a planned release date for the flaw on the 9th which has been brought forwards due to articles released on El Reg earlier than that. Isn't making it go mainstream before this date kind of a bad thing?
Isn't making it go mainstream before this date kind of a bad thing?
Depends how you define "bad", and for whom, I guess. Their articles this week are seven-figures in terms of page views and they get to write snarky articles about Chipzilla, so I'd say it's working out pretty well over at Reg World HQ. Triples all round.
I enjoy watching Intel get stiffed for insecure design and PR waffle as much as the next person. I also respect The Register for holding the industry to account, and not just on this issue. But it'd be interesting to be told why they decided to publish early. Not that I expect we will be.
It was going mainstream anyway, The Register were just first to that party.
Basically, as soon as Linus revealed that there was a kernel patch that would have a notable performance penalty, the whole thing was going to be exposed.
Apple + MS reporting the same? Not so much, as it is closed source; but as Linux is open source, any changes are in the public domain as it were.
To the best of my knowledge, The Register didn't sign any NDA.
"Basically, as soon as Linus revealed that there was a kernel patch that would have a notable performance penalty, the whole thing was going to be exposed."
Not to mention the fact that as soon as a FOSS patch is released, the code is available for all to see and read, albeit in this case with the code comments redacted to make it a little more difficult. The various parties involved may have agreed to a dated embargo, but the source code can't be held back till then.
We asked Intel what was going on, twice, and had no response - not even a no comment, or an off-the-record explanation. We were certain with what we had - given the LKML discussions and information from other sources - so, why not warn the world that big changes are coming?
We offered no exploit code. Just a heads up that important alterations were being made to crucial bits of software. It's not our job to do companies' PR. We can't read minds.
And these changes were being done in the open, so any bad people paying attention could have known what we knew or more, and started exploiting it.
A lot of vendors hold us at arm's length, hoping we'll go away. We regularly get the silent treatment from various - but not all - companies. We're not going to sit on stories just because we get a no comment/no reply. Turned out this one was quite a big one. We had no idea it would be this big.
C.
@diodesign thanks for taking the time to give some more background. I was wondering if it'd turn out to be vendors not treating the news/enquiries with enough respect (in hindsight).
<Papa loads his shotgun, sighs, shakes head, and takes the PR department behind the woodshed>
>We made it go mainstream. Our Tuesday report was the basis of ... and BBC coverage
Having listened to the BBC Radio News interview with the El Reg reporter, I'd recommend that El Reg invest in some verbal communication and media skills training. The poor guy obviously wasn't used to talking to the media, but well done for standing up and doing the interview.
Intel and other technology companies have been made aware of new security research describing software analysis methods that, when used for malicious purposes, have the potential to improperly gather sensitive data from computing devices that are operating as designed.
Translation: When malware steals your stuff, your Intel chip is working as designed. Also, this is why our stock price fell. Please make other stock prices fall, thank you.
You did better, or well, on translation of other bits of canned statement further down, but the one above is simply a mistranslation, and not even that funny. I'm disappointed, because I expected better, i.e. BITING and SUBTLE humour, not cheap sarcasm. More a nibbling vulture, than a ripping pigeon...
'Chipzilla doesn't want you to know that every Intel processor since 1995 that implements out-of-order execution is potentially affected by Meltdown – except Itanium'
Well, yes. That's because Itanium DOESN'T implement out-of-order execution. That was its whole USP, that the compiler would do the instruction ordering not the CPU.
Keep seeing that gamers will be unaffected. But surely that only applies to local gaming? Any online multiplayer server will be hitting the I/O hard, and your own network card gets a fairly good workout on a 64 player server?
Bench-marking multiplayer modes is always difficult (as you cannot standardise the test) - and whilst local frame rates may not be impacted, latency will be?
While the Americans search around for some way to blame the Russians or Iran for this, I was thinking that there may be an upside:
As each app is carrying around a copy of the kernel this will use a lot more memory, so splitting them off should result in a single copy of the kernel, which should in theory reduce memory usage and help the cache - which should in itself speed up the OS, or at least compensate for the slowdown.
Any thoughts?
This is clearly a security flaw that many people could have spotted and just didn't. HOWEVER I do believe that this calls for a serious investigation. If there was internal research at Intel or ANY chip manufacturer that revealed this class of flaws and covered it up, this would prove legal liability.
This could be incompetence on the part of Intel. Sure. But all the information needed to spot this flaw has been out there for some time. The real question we should be asking is: did they know about it and intentionally cover it up FOR SELFISH REASONS. Withholding details of a flaw for a short time — until a fix, workaround, or other mitigation strategy is devised — is standard industry practice. But if they covered it up just to save their own asses, then they should pay the price.
> ...affect online/multiplayer gaming?
Unlikely. The additional delay (even under pessimistic assumptions), when added to every UDP packet, should not be noticeable. Properly programmed games should work smoothly even if the delay gets noticeable, being prepared for network latency.
This will hurt processes with _massive_ amounts of syscalls. Think "find /" and databases.
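A quick way to see that per-syscall cost for yourself, before and after patching (rough sketch; absolute numbers vary wildly with hardware and kernel, it's the before/after delta that matters):

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void) {
    const long iters = 1000000;
    struct timespec a, b;

    clock_gettime(CLOCK_MONOTONIC, &a);
    for (long i = 0; i < iters; i++)
        syscall(SYS_getpid);      /* one kernel entry/exit per iteration */
    clock_gettime(CLOCK_MONOTONIC, &b);

    double ns = (b.tv_sec - a.tv_sec) * 1e9 + (double)(b.tv_nsec - a.tv_nsec);
    printf("%.1f ns per syscall\n", ns / iters);
    return 0;
}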
Would using kernel bypass mechanisms like DPDK, Netmap, etc. mitigate the impact to networking-heavy applications to any extent? Or do the patch fixes need to be installed in all cases, and will still impact performance, regardless of whether you use a kernel bypass mechanism or not? This is mostly a Linux kernel question, just trying to understand the dependencies and whether any mitigation is possible...
You'll need to install the patches anyway just to be safe; but yes, kernel bypass would mitigate the effects of calling the kernel being made more expensive. I'm not sure how much effect kernel bypass would have on virtual machines like AWS though. Maybe someone here could comment, I'm interested in faffing about with it.
Thank you for filling in the blanks as to why my recently upgraded (to High Sierra) MacBook Air is now running like a dog and going through the battery in a couple of hours (this is a huge change from before the upgrade, when I could comfortably get 5/6 hours of editing using the Photos app, which was easy and quick to use).
Apple support are not interested. My MacBook Air is still under 1 year warranty so I can get a full refund on that (presumably?). Does anybody know if I will be able to get a full refund for an over 1 year old MacBook Pro also?
I think I might return both if possible - anybody tried this?
Apple support thread...
https://discussions.apple.com/message/32813718?start=30&tstart=0
OK - as a home user, here's a couple of data points for you to consider.
MS have issued the patch for Windows 10, which takes you from build .125 up to build .192.
I ran a Handbrake video conversion before and after, and also ran the Passmark test before and after. This was on my i7-3770.
Handbrake. Before: average FPS 168. Time taken - 18mins.
Handbrake. After: average FPS 167.5. Time taken 18 mins 20 seconds.
Passmark    Before    After
Total       3219.7    3228.7
CPU         8214      8224
2D          557       561
3D          3585      3605
Mem         1752      1758
Disk        2444      2409
So, the only thing that seems to have suffered is disk I/O and that by around 1.5%
YMMV - this is just what I found.
Well, I purchased mine from Apple only 6 months ago (decent spec), but I think it is the same basic model that has been in existence since 2015 or so- so maybe the impact on my machine is more than on other more recent machines. The performance hit that I've seen is app slowness, quick drainage of battery, and also excessively warm laptop on lap...
It really is a pile of sh*t now.
But that's what I'd expect for something like Handbrake which is not going to be hitting the kernel much. Written efficiently, it will load a chunk of video in from the disk, work through that, spit it out and then load another block. In the broad scheme of things, only that disk access is going to suffer, and that's (simplistically) once per block read or block write. It's not going to be hitting the disk on a frame-by-frame basis, and the kernel's not going to be used for converting the frames.
Thanks but how do you know? My new mac book air has become unusable overnight. An apple support thread indicates this is a common problem. No-one has had any response from Apple. How can I trust the benchmarks - who is producing them?
It may be a combination of the intel patch and also lack of care in the upgrade process (for OS and app) of course
I know by reading the literature. One of the key parts of confirming an issue is to only change one thing at a time, and you've upgraded your entire operating system. The likelihood is therefore that the OS upgrade is the cause of the issue.
To be sure, downgrade your OS to the initial release of Sierra and benchmark performance. Then upgrade to the latest Sierra patch and run exactly the same benchmark.
If performance at this point nosedives, complain to Apple. Otherwise upgrade to High Sierra. Run the benchmark again. Apply the latest High Sierra patch and run the benchmark again.
I strongly suspect upgrading to High Sierra will be your issue.
Oh look - you can only get the MS patches if your AV vendor stops making unsupported kernel calls, otherwise the patch will Blue Screen your machine. https://support.microsoft.com/en-us/help/4072699/important-information-regarding-the-windows-security-updates-released
Not to mention that the only patches available as of today are for Win 7, 8 and 10. No server patches at all. It's going to be a fun month.
> Oh look - you can only get the MS patches if your AV vendor stops making unsupported kernel calls,
> otherwise the patch will Blue Screen your machine.
Well, duh. You dig around in the kernel and call bits of it, your code is going to be very unstable. At least MS have done something so that the users won't unexpectedly be nuked (or at least no more unexpectedly than normal). It's probably rather hard to apply subsequent patches if your system keeps blatting itself because the AV program checks the subsequent patch...
From my days of teaching Software Quality Assurance, over 70% of bugs in shipping production code were built into the design at the beginning. IIRC the intent of methods like Extreme Programming was to help catch many of these design flaws by including representatives of the “customer” in the design team and using iterative design.
There is no reason to expect that the hugely complex chip design process is very different, even though they must of necessity be much more rigorous in their design process and re-use existing modules extensively. This latter allows each module to be debugged independently over time. But the interactions between modules that are a critical performance factor in modern chip designs must be extremely difficult to understand, much less account for in the higher level design process. And the chip design process today looks a lot more like software than hardware. Designers must depend on their CAD system (another beast of high complexity and its own bugs!) to correctly manage the low level interactions.
For regular software, the statistics show that if you are using reasonable design methodologies, in shipped production software there is roughly one bug in every 200 lines of code, regardless of the language. (The difference between low level and high level languages was strictly in the impact of a given bug, not the probability.) But about 10-15 years ago MS mentioned they were at about one in 70, I suspect due to their practice of hiring young hotshot SW people who had not learned defensive programming. And again, most of the remaining bugs in shipping code came from the design.
Perhaps most scary: less than 50% of the remaining bugs were likely to be discovered in black box testing.
The basic flaw seems to be that the processor allows the cache to be updated by out-of-order and speculative execution before knowing whether that path will be taken and committing the register updates. Seems like anything fetched by out-of-order and speculative execution should be held outside of the cache, and only committed to the cache (causing a cache update) when the path is known to be taken and the registers are committed.
Intel believes these exploits do not have the potential to corrupt, modify or delete data
The problem with spraying marketing bullshit at people familiar with expressions like "corrupt, modify or delete" is that we're not quite as stupid as the general consumer. We know bullshit when we see it -- possibly thanks to the past 25 years of this sort of crap.
Come on, Intel. Sir Humphrey would be embarrassed by this sort of crap. Don't insult us; we expect a much better class of bullshit than this.
In the wake of the Equifax ballsup a few months ago, some Directors of that company caught quite a bit of flak for the obviously innocent selling of stock at $149 just before it crashed to $109.
The Directors and other senior players in Intel are also honourable men and women. I doubt that any of them have cashed in on their stock in the period between the discovery of this mess and the publicity of the last days. I don't think this would happen.
First, Thanks to The Register for breaking this story ahead of the "coordinated disclosure."
Second, Thanks for the translation of Intels response, I needed that!
These kinds of things get me to the boiling point. The behemoth-corp that basks in self-fabricated awesomeness when presenting their latest and greatest, then reveals a revolting lack of spine and honesty towards their customers when something like this occurs. The very same customers they've built their entire wealth on. Taking a stance like the customers were their subjects or some cattle that just should get back into the fold. Not a hint of self criticism or acknowledgement of the manufacturer/customer relations that is the base of the entire operation! Show some dignity, it's good PR, I can guarantee you that!
Now, I'm not pretending to be knowledgeable in these matters, but calling this a bug seems to me like seeing King Kong hanging off a building and saying -Look, a bug. Pentium FDIV was a bug.
Lastly, the question (and I've seen this floated here and there in various forms): was this done intentionally?
This kind of flaw would be any surveillance agency's absolute nirvana, yes?
Since there is no means to patch the security-violating architecture used in Intel CPUs, the real question is: will Intel be held accountable for choosing to violate proper security design practices to gain a minute increase in CPU performance while literally endangering the entire world? Intel knowingly chose to violate proper security procedures.