Intel admits Skylakes can ... ... ... freeze in the middle of work

Intel has confirmed it's pushing out a BIOS fix for a bug that can freeze its Skylake processors. The good news is that the bug's triggered by complex workloads. It was turned up by prime number experts the Great Internet Mersenne Prime Search (GIMPS), who use Intel machines to identify and test new large prime numbers. A few …

  1. Anonymous Coward

    A 'bug' involving prime number maths eh?

    Guessing that Skylake's microcode backdoor tripped up on all the prime multiplication ops and ran out of space to store copies of P and Q on the management engine secret flash.

    1. Steve Davies 3 Silver badge
      Coat

      Re: A 'bug' involving prime number maths eh?

      Brings new meaning to getting your P's and Q's right.

      Mine's the one with a copy of 'Laplace Transforms (1st ed) by K.A. Stroud' in the pocket.

      1. Ken Moorhouse Silver badge

        Re: P's and Q's (or P's and S's)

        Interesting you should mention P's and Q's and Stroud. My recollection (correct me if my memory fails me) of Laplace Transforms is that everyone uses "s" except people doing engineering who use "p". (Stroud's books are geared more towards an Engineering readership).

        1. Steve Davies 3 Silver badge

          Re: P's and Q's (or P's and S's)

          Have an upvote.

          I would not have got my Mech Eng Degree (1975) without that book and the equally excellent 'Engineering Mathematics'. I still have my 2nd Ed of that esteemed work on my bookshelf.

        2. Martin an gof Silver badge

          Re: P's and Q's (or P's and S's)

          Ugh. Stroud. Had to wade through some of his books doing engineering in the 1980s too. The main problem we had was that the mathematicians all used "i" for complex numbers while we used "j" to avoid confusion with Amps. Our maths was taught by mathematicians from the maths department and things did occasionally get a bit confused. Not helped by the "clever" calculators just coming on the market as the decade turned with dot-matrix displays...

          M.
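          For what it's worth, Python sides with the engineers here: its complex literals are written with "j". A minimal sketch, with a made-up impedance value purely for illustration:

```python
import cmath

# Python writes the imaginary unit as j, the electrical-engineering convention,
# which leaves i free for current.
z = 3 + 4j  # hypothetical impedance in ohms: 3 resistive, 4 reactive

magnitude = abs(z)       # |z| = sqrt(3^2 + 4^2)
phase = cmath.phase(z)   # angle in radians

print(z, magnitude, phase)
```

          The mathematicians' "i" is simply not accepted as a literal suffix, so the confusion is at least settled at the parser level.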

          1. allthecoolshortnamesweretaken

            Re: P's and Q's (or P's and S's)

            Had similar problems around the same time - profs from the math faculty teaching would-be engineers. When worlds collide... However, my Casio somethinsomething (replaced by a HP48sx as soon as possible) and comrades Bronstein / Semendajew / Musiol / Mühlig helped a lot to get me through this.

            1. parperback parper

              Re: P's and Q's (or P's and S's)

              Of course if it was a programmer, and not a mathematician or an engineer, they would have meaningful variable names and not use letters as though they were rationed.

            2. Anonymous Coward

              Re: P's and Q's (or P's and S's)

              We had the same problem only years earlier. A Pure Maths teacher trying to teach to people who were very applied. Most had come back into education after often years in Industry. The Pure maths textbook was just gobbledegook to most of us.

              Enter 'Engineering Mathematics' and suddenly things became a lot clearer.

              The Maths lecturer soon grasped that he had to teach us differently to those studying Maths.

              One size did not fit all.

              This was at a Polytechnic. Now that they are all universities it is a lot harder for people to take that route.

              I know that I would not have had the career I did if I'd started out even in the mid 1980's.

            3. GrumpenKraut
              Thumb Up

              Re: P's and Q's (or P's and S's)

              Now have an upvote for mentioning Bronstein / Semendajew. Good stuff!

          2. Anonymous Coward

            Re: P's and Q's (or P's and S's)

            Beat me to it! I was going to mention i and j for complex numbers. I was taught math at college (BTEC Electrical Engineering) by an ex-chief in the Navy, so he used j. Bleedy good at math; he used to do sums in his head to 3 decimal places!

      2. MyffyW Silver badge

        Re: A 'bug' involving prime number maths eh?

        Ah - Stroud, or "Engineering Mathematics for Dummies" as it would probably be called now, got me through the seriously tedious world of partial differential equations.

        Always loved the tone in books 1 and 2, I think he slipped a bit in the Laplace Transforms tome and the Fourier Series book sent me catatonic.

    2. allthecoolshortnamesweretaken

      Re: A 'bug' involving prime number maths eh?

      Naw, this is just a fuckup. You give 'them' too much credit - if they were the evil geniuses you seem to think 'they' are, 'they' would have made damn sure anything involving prime numbers wouldn't trip anything undesired.

  2. 45RPM Silver badge

    Does the problem also occur with EFI or is it strictly a legacy BIOS issue?

    1. Anonymous Coward
      Coat

      Re : BIOS issue?

      It's NOT a BIOS issue... it's a processor's microcode issue...

      Your BIOS basically uploads a microcode update to the cpu and by updating the BIOS you update the version of the microcode it contains...

      1. Ilgaz

        Re: Re : BIOS issue?

        In reality it will be kernel developers like Microsoft, Linux who will update the microcode as people are rightfully afraid to update their BIOS.

        It is 2016 with 3-4 BIOS development companies at most, still no safe, documented standard to update BIOS.

        1. bcran

          Re: Re : BIOS issue?

          There's no documented standard? I guess you don't know about the Firmware Management Protocol for UEFI systems? https://blogs.intel.com/evangelists/2015/06/23/better-firmware-updates-in-linux-using-uefi-capsules/

      2. 45RPM Silver badge

        Re: Re : BIOS issue?

        @malle-herbert

        Of course it is. I'll get my brown paper bag. My parser is clearly a bit off this morning.

      3. Charles Manning
        Boffin

        Re: Re : BIOS issue?

        "Your BIOS basically uploads a microcode update to the cpu and by updating"

        Picky, but....

        I'm pretty sure the microcode is not so much upgraded as patched and that patch needs to be applied on every boot.

        The hassle with microcode patching is that it likely runs far slower than the native (hardwired) microcode it replaces (as a bit of a stretched metaphor, think software floating pt emulation rather than hw floating pt). That's likely going to make these processors suck for number crunching of the form that revealed the bug.

  3. Mike 125

    BIOS?

    I'm out of touch - the BIOS can now update CPU microcode? F'k me, is nothing sacred? Real security would appear to be an impossible dream with arrangements like this.

    It seems that the CPU makers are jumping on the same ship the OS makers have been on for years: "We push out the crap, and the customers find the bugs for free."

    1. psyq

      Re: BIOS?

      BIOSes (and UEFI firmwares) could update Intel CPU microcodes for >years<, ever since Pentium Pro.

      Threat to security of the microcode update process is quite low since the microcode update is checked by the CPU itself prior to update and will be rejected if the signature fails. Unless you get access to Intel's private key, you cannot do anything useful with the microcode BLOB, except to try sending it to the CPU and watch the process fail unless the BLOB is unchanged and newer than microcode currently "running" on the CPU.

      And, anyway, after a cold boot CPU just reverts to the original microcode which was stored on it during the specific stepping production process. BIOS/UEFI then updates the uCode with its latest version followed by the OS, which typically has the latest.

      This is not to say that Intel hasn't royally f*cked up with this bug, but there is no justification for blaming microcode update for being a security liability. It is not.
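      To make the acceptance rules described above concrete, here is a toy model. Everything in it is an assumption for illustration: the class, the key, and the blob layout are hypothetical, and an HMAC stands in for whatever signature scheme the vendor really uses, only so the sketch runs with the standard library.

```python
import hashlib
import hmac

# Everything here is a toy stand-in: real CPUs verify vendor-authenticated
# blobs in hardware, and the actual scheme and blob format are not public.
VENDOR_KEY = b"hypothetical-vendor-secret"


def sign(revision: int, payload: bytes) -> bytes:
    """Vendor-side tag over the revision number and the patch body."""
    message = revision.to_bytes(4, "big") + payload
    return hmac.new(VENDOR_KEY, message, hashlib.sha256).digest()


class ToyCpu:
    def __init__(self, factory_revision: int) -> None:
        self.factory_revision = factory_revision
        self.revision = factory_revision

    def load_update(self, revision: int, payload: bytes, tag: bytes) -> bool:
        """Accept a patch only if it is authentic and newer than what is running."""
        if not hmac.compare_digest(tag, sign(revision, payload)):
            return False  # tampered or unauthenticated blob
        if revision <= self.revision:
            return False  # not newer than the running microcode
        self.revision = revision
        return True

    def cold_boot(self) -> None:
        """Patches are volatile: a cold boot falls back to the factory microcode."""
        self.revision = self.factory_revision


cpu = ToyCpu(factory_revision=1)
firmware_blob = (2, b"bios patch")
os_blob = (3, b"os patch")

for rev, body in (firmware_blob, os_blob):  # firmware first, then the OS
    cpu.load_update(rev, body, sign(rev, body))
print(cpu.revision)
```

      The point of the sketch is only the ordering of checks and the lack of persistence; it says nothing about how strong the real check is.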

      1. Anonymous Coward

        Re: BIOS?

        "Unless you get access to Intel's private key, you cannot do anything useful with the microcode BLOB"

        Right, because as we know systems such as these are infallible.

        1. psyq

          Re: BIOS?

          I did not state the systems are infallible, but at least in Intel's and AMD's case (as far as we are talking about x86 world only), there are no known faults with the microcode update implementations. Absence of evidence is not evidence of absence, but so far, uCode has been proven safe and not because of lack of trying.

          I cannot say for sure, but I would place my bet that Intel >really< tested the microcode management part of the silicon. Maybe not primarily for the benefit of customer security, but because of their own business.

          Apart from the microcode used for bug fixing and, sometimes, implementing future instructions (such as AVX2 GATHER in Haswell) there is also a part which typical client almost never sees - stuff used for debugging and feature enabling/disabling / operating point control. With those, it is possible to have more control over the CPU compared to what the "normal" MSRs can do and enable facilities which are "not there" as per model information.

          And >that< is protected with the strongest cryptography, for the manufacturer's own sake. Basically, CPU is controlling itself in this case, and without passing signature checking it will refuse to do anything with the blob, and you cannot "force" it from the outside, other than either:

          a) Breaking whatever encryption Intel/AMD are using, which I am sure is the strongest available

          or

          b) Physically manipulating the CPU by cutting the package and doing in-silico modification. Let's forget the part where multi-million equipment is needed and such CPU would last only few hours, this method is hardly undetectable

          or

          c) Finding and exploiting a bug in the uCode validation procedure on the CPU. I doubt this is realistic, since such a procedure can be (and probably is) made with simple and mathematically provable code. Maybe this is way more likely than a), but I really doubt its existence.

          I do not know for sure, but I would be willing to bet that authentication and checking of the microcode and operating point protection is the most audited and checked part of the CPU design :-)

          If I am to think of ways to "exploit" Intel or AMD x86 CPU, uCode update would be very low on my list.

          1. patrickstar

            Re: BIOS?

            Old Intel microcode updates are just using some symmetric crypto. If you were really hell-bent on it and had a huge budget, you'd probably be able to grab the keys off the silicon.

            Old AMD microcode updates aren't even encrypted.

            Neither poses much, if any, security risk however as the size and abilities of the code are both VERY limited. You have a (low... 16 or so?) number of patch registers to override locations in the microcode ROM with some (small) amount of microcode instructions and that's it.

            Certainly nothing that can be used as a remote backdoor. At most, with an insane amount of work, you could - possibly - use it for a local ring 3 => ring 0 or VMX escape. So this is nothing like eg. ME or other truly evil backdoorable "features".

            And then there's the small issue of persistence. Microcode updates do not persist across resets - they are loaded by the BIOS and OS on each startup. If an attacker can modify BIOS/EFI or OS drivers, there are countless well-published ways to hose you that are undetectable from the running system, and none of them involves microcode updates.

      2. Mike 125

        Re: BIOS?

        @psyq

        >>Unless you get access to Intel's private key...

        This was the cry of the MiFare access card maker, until someone *got access to the private key*, (admittedly a whole different technology).

        It's always the pitiful cry.

        1. psyq

          Re: BIOS?

          There are ways to make this extremely hard and unlikely.

          Whether Intel adheres to such practices, I cannot say for sure. But considering that their multi billion dollar business literally depends on this, and the company does not have a lack of excellent security and engineering talent, I would say that the process is probably as secure as it could be.

          Microcode is a lousy place for hiding rogue code anyway, since it gets reverted to "factory" code after every reboot, so you have to exploit the system firmware or OS and update the microcode after every reboot, and if you already got there, you probably already have what you need, there is no need to invest millions into trying to "crack" a CPU and redo such job with every new process shrinkage / generation change.

          Also, microcode is almost certainly tied to the particular architecture, which means it changed at least every 2 years.

          It does not make too much sense from the cost/benefit point of view.

          I would be more worried about system firmware. Many of the modern PC / notebook motherboards are using a system firmware implementation which was at least based on TianoCore UEFI code. If somebody smart found a good hole there, it is likely that such a hole could be exploited on multiple generations of system boards, since a big part of that code is platform-independent and probably not touched too much by firmware vendors.

          Of course, since at least Haswell (and, I think, at least Sandy Bridge EP) platform, there are ways to prevent this by forcing hardware validation of the firmware images (making updating patched UEFI images impossible), but such things were rarely enforced and probably not even wanted in some parts of the home market where it was/is desired to be able to "patch" UEFI image for software piracy reasons.

          But, at least in theory, it would be possible to make this process very >very< hard by insisting on hardware validation which cannot be overridden in software at all. This is still less secure since with many OEMs there is a higher chance that a private key leaks, which is why I would prefer to have a system with a jumper which prevents >any< firmware update, but this seems too much to ask today :(

    2. Roo
      Windows

      Re: BIOS?

      "It seems that the CPU makers are jumping on the same ship the OS makers have been on for years: "We push out the crap, and the customers find the bugs for free.""

      Loading microcode at boot up isn't a new thing, one of the steps of booting a VAX-11/78x was loading the microcode... The microcode was even documented so you could cook up your own - and some people did. Note: VAXen weren't the only big iron boxes that loaded ucode at boot time. :)

      1. Roo
        Windows

        Re: BIOS?

        From "PALcode for Alpha microprocessors", published May 1996:

        "In some architectures, microcode handles these hardware functions, but the Alpha architecture is careful not to mandate the use of microcode for reasonable chip implementations."

        Sigh... I quite liked PALcode...

      2. Charles Manning

        VAX microcode

        VAX even changed its microcode with one or more of the OS upgrades. The new OS came on some tapes with a wee box containing the new microcode in EPROM.

        IIRC one of the Burroughs machines flipped microcode on the fly depending on the process executing. That allowed it to use different microcode (eg. different instruction sets), for, say COBOL vs FORTRAN programs. Pretty neat trick.

        1. Blue Pumpkin

          Re: VAX microcode

          I also remember there being a bug in the VAX hardware on the Venus machine (can't remember the number - 7300 or 7400 maybe) and some bright spark reprogrammed the microcode on-site to work around it.

          Plus some recollection of something called MEEP which basically converted an ICL 2900 machine (aka MU5 V2 :-) ) into a 1900 machine so that it could run GEORGE 3 on newer faster hardware with better performance than the native VME operating system.

      3. Eltonga
        Pint

        Re: BIOS?

        Actually, real big iron (IBM Mainframe) did that too :)

        A toast to good old big iron!

  4. David Austin

    At least it gets the right number eventually.

    That makes it better than the old skool Pentium FDIV bug.
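    For anyone who missed it the first time round, the FDIV check that did the rounds back then fits in a few lines. The operand pair below is the widely circulated one, not anything from this article; treat the sketch as a reminder of the idea rather than a diagnostic tool.

```python
# Widely circulated FDIV check values; a correct divider gives about 1.33382,
# while a flawed Pentium gave roughly 1.33374.
x = 4195835.0
y = 3145727.0

quotient = x / y
residue = x - quotient * y  # close to 0 on a correct FPU, about 256 on a flawed one

print(quotient, residue)
```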

  5. Anonymous Coward

    Can't remember doing 88 mph this morning...

    ...but it seems I've ended up in June 1994 today, and I will soon be receiving a notice from Intel detailing how to RMA my Pentium 90 because of an FDIV bug.

    1. Anonymous Coward

      Re: Can't remember doing 88 mph this morning...

      I think the difference is these days the maths capability of the CPU is just some software that gets loaded into the CPU. I remember this sort of thing from the 68060 and 68040 days, functionality removed from the CPU and loaded in as code.

      Transmeta took it to the max by making all of the x86 instruction set load into a CPU.

    2. Jay 2

      Re: Can't remember doing 88 mph this morning...

      I recall it was the 60 and 66MHz Pentium chips that had the problem. The 75 (which I had) and 90MHz should have been OK.

      1. 404

        Re: Can't remember doing 88 mph this morning...

        My memory must be failing - was thinking that happened around the time Computer Shopper had the headline 'Is The Pentium 166 All a Business Will Ever Need?'....

        Note to El Reg - there is not an icon for 'You kids get off my lawn'....

      2. Anonymous Coward

        Re: Can't remember doing 88 mph this morning...

        I don't think I ever had a P60, I was still on a 486-DX2 then. I'm pretty sure it was a 90 (I couldn't afford a 100, or didn't want to. This was also a time when rumours surfaced you could easily clock a 90 at 100 'if you had a good one'). The WIKI also mentions the relevant steppings, but as with all WIKI's, YMMV.

        1. heyrick Silver badge

          Re: Can't remember doing 88 mph this morning...

          I had a Compaq Presario with a 75 MHz Pentium and a little heatsink on it. I replaced it with a larger fan assisted one and fiddled some links on the motherboard to up the processor to 90MHz. No problems at all.

          Nice to see the art of drop dead simple overclocking is still alive with the Pi.

    3. EddieD

      Re: Can't remember doing 88 mph this morning...

      I am Pentius of the Borg. Division is futile, you will be approximated.

      If memory serves me.

    4. CrazyOldCatMan Silver badge

      Re: Can't remember doing 88 mph this morning...

      > Pentium 90 because of an FDIV bug

      I was working at MotRot at the time and battling our 'technical architect' about which processors to use - AMD or Intel. He eventually decided (in his infinite wisdom) that we would use Intel because 'they have never had a problem and you always know what you are getting'. The next day the news of the FDIV bug came out..

      Oh, how we laughed.

  6. hugo tyson
    Mushroom

    0xE40001

    is that exponent in hex. Interesting. One might wonder if it's all exponents of the form 0xNN0001 (NN > 0)

    1. Anonymous Coward

      Re: 0xE40001

      Falls at the first hurdle - the exponent is required to also be a prime to be a Mersenne prime. 0x110001 isn't prime (divisible by 3 for starters)
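      A quick way to see which exponents of the form 0xNN0001 could even be candidates is to filter them for primality. This is a minimal sketch using plain trial division (an assumption chosen for brevity; any primality test would do at this size):

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for 24-bit values like these."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


# Every value of the form 0xNN0001 with NN from 0x01 to 0xFF.
candidates = [(nn << 16) | 1 for nn in range(0x01, 0x100)]
prime_exponents = [hex(c) for c in candidates if is_prime(c)]

print(0x110001 % 3)  # remainder of the counter-example when divided by 3
print(len(prime_exponents), "of", len(candidates), "candidates are prime")
```

      Only the survivors in prime_exponents would be worth feeding to a Mersenne test at all, so "all exponents of that form" shrinks quite a bit before the CPU gets involved.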

  7. Alistair
    Windows

    Oh. F00F!

    *cough*

  8. Andy The Hat Silver badge

    FFT

    Bugger ... just looked up 'using FFT to multiply large numbers'.

    Why oh why do I allow my curiosity to get the better of me? It's like mathematical spam - enticing the reader in with promises of mathematical benefits then confuzzling them totally with a random splurge from the symbol font table.

    Instead of dangling a fat mathematical worm on a hook can El-Reg in future use such phrases as 'using a method', 'doing a mathematical thing' or 'magically'?
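    For the curious, the trick is less scary in code than in symbols: treat the digits as polynomial coefficients, transform both, multiply pointwise, transform back. The sketch below is a toy in base 10 with a recursive transform; it is an illustration of the idea under those assumptions, not how any real prime-hunting software is written.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 Cooley-Tukey transform; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers by convolving their decimal digits."""
    digits_a = [int(ch) for ch in reversed(str(a))]  # least significant first
    digits_b = [int(ch) for ch in reversed(str(b))]

    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    fa = fft(digits_a + [0] * (size - len(digits_a)))
    fb = fft(digits_b + [0] * (size - len(digits_b)))

    # Pointwise product in the frequency domain equals convolution of the digits.
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    coefficients = [round(c.real / size) for c in product]

    # The coefficients form a polynomial in 10; the weighted sum does the carrying.
    return sum(coeff * 10**i for i, coeff in enumerate(coefficients))


print(fft_multiply(123456789, 987654321) == 123456789 * 987654321)
```

    The payoff is that the transform costs roughly n log n operations instead of the n squared of schoolbook multiplication, which only matters once the numbers have millions of digits.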

    1. phil dude
      Boffin

      Re: FFT

      You'll be wanting these then.

      Vol 2 if I remember, "How fast can we multiply?"

      P.

      1. Andy The Hat Silver badge

        Re: FFT

        Donald Knuth? Haven't heard that name since I was keen!

        1. NotBob

          Re: FFT

          Obligatory reference to Mr. Knuth:

          http://xkcd.com/163/

        2. phil dude
          Thumb Up

          Re: FFT

          For those who like maths, and the computers that do maths, it is an insightful text.

          I'll bet it is not on any syllabus reading list....

          P.

          1. Michael Wojcik Silver badge

            Re: FFT

            I'll bet it is not on any syllabus reading list....

            You lose.

            I confess it took nearly five seconds with Google to find that.

  9. psyq

    We finally reached the stage...

    We had for a while with large consumer software: wait for the service pack 2 before buying.

    With Skylake, looks like we reached that phase with the CPUs as well, and the advice will be: "wait for the second / third stepping before buying, unless you want to be a beta tester".

    Anybody who has worked on the system/firmware level of modern CPUs, or has seen/worked with the BIOS writer's guide, knows that the software needed just to initialize the CPU up to OS boot time has become staggeringly complex, probably more so than entire operating systems of the late 90s.

    This is no defense of Intel; a CPU crapping out while doing hard math is simply inexcusable. But it does not surprise me one bit considering what modern CPUs have become: there are so many things (PCIe controllers, GPUs, complex power management, multiple levels of cache and ring buses shared by different on-die peripherals, memory controllers, etc.) which could go wrong when various "weird" conditions are created.

    Of course, Intel should have found out about this on their own during the engineering phase, but seeing this just shows why the server CPUs need at least 1+ year of quality assurance in order to be "releasable" - maybe that is actually what the minimum quality should be, and what we are seeing in consumer space is just degradation of quality even in the CPU itself :(

    I am not optimistic in this regard, and to me it looks like we will be seeing worse - TSX fiasco was not just a fluke, these things are becoming more complex and market wants them sooner.

    In this particular case, the HPC and financial industries were just lucky the bug showed up before the Skylake EP/EX platform was launched. I guess they would not be amused if their brand new Xeon chips crapped out while doing heavy math.

    1. heyrick Silver badge

      Re: We finally reached the stage...

      "when various "weird" conditions are created"

      Doing complicated maths is "weird"?

  10. Gene Cash Silver badge

    Intel CPU microcode is shit to start with

    My i7-3770 will hang in about an hour after boot running anything, if you don't load microcode updates.

    I got the new motherboard and I was running Debian without the intel-microcode package which updates the microcode at boot (since the BIOS is bypassed and GRUB doesn't do it) so it was running what was on the chip.

    It kept mysteriously hanging, except an identical board/ram/cpu did the same, so it wasn't bad hardware. I installed intel-microcode & iucode-tool and then it was all fine.

    So apparently Intel CPU microcode is shit to start with.
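    If you want to confirm which microcode revision is actually running before and after installing the package, x86 Linux kernels report it in /proc/cpuinfo. A minimal sketch, assuming the usual "microcode : 0x..." lines are present (the function name is made up for this example):

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set:
    """Collect the distinct 'microcode' values from /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # only present on Linux
    print(microcode_revisions(cpuinfo.read_text()))
```

    More than one value in the result would suggest that not every core received the same update.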

  11. ici.chacal

    Okay, so...

    I have a Skylake CPU, so what do I need to do, if anything..? Hope ASUS puts out a BIOS update for my motherboard..?

    1. Charles Manning

      Re: Okay, so...

      If it is a private computer and you live in a country with good consumer guarantees, you might be able to send it back.

  12. PNGuinn
    Joke

    Light bulb moment ...

    So - How many prime Intel engineers does it take to change a light bulb?

    Enquiring minds etc ...

    1. Michael Wojcik Silver badge

      Re: Light bulb moment ...

      So we have FDIV in 1994, and this bug 21 years later.

      They're doing better than the software folks.

  13. Phil Kingston

    Seriously, how do I get a job getting PCs to calculate large prime numbers? Seems a pretty cushty deal.
