Amazon: Intel Meltdown patch will slow down your AWS EC2 server

Amazon AWS customers have complained of noticeable slowdowns on their cloud server instances – following the deployment of a security patch to counter the Intel processor design flaw dubbed Meltdown. Punters said that, since AWS shored up its infrastructure, and began rolling out its Meltdown-patched Linux in December, they …

  1. bombastic bob Silver badge
    Devil

    maybe it's time to re-consider server-side inefficiency

    maybe it's time to re-consider server-side inefficiency. that is, instead of bloating your server side with massive libraries consisting of scripted and interpreted lingos (say 'Python' and 'Javascript'), to INSTEAD go with C language utilities and CGI-based things for otherwise CPU-intensive processes.

    Yeah, that's a major infrastructure change, if you've invested a LOT of time in NodeJS or DJango.

    Additionally, if you're using an SQL database, you might want to consider an "efficiency re-architecture" to limit CPU utilization and I/O calls. As an example, check your 'outer join' logic to see if you're linking tables together in the primary filter query, specifically things that don't need to be linked until later. Even a 'select for update' could start with a filter that doesn't require linking any additional tables, if you design your database intelligently (and with efficiency in mind).

    [this would prevent a boatload of unnecessary I/O or networking system calls, where the "fixes" for Meltdown would impact performance the most]
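
    Something along these lines, as a rough sketch (schema and names are invented for illustration, and a decent query planner may well do this for you anyway, but the structure makes the intent explicit):

    ```python
    # Filter on the driving table first; join the lookup tables only to the
    # rows that survive, instead of dragging every table into the primary
    # filter query.  Uses sqlite3 so it runs anywhere.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE addresses (customer_id INTEGER, city TEXT);
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO addresses VALUES (1, 'London'), (2, 'Reading');
        INSERT INTO orders VALUES (10, 1, 'PENDING'), (11, 2, 'SHIPPED');
    """)

    # Heavier: every lookup table is joined before the filter is applied.
    joined_then_filtered = """
        SELECT o.id, c.name, a.city
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN addresses a ON a.customer_id = c.id
        WHERE o.status = 'PENDING'
    """

    # Lighter: filter the driving table on its own, then join only what's left.
    filtered_then_joined = """
        SELECT f.id, c.name, a.city
        FROM (SELECT id, customer_id FROM orders WHERE status = 'PENDING') f
        JOIN customers c ON c.id = f.customer_id
        JOIN addresses a ON a.customer_id = f.customer_id
    """

    print(conn.execute(joined_then_filtered).fetchall())
    print(conn.execute(filtered_then_joined).fetchall())
    ```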

    I've seen enough gross examples of lazy server-side code [and been tasked to fix it] for one lifetime, probably. But I'm sure it won't be the last. "Blame where blame belongs" for inefficient server-side code, because someone 'felt' that efficiency wasn't an issue. until now.

    1. JLV

      Re: maybe it's time to re-consider server-side inefficiency

      yeah, cuz coding websites in C will make things waaay more secure...

      And, are Django/JS even in the crosshairs of the heavy KPTI CPU losses? If they're doing I/O, would that I/O not happen pretty much the same way as a C program's would?

      P.S. NO RANDOM CAPS????? 2018 RESOLUTION???

      1. Doctor Syntax Silver badge

        Re: maybe it's time to re-consider server-side inefficiency

        "And, are Django/JS even in the crosshairs of the heavy KPTI CPU losses?"

        I think BB's point was that if you were losing performance from Meltdown mitigation you might be able to reclaim it elsewhere by optimising userland.

        1. JLV

          Re: maybe it's time to re-consider server-side inefficiency

          I get that, and I agree with him that many people run sloppy systems. But if you're competent, you've already optimized the low-hanging fruits in your stack.

          And if you're not, C/C++ are unlikely to work well for you ;-)

          No one really seems to know how the Meltdown mitigation approaches will perform:

          - after the various OSs have been tuned for them

          - for different profiles of real-world workloads: web, CPU-heavy, network, file I/O... We have synthetic kernel-only benchmarks at -50-60% efficiency and ~0% for games, so it's hard to tell where we'll all end up.

          - what the Meltdown risk profiles are for paranoids (cloud providers, who have to be) vs trusted platforms (systems where you know what you're running, and where Meltdown malware running at all already means something else in your defences has failed).

          - lastly, after Intel has pulled their thumb out of their rectum and fixed their next-iteration silicon.

          Yet, BB already recommends C-based CGI. For real.

          1. Adam 52 Silver badge

            Re: maybe it's time to re-consider server-side inefficiency

            I write C++ REST services. Not for any performance reason but because I've never got on with all the Node/Java web development frameworks. As a bonus, my memory footprint comes in at a still slightly porky 24MB, but much better than the Dropwizard-on-Java versions the rest of the team make, which need a t2.micro (1GB) even to start and a t2.small (2GB) to do anything.

            1. JLV

              Re: maybe it's time to re-consider server-side inefficiency

              C++ REST? Respect.

              Node and JS are weird, but sometimes you try something really clever and you go "haha, I thought that might work" and it does. There's a clever language hiding somewhere in there. The main issue, to me, is JS's atrocious support for early error checking and exceptions and never quite knowing what type of object you are looking at, or what 'this' might refer to. That, and client-side JS is hard to unit test efficiently unless you really know what you're doing (I don't).

              For me, there's no overriding reason to code a backend in JS, its quirks are just not worth it. But it's not quite as clueless as people make it out to be. For frontend ES6+Vue are lightyears away from Java+Swing.

              Java has way too much boilerplate and ceremony and is way too opinionated and enterprise/framework heavy to my taste. I'd rather code in almost anything else though I understand it somewhat. Applies 100x to J2EE.

              I've dabbled in C and like it. I'd love to look at Rust, Swift, maybe Go when I get a reason to. Surely 50 yrs after C and 30(?) after C++ we must have come up with better system languages than Java.

              1. Destroy All Monsters Silver badge
                Windows

                Re: maybe it's time to re-consider server-side inefficiency

                For frontend ES6+Vue are lightyears away from Java+Swing

                Swing has been on its way out for some time now. Please use JavaFX for your GUIs. Maybe Griffon.

                As for ES6 being light-years ahead of anything...

                I dunno. This industry is full of bad opinions.

            2. Destroy All Monsters Silver badge

              Re: maybe it's time to re-consider server-side inefficiency

              I write C++ REST services.

              Well, I hope they don't look all that different from the Java ones (except for the calls to "free", something that really should be out of the mind when doing REST services) otherwise you ain't using the libraries right.

              1. Das Schaf

                Re: maybe it's time to re-consider server-side inefficiency

                @Destroy All Monsters

                Maybe he is doing his dynamic memory allocation in C++11 with smart pointers, preferably unique_ptr<>, so that explicit calls to "free" aren't needed and object lifetimes are managed automatically. C++ isn't Java, and the class designs may well be very different depending on the idioms used: e.g. the PIMPL idiom is sometimes useful in C++ where binary compatibility between versions is a requirement, something that doesn't apply to a Java implementation.

            3. bombastic bob Silver badge
              Devil

              Re: maybe it's time to re-consider server-side inefficiency

              "I write C++ REST services. Not for any performance reason but because I've never got on with all the Node/Java web development frameworks"

              and that's the kind of thing I'm talking about.

              as an extra added bonus, I did some work for a company with a poorly designed back-end, by adding C utilities that are called from the DJango framework. Upload processes that WERE taking more than a minute (due to cpu-intensive activity) were shaved down to a few seconds.

              But I didn't re-do all of the DJango stuff. That wouldn't have bought much of a performance change. What I _did_ do made a HUGE difference, mostly because I saw a lot of "the Python way" coded into the back-end. It implied, to me, that "the established way of doing things" is simply GROSSLY inefficient, and re-coding that stuff in something that _is_ efficient (CGI via C programs, or even Perl if it's simple enough) would buy you a HUGE performance boost (in many cases where CPU intensive operations were slowed down by anti-Meltdown patches).
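
              For what it's worth, the wiring for that kind of thing can be minimal. A sketch of the pattern (all names here are hypothetical, and whether it wins depends on the work being genuinely CPU-bound, since spawning a process adds its own syscalls):

              ```python
              # Hand the CPU-heavy step of an upload to a small compiled helper instead
              # of doing it in Python inside the request.  './crunch' is a hypothetical
              # C utility that reads the upload on stdin and writes a summary on stdout;
              # Django still handles routing, auth and the response.
              import subprocess
              from django.http import JsonResponse

              def process_upload(request):
                  raw = request.FILES["datafile"].read()
                  result = subprocess.run(
                      ["./crunch"], input=raw, capture_output=True, check=True
                  )
                  return JsonResponse({"summary": result.stdout.decode("utf-8")})
              ```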

              If it's not CPU intensive, you probably wouldn't see a change (yeah). So for THAT, who cares.

              also I've written simple web servers in C a few times. One of them was an attempt to "genericize" something to fit on an Arduino - yes, a somewhat generic web server in under 30k of NVRAM, intended to let you configure an IoT device with a web page. downside: you can't really do anything with the Arduino because the web server code is still a bit too piggy, so I shelved it... [had to try it anyway in case it worked]. but I was able to change device parameters and store them in EEPROM (things like the IP address, fixed or dynamic) so there ya go.

              THAT being said, to *ME*, 'C' coding is probably faster (in a significant number of cases) than doing a bunch of stuff with BLOATWARE, with 3rd party library hell, just to fit it all into "their way of doing things".

              And it's that "3rd party library hell bloatware" that's slowing down the back ends WAY too much already, I bet.

              (and with dis-respect to the 'random caps resolution' comment from earlier: BITE ME)

        2. Kabukiwookie
          Flame

          Re: maybe it's time to re-consider server-side inefficiency

          I think BB's point was that if you were losing performance from Meltdown mitigation you might be able to "reclaim it elsewhere by optimising userland."

          You mean like replacing shitty devs that can't write efficient SQL if their life depended on it. That will never happen, because that other dev who can do that actually costs $$. You can have 3 shitty devs for the price of one good one.

          This is not a technical issue, this is a management issue (or a lack of having a competent one).

          1. mathew42
            Flame

            Re: maybe it's time to re-consider server-side inefficiency

            Hear. Hear!

            I've spent the last week looking at some horrible SQL stored procedures with needless outer joins, excessive use of views, temporary tables, and updates to individual fields in those temporary tables. Unfortunately I don't have the original requirements and I'm doubtful that the code was bug free. I'm not looking forward to UAT.

          2. wolfetone Silver badge

            Re: maybe it's time to re-consider server-side inefficiency

            "I think BB's point was that if you were losing performance from Meltdown mitigation you might be able to "reclaim it elsewhere by optimising userland."

            You mean like replacing shitty devs that can't write efficient SQL if their life depended on it. That will never happen, because that other dev who can do that actually costs $$. You can have 3 shitty devs for the price of one good one.

            This is not a technical issue, this is a management issue (or a lack of having a competent one)."

            This is how I interpreted his response. Although for different reasons than you've pointed out.

            There is a tendency now, as we're in a "golden age" of server performance, to allow sloppy coding and/or coding decisions to be made, which are hidden by the super duper speed of the server the application/code is running on.

            Back when I started in web development I was still on dial up and I didn't know many people who had ADSL or even ISDN. So when I coded websites they were optimised to load as quickly as possible on poor internet connections. Now though, while I still work in that mentality, there are developers who are either fresh out of university or get hard-ons over the next new cool thing, who load sites and applications with bloated shit that doesn't do anything more than make something pop up from the bottom of the screen. They've never had to develop for slower connections, or they simply assume everyone is on an internet connection fast enough to handle the bloat they're adding.

            But if you say that in a public place, like BB and myself did, those voting down who are offended the most are the ones most guilty of such crimes.

            1. bombastic bob Silver badge
              Devil

              Re: maybe it's time to re-consider server-side inefficiency

              "those voting down who are offended the most are the ones most guilty of such crimes."

              heh - yeah, I don't even consider the down-voting any more. howler monkeys and my personal fan club, mostly. except when I don't get ENOUGH of them (so I'll JAM this up with MORE caps-lock ha ha ha to see how many I can get from the fans)

          3. Anonymous Coward
            Anonymous Coward

            Re: maybe it's time to re-consider server-side inefficiency

            "You mean like replacing shitty devs that can't write efficient SQL if their life depended on it. That will never happen, because that other dev who can do that actually costs $$. You can have 3 shitty devs for the price of one good one."

            Like those that use Entity Framework's Fluent to create their tables instead of designing their databases properly in SQL?

            Been examining a database that was generated using Fluent.

            All text fields were NVARCHAR(MAX)... /facepalm

            They never put in maximum string length or set IsUnicode = False

            1. Nick Ryan Silver badge

              Re: maybe it's time to re-consider server-side inefficiency

              Like those that use Entity Framework's Fluent to create their tables instead of designing their databases properly in SQL?

              Been examining a database that was generated using Fluent.

              All text fields were NVARCHAR(MAX)... /facepalm

              They never put in maximum string length or set IsUnicode = False

              I have a few of those applications; the gem, however, is one that was also created with no referential integrity at all. Oh, and multiple discrete databases because, erm, just because, OK?

      2. OldSoCalCoder

        Re: maybe it's time to re-consider server-side inefficiency

        OP isn't saying coding in C is more secure. He's saying C is more efficient and has less overhead than a scripted, interpreted language that loads a lot of unused functions. How many js coders say 'let's throw this in there because we may need it later'? You don't think the interpreter unpacks this shit? You don't think there's overhead involved in this?

    2. Anonymous Coward
      Anonymous Coward

      Re: maybe it's time to re-consider server-side inefficiency

      Azure VMs and SQL don't seem to have slowed measurably for us after patching for this.

      1. Anonymous Coward
        Anonymous Coward

        Re: maybe it's time to re-consider server-side inefficiency

        "Azure VMs and SQL don't seem to have slowed measurably for us after patching for this."

        Don't seem to have slowed measurably..? If it's measured then there's no need for speculation (whether an impact was demonstrated or not). Fishy comment smells fishy.

        1. Anonymous Coward
          Anonymous Coward

          Re: maybe it's time to re-consider server-side inefficiency

          "Don't seem to have slowed measurably..?"

          Well, a number of complex batch jobs and processes so far take the same time to run after patching as they did before.

      2. Anonymous Coward
        Anonymous Coward

        Re: maybe it's time to re-consider server-side inefficiency

        "Azure VMs and SQL don't seem to have slowed measurably for us after patching for this."

        I'm sure they are over on Bing Cloud.

    3. Adam 1

      Re: maybe it's time to re-consider server-side inefficiency

      @BB, I think that I understand what you're saying. That you may be able to compensate for double digit performance losses by being a bit more careful with the design.

      Believe me when I say that this is exactly the sort of tail chasing that software engineers the world over are trying to do to limit the side effects of these OS level (software based) workarounds. But there is only an incredibly small window of time to analyse the optimisation opportunities, design something to fit, get it tested, and publish it for customers (who will want to do their own UAT before going live).

      And understand that the typical RDBMS is in the ballpark for a 10%-20% performance hit. If your typical load was 75% capacity, what you would have called well planned capacity last week will suddenly be 90%+ utilisation and in the real danger zone. Remember also that it is an exponential problem. If your transaction takes 12% longer, the lock contention is statistically much more likely to hit another transaction (think the birthday paradox). Then once you start getting a tipping point of deadlocks, even the retries cause problems.
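
      A back-of-envelope illustration of that non-linearity (this assumes a simple M/M/1-style queue, which a real RDBMS is not, so treat the numbers as directional only):

      ```python
      # At 75% utilisation the queueing delay multiplier 1/(1 - rho) is 4.0;
      # a 10-20% per-transaction slowdown pushes it far higher.
      for hit in (0.10, 0.15, 0.20):
          rho = 0.75 * (1 + hit)            # utilisation after the slowdown
          wait = 1.0 / (1.0 - rho)          # M/M/1 delay multiplier
          print(f"{hit:.0%} slower -> {rho:.0%} utilisation, delay x{wait:.1f} (was x4.0)")
      ```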

      As much as I would like to think optimising gives an answer, for many it is going to be more $$ for bigger AWS or Azure plans or bringing forward capex.

    4. FuzzyWuzzys

      Re: maybe it's time to re-consider server-side inefficiency

      I'm sure there are some very good efficiencies to be made using C/C++ when the code is well optimized. There's the rub though. You may well be a shit-hot coder, but for every person like you there are a thousand who can barely spell memory-leak, let alone understand how to code to ensure leaks never happen!

    5. hititzombisi

      Re: maybe it's time to re-consider server-side inefficiency

      All of that containerization will also take its toll. I can't believe a Docker server won't be hit harder than a bare app. Every kernel network call jumping across the barrier between Docker and the rest of the world will be an additional cost.

      1. OldSoCalCoder

        Re: maybe it's time to re-consider server-side inefficiency

        The Meltdown paper I read specifically mentions Docker as being exploitable.

    6. thames

      Re: maybe it's time to re-consider server-side inefficiency

      I don't think that re-writing your web application in C is going to help much. It's system calls which are slowed down, which in this context means mainly I/O intensive tasks. The big hits will likely be in the database and the web server itself, both of which will already be written in C or C++. They are also often running on different servers from the application processes anyway. If it wasn't financially worthwhile to write your web application in C before all this happened, it won't be now.

      You would probably be better off looking for ways to decrease the number of times you have to hit the database and to reduce the size and number of separate files in your web pages. In this type of optimisation having a language which allows for faster development would be an advantage.

      With Python by the way, the really CPU intensive libraries tend to be written in C to begin with. If not, there are often 'C' versions available. One thing that Python does really well is interface with C libraries. If you do want to look for CPU optimisations in this area, then looking at libraries is a good place to start and lets you have the best of both worlds.
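
      As a small illustration of how cheaply Python talks to C (this assumes a Unix-ish system where libm can be found; numpy and friends do the same thing on a much bigger scale):

      ```python
      # Call a C math routine directly via ctypes - no extension module needed.
      import ctypes
      import ctypes.util

      libm = ctypes.CDLL(ctypes.util.find_library("m"))
      libm.sqrt.restype = ctypes.c_double
      libm.sqrt.argtypes = [ctypes.c_double]

      print(libm.sqrt(2.0))   # 1.4142135623730951, computed in C
      ```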

      As for switching back to CGI applications, given how those work I wouldn't want to guess how they are affected by these CPU bugs without some actual testing. They have a lot of per-call system related start up overhead, which may get a lot worse with the new Intel fixes and completely overwhelm any actual processing they do internally. That overhead is of course why we moved away from them to begin with.

    7. karlkarl Silver badge

      Re: maybe it's time to re-consider server-side inefficiency

      This is painfully true. If the webbers stopped making terrible tech choices full of bloat and sh*t, we wouldn't need crap like Docker, npm etc, etc, etc... just to manage their retarded dependency sprawl.

      This has the benefit that crap won't randomly stop working in the future if someone tweaks an npm string-faffing library. You will gain an almost 100% deterministic end product.

      As for safety in C: it can be done with virtual memory (mmap, mprotect), and in C++ we have had RAII and smart pointers for many, many years now, reducing the need for "VM languages" such as Java and C#. Worst case scenario, compile it up with cl /clr:safe or em++ and you can run it on the .NET VM / node.js.

      https://learnbchs.org

      Something like this could have been the perfect solution if webbers had stopped being crap over 10 years ago and learned their trade properly.

  2. JLV

    where do database queries sit in the %CPU loss continuum?

    I've heard of gaming, kernel-heavy, I/O heavy, etc... but what about RDBMSs? Or even NoSQL? Are they doing the kind of things that trigger that slowdown?

    1. Voland's right hand Silver badge

      Re: where do database queries sit in the %CPU loss continuum?

      They are all in a world of hurt.

      1. They write to disk synchronously in order to maintain integrity. There is little or no IO merging.

      2. You talk to them via the network (even if it's local, in which case it's just a Unix domain socket).

      You are looking at 10s of syscalls per SQL DB query with the corresponding penalty.

      1. JLV

        Re: where do database queries sit in the %CPU loss continuum?

        heh, txs. that's what I was worrying about :(

        you know, I'm starting to wonder if a big beneficiary of this whole mess might not, under the right conditions, end up being Intel.

        Think of it - a lot of potentially somewhat risky/obsolete X86 silicon. Spectre isn't even really Intel/X86-only (or maybe it's Meltdown). This stuff will have to be replaced and who's the dominant vendor? Yes....

      2. bombastic bob Silver badge
        Pint

        Re: where do database queries sit in the %CPU loss continuum?

        "You are looking at 10s of syscalls per SQL DB query with the corresponding penalty."

        I would have hinted at something similar, but I think your summary was good enough, and to the point.

        Beer, sir!

      3. Mike Timbers

        Re: where do database queries sit in the %CPU loss continuum?

        So more money for Larry then?

    2. Destroy All Monsters Silver badge
      Windows

      Re: where do database queries sit in the %CPU loss continuum?

      RDBMSs? Or even NoSQL? Are they doing the kind of things that trigger that slowdown?

      If it's not slowing down, it's not a database. (It may still not be a database even if there IS slowdown ... right, MongoDB?)

      Processes with little I/O on a machine with few interrupts (not many processes) will be fine. Weather simulations, that sort of stuff.

  3. Marco Fontani

    You might want to consider the Rules Of Optimization Club. They were written with Perl in mind, but I'm sure they apply in most cases. Repeated / amended below:

    • Don't optimize.
    • Don't optimize without measuring.
    • If your app is running faster than the underlying transport protocol, the optimization is over.
    • One factor at a time.

    Or, in other words, "make it work, make it correct, and only then make it fast". I'd much rather have slow code which is correct than very fast code which gives the wrong answer, performs the wrong calculation, or wreaks havoc.

    As to the language of choice, I'm biased as this site's mainly Perl-based… but one surely has to consider the speed and ease at which things can be developed, and not only whether a few milliseconds can be shaved here and there. If gaining speed means having harder-to-read code, it might not be the best trade-off. It's not always best to optimize for development time rather than runtime – but it often is.

    Just my 2c.

    1. Duncan Macdonald

      True - But

      Optimization done carefully can often get better than order-of-magnitude improvements. Many years ago (on Oracle 7!!!) we had an isolated batch job that had to do calculations over a 3 day period of a database holding several years of data. The table's primary index included the date and time towards the end of the index definition. The initial implementation did selects and joins against this large table - and it ran like a one-legged dog (over 6 hours per customer - and there were over 200 to process).

      Making a private copy of the main table that only had the desired date range (with the same indexes as the main table) and using that instead reduced the run time to under 5 minutes per customer.

      My rules of optimization

      1) Is the system fast enough as it is - if so do not optimize.

      2) Would an affordable hardware improvement make it fast enough - if so then upgrade the hardware and leave the working software alone.

      3) If you decide that optimization is necessary - start by instrumenting the system to find out where the bottlenecks are - there is a good chance that they are not where you thought (a minimal profiling sketch follows after this list).

      4) If there are multiple bottlenecks - do not start optimizing - you probably need a system redesign first.

      5) Give the optimization job to the best programmer that you have available - and make sure that the sources have all the optimizations explained well enough that the system can still be supported if the support is outsourced to a third world country.
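
      For what it's worth, a minimal sketch of the instrumentation in rule 3, here for a Python batch job (run_batch is just a stand-in for whatever the job's real entry point is):

      ```python
      # Profile the whole run, then list the ten most expensive call sites.
      import cProfile
      import pstats

      def run_batch():
          total = 0
          for i in range(1, 200_000):
              total += i * i
          return total

      cProfile.run("run_batch()", "batch.prof")
      pstats.Stats("batch.prof").sort_stats("cumulative").print_stats(10)
      ```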

      1. JLV

        Re: True - But

        3B. write up a decent regression test before 4 or 5.

        Batches are surprisingly easy to test. With identical inputs the original program and optimized programs should match 100% in outputs. Given a sufficiently big and representative sample of production data, you're pretty much assured it will work live, but kick it off to QA anyway.

        Diff-type programs are your friend. If your app is spewing out files, you're mostly there already. If it's populating a database, all you need to do is dump those tables into files: multiple lines per row, one line per field, sorted in a stable manner and properly formatted, and the result can be fed into any diff-type program.
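
        A rough sketch of the dump-for-diff step (table and column names are placeholders, and sqlite3 stands in for whatever the real database is):

        ```python
        # One line per field, rows stably sorted by key, so old and new runs
        # can be compared with any diff tool.
        import sqlite3

        def dump_table(conn, table, key, outfile):
            cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
            with open(outfile, "w") as f:
                for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
                    for col, val in zip(cols, row):
                        f.write(f"{table}.{col}: {val!r}\n")
                    f.write("\n")

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, amount REAL);
            INSERT INTO invoices VALUES (2, 10.0), (1, 99.5);
        """)
        dump_table(conn, "invoices", "invoice_id", "invoices_new.txt")
        # then e.g.:  diff invoices_old.txt invoices_new.txt
        ```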

        Takes a lot of the scariness of rewriting stuff away.

        1. Nick Ryan Silver badge

          Re: True - But

          One of the best/worst cases I've come across was where a standard company report took 2-3 days to process, and it was lauded as being very important and correct because it took so long to generate; company processes were aligned around this time period. A friend rewrote the process so it took 15 minutes, but they refused to believe him that such a thing was possible with such a big, complicated and thoroughly important report. So on however many occasions they ran both and painstakingly compared them, before grudgingly conceding that the 15 minute version was the right way to go.

      2. Jonathan Knight

        Re: True - But

        I've just optimised a vendor-supplied ETL process from 3.5 hours to 26 seconds by moving away from a whole world of pain involving SQL Server and a .NET application to Perl with a large in-memory hash.

        I couldn't have done it without the vendor-supplied solution, as they knew the internals of their product and all I had to do was write a process that generated identical output. The vendor solution had lots of test points and data visualisation tools to help develop the process, which were invaluable in building the solution but are just performance problems now that we've moved into production.

        Jon

    2. Rob D.

      If making something fast is a requirement up front (or more precisely if there is a performance target provided up front) then the sequence is a set of inclusive targets - make it work, make it correct AND make it fast [enough].

      The 'make it work, then make it correct, then make it fast' sequence does occur but it generates the kind of solution I've seen produced in development teams with inadequate requirements and/or no vested interest in what happens after it is released. I.e. they get through inadequate acceptance tests and generate years of follow-on work to try and make it do what it should have done out of the gate.

      FWIW being able to develop and deliver fast enough is also, in my book, a requirement that is just as important.

  4. Anonymous Coward
    Anonymous Coward

    The solution is to ... move to a more powerful and expensive virtual machine to take the extra load.....

    ... AKA disable Hyperthreading.

    1. Destroy All Monsters Silver badge

      Where does this BS come from now?

      1. Anonymous Coward
        Anonymous Coward

        "Where does this BS come from now?"

        About the year 2002 I believe, when running on NT4 / 2000

      2. Anonymous Coward
        Anonymous Coward

        What "BS" ?

        EC2 vCPUs are Hyperthreaded Intel cores (i.e. half a shared physical core).

  5. EveryTime

    I'm very much biased to writing most types of code in C, and writing it efficiently.

    But it's likely that loosely-written code, with lots-o-overhead, has the same number of syscalls while executing more instructions, and thus will be proportionally less impacted by an increased syscall overhead.

  6. Anonymous Coward
    Anonymous Coward

    Love the picture at the top of the article !

  7. Haku
    Coat

    BBC news headline "Meltdown and Spectre: All Mac devices affected says Apple"

    Does that mean Apples are rotten to the core?

    .

    .

    .

    Ow, stop hitting me!

    1. FuzzyWuzzys
      Happy

      Re: BBC news headline "Meltdown and Spectre: All Mac devices affected says Apple"

      It means Apple is maybe having a tinge of regret at moving to Intel 15 years ago?

      1. Korev Silver badge
        Coat

        Re: BBC news headline "Meltdown and Spectre: All Mac devices affected says Apple"

        If only they had the Power to move to another chip

    2. Anonymous Coward
      Anonymous Coward

      Re: BBC news headline "Meltdown and Spectre: All Mac devices affected says Apple"

      Also iPhone and iPad are affected by Meltdown, which is REALLY interesting.

      They were obviously going to be affected by Spectre like most others.

      Also noteworthy is how Android has escaped. High precision timers are needed to exploit Spectre and Android apparently doesn't expose these to userland.

  8. AndrueC Silver badge
    Boffin

    So it only impacts processes that do a lot of disk and network I/O?

    Bugger server problems. That's gonna take a bite out of my productivity - Visual Studio isn't exactly greased lightning at the best of times :-/

    I wonder if it can be mitigated by marking folders as compressed(*)? NTFS compression doesn't require much CPU so perhaps the time lost there can be offset by less time spent context switching in and out of the Kernel. That used to be the case with slower hard disk drives in computers with good CPUs.

    And I think that if you copy a compressed file between Windows computers, where both copies are going to end up compressed, it is transferred in its compressed state, thereby saving time spent on the NIC.

    Maybe. Might make for an interesting study.

    (*)Though I think that SQL Server at least won't let you compress a database file.

    1. Korev Silver badge
      Boffin

      This is one of the reasons that HPC types like HDF5; it's quicker to read data from disc and decompress it than read the uncompressed data from disc.
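
      A small h5py illustration of the trade (array contents and chunk settings picked arbitrarily): the dataset is written gzip-compressed and decompressed transparently on read, swapping a little CPU for less data pulled off the disc.

      ```python
      import numpy as np
      import h5py

      data = np.tile(np.arange(1000.0), 1000)          # compressible sample data

      with h5py.File("demo.h5", "w") as f:
          f.create_dataset("samples", data=data, compression="gzip",
                           compression_opts=4, chunks=True)

      with h5py.File("demo.h5", "r") as f:
          samples = f["samples"][:]                    # decompressed on the fly

      print(samples.shape, bool(np.all(samples == data)))
      ```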

  9. Stuart 22

    My brain is hurting ...

    Despite over 50 years in the trade, starting with DEUCE machine code, getting my head round and combating the issues is kinda hard. To be honest - I'm lost.

    Perhaps those that are not could comment on whether the fixes applied so far are characterised as 'plugging the hole' at all costs, as quickly as possible, and perhaps being ultra careful.

    In other words, down the road might there be room to optimise the fixes to restore some of the lost performance?

    And any speculation on the time it may take for the chippies to come up with the necessary redesign? And who might be lightest on their feet?

    The irony is the winners in this catastrophe will be the worst offending chippies as a consequence of the increased demand for more computer power to fill the performance gap!

    1. Warm Braw

      Re: My brain is hurting ...

      I'm not an expert, so please correct me as required, but I think this one is going to be with us for some time because:

      1/ There is unlikely to be a better solution to the Meltdown problem until the silicon is redesigned.

      2/ The mitigations for Spectre are still not all in and are unlikely to be comprehensive until the silicon is redesigned.

      3/ Redesigning the silicon may in itself result in a loss of performance.

      In the longer term, things like memory encryption and secure zones for storing critical secrets may offer a way out, but you can't retrofit them to hardware that's already out there.

    2. hmv

      Re: My brain is hurting ...

      Yes. My internal communications regarding these vulnerabilities have included the disclaimer that there's lots of incorrect information out there, and that _my_ information isn't immune to being wrong.

  10. ee_cc

    Can’t we just move to microkernels?

    From what I understood of Meltdown and its fix, we're just a hair away from putting kernels into their own separate process space — and considering the performance impact, it's about the same as running on one.

    Linux is 20 years old, and based on an old design anyway... ;)

  11. Anonymous Coward
    Anonymous Coward

    Non-functional requirements

    This may get interesting legally for the largest cloud 'code as a service' providers, for any existing production services that are impacted by the fixes and that already had tight agreed service levels, with non-functional requirements specifying tight sub-second transactional performance timings.

    Some of our clients have internal operational performance targets for their on-premise (not cloud) back-end middleware web services, saying that their application call timings must round up to the nearest ten milliseconds or so; but these are completely variable depending on shared resource utilisation, so they are measured as averages over longer time periods. The same principle should apply to cloud-provided services as well as on-premise stuff.

    Although it may be difficult (read: nearly impossible) to model individual service loads across dozens of shared services hosted on the same clusters, if, taken as a whole across the virtual server farm, hourly or monthly average timings and CPU utilisation across the cloud estate are slowed down (or CPU ramped up) by the Spectre fixes by, say, more than a few percent, this might justify some major contract renegotiations? This could be especially true where per-core software licence models are involved, if a few more cores are required to scale back to previous capacity utilisation levels.

    Unless there is a generic get-out-of-jail clause in the small print that allows for performance-impacting changes due to parent architecture change dependencies or something similar, I expect only the most aggressive (near capacity, hard to scale) services may want to try and renegotiate deals this year. Any HPC cloud users out there care to weigh in on this?
