Zombie Moore's Law shows hardware is eating software

After being pronounced dead this past February - in Nature, no less - Moore’s Law seems to be having a very weird afterlife. Within the space of the last thirty days we've seen: Intel announce some next-generation CPUs that aren’t very much faster than the last generation of CPUs; Intel delay, again, the release of some of …

  1. Anonymous Coward
    Anonymous Coward

    You can already write code to design a chip

    VHDL has been used to do this for some time, and it can be fed into simulators (i.e. interpreted) or compilers. It can't do everything, and doesn't produce the most efficient design, but for doing stuff that's more run of the mill than designing an A10 or a 24-core HoloLens chip, it works well at a much more reasonable budget.

    The expensive part of a chip, especially in a leading edge project, might not even be the design team but rather the mask set, the cost of which can now exceed $10 million.

    1. Paul Crawford Silver badge

      Re: You can already write code to design a chip

      True, but then VHDL sucks donkey balls when it comes to ease of use, cost and helpfulness of tool chains, and generally getting stuff working quickly. It's a dense and very pedantic language, originally built by a US DoD committee to standardise the building of ASICs.

      It might be great for those who spend a lot of time using it, and obviously it (along with others like Verilog, and simpler ones like ABEL) is built around parallelism, which is natural to hardware but not to procedural languages, but it is a long way from something you could easily get casual-interest students using.
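      To make the parallelism point concrete, here is a minimal, purely illustrative VHDL fragment (the entity name and ports are invented for the example): the two assignments describe gates that exist and switch at the same time, not statements executed one after the other.

        -- illustrative only: two concurrent signal assignments
        library ieee;
        use ieee.std_logic_1164.all;

        entity parallel_demo is
          port (
            a, b, c : in  std_logic;
            x, y    : out std_logic
          );
        end entity;

        architecture rtl of parallel_demo is
        begin
          x <= a and b;   -- this gate...
          y <= b xor c;   -- ...and this one operate simultaneously, whatever the textual order
        end architecture;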

    2. energystar
      Alien

      Re: You can already write code to design a chip

      Unified determinism carries its own limits. Von Neumann topology is just one among many.

    3. Anonymous Coward
      Anonymous Coward

      Re: You can already write code to design a chip

      The vast majority of ASICs, including processors based on the ARM architecture, will be designed using VHDL or Verilog.

      There are also tools available on the (EDA) market that will create HDL for a processor based on a specified instruction set requirement.

  2. Sparks_

    Moore's Law is misquoted too much. The original observation is about price per computation power over time, which still seems to be holding well. The misquotation is usually price per gate, based on silicon area, or some other manufacturing derived metric. But the original macroscopic "whole compute" level figure is still tracking well. Note that "compute" here covers hardware, software, or next-gen-whateverware, whatever that may be.

    1. Tony Haines

      //The original observation is about price per computation power over time, which still seems to be holding well. The misquotation is usually price per gate, based on silicon area, or some other manufacturing derived metric.//

      Actually I think you have that wrong. The original observation was about the rate of increase of components per integrated circuit; this was developed and modified over time to become the observation about computation cost.

      At least, if you trust the Wikipedia article, and my memory of other sources.

      https://en.wikipedia.org/wiki/Moore%27s_law

      1. Anonymous Coward
        Anonymous Coward

        https://drive.google.com/file/d/0By83v5TWkGjvQkpBcXJKT1I1TTA/view

        You're both wrong ;)

        1. Tony Haines

          "You're both wrong"

          Actually I'd say it looks like we're both right. The original paper you link to mentions falling cost and rising component count - right in the subheading. But the paper overall I'd say emphasizes miniaturisation and increasing component density as the primary factor.

  3. Paul Hargreaves
    WTF?

    I guess the whole 'Software Defined' (storage, networking) isn't happening then?

    Why bother with custom ASICs when you can just use off-the-shelf hardware that is plenty fast enough for the job?

    HoloLens as an example vs. Atom? Really? A Casio watch is more powerful than an Atom. But, also, for single-task workloads a custom processor can be useful, since it can reduce power and cost by only having the components needed.

    > Apple’s new A10 chip, powering iPhone 7, is as one of the fastest CPUs ever.

    'is as'?

    Also, how inaccurate. Let's compare an A10 vs. a modern Intel CPU.

    http://browser.primatelabs.com/processor-benchmarks#2 vs. http://browser.primatelabs.com/ios-benchmarks

    The A10 is slower on single core workloads by a large margin, and the multicore result from the A10 is around the same result as the single core result on the Intel. Now turn on multicore on the Intel...

    To get the (around) 6x performance increase on the A10 you'd need to bolt another 6 of them together, consuming considerably more space.

    Don't mistake me. The A10 is good for a mobile CPU, but nowhere near the 'fastest CPU'.

    1. Anonymous Coward
      Anonymous Coward

      "I guess the whole 'Software Defined' (storage, networking) isn't happening then?

      Why bother with custom ASICs when you can just use off-the-shelf hardware that is plenty fast enough for the job?"

      That's not really the point of software-defined networking (not sure about storage, not my area). The commodity hardware you use for a software-defined network still has the custom ASICs needed for fast packet forwarding; you've just taken the high-level functionality (such as building and maintaining the routing tables that govern this forwarding) and abstracted it away to a separate network controller box which does its work in software. So you end up paying your "commodity" network equipment manufacturer $$$ for the fast switching hardware, but only pay $$ for the control hardware/software elsewhere - as opposed to $$$$$ for hardware and $$$$$ for software if you buy it as an integrated unit with full proprietary lock-down from the likes of Cisco. That's the theory, at least. You're certainly not going to be replacing your network switch with an off-the-shelf x86 box with a bunch of interface cards, if that's what you were thinking (well, it might work for a lab setup, but for anything serious? No... just... no).

      The rest of your post is fair enough though, so have an upvote on me.

  4. mythicalduck

    Also, there have been FPGA boards for the Pi and the BeagleBoard before. ModMyPi used to sell the BeagleBoard one, but it no longer seems to be listed and I can't remember what its name was now.

    I nearly bought a BB and the FPGA board for a project, but it seemed that you could only bash a byte at a time between the two processors. If I ever find an FPGA dev board that shares RAM with a "regular" CPU (ARM/Intel), I'd love it.

    1. short

      Sounds like you want a Zynq-based board, like a ZedBoard?

      Dual ARM Cortex-A9 and a bucketload of FPGA fabric, all nicely coupled and with a tolerable toolchain for both sides.

      Or, if you don't need that much compute, just compile up a MicroBlaze CPU or two in a standard FPGA. The gates that you need for a 32-bitter aren't expensive any more. You'll not get the brute speed of a pair of GHz Cortex-A9s, but it's 400 MIPS of a tolerable RISC machine.

      You can then bolt on as much of your own logic as you like, very, very close to the CPU(s).

      (Other FPGA architectures exist, I'm just in a Xilinx mode at the moment)

    2. Dwarf

      Papilio.cc

      Are you thinking of the Gadget Factory's Papilio FPGA boards?

    3. 8Ace

      Build what you need

      "but it seemed that you could only bash a byte at a time between the two processors. If I ever find an FPGA dev board that shares RAM with a "regular" CPU (ARM/Intel), I'd love it."

      FPGAs don't have "interfaces" per se. The point of an FPGA is that you have hardware inside, and you create your own specific hardware from that. So if you want to use an FPGA with an ARM CPU, decide what interface you want to use and then implement that in the FPGA. If you want to use shared RAM then fine: most FPGAs include memory, so you then need to implement your shared access in the FPGA hardware and handle address collisions etc. For those coming from a software environment, VHDL and Verilog can be a strange concept. It's not a program: unless specified there is no flow, and everything has the potential to happen simultaneously.
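      As a rough sketch of the "implement your shared access yourself" point (generic and not tied to any particular board; names and widths are made up), a simple dual-port RAM description like the one below will usually be mapped onto on-chip block RAM by the tools, with one port facing the CPU bus and the other facing your own logic. Collision behaviour is device- and tool-specific, so that still has to be handled deliberately.

        -- generic sketch: dual-port RAM, port A read/write (e.g. CPU side), port B read-only (e.g. your logic)
        library ieee;
        use ieee.std_logic_1164.all;
        use ieee.numeric_std.all;

        entity shared_ram is
          generic (
            ADDR_W : natural := 10;
            DATA_W : natural := 8
          );
          port (
            clk    : in  std_logic;
            a_we   : in  std_logic;
            a_addr : in  unsigned(ADDR_W-1 downto 0);
            a_din  : in  std_logic_vector(DATA_W-1 downto 0);
            a_dout : out std_logic_vector(DATA_W-1 downto 0);
            b_addr : in  unsigned(ADDR_W-1 downto 0);
            b_dout : out std_logic_vector(DATA_W-1 downto 0)
          );
        end entity;

        architecture rtl of shared_ram is
          type ram_t is array (0 to 2**ADDR_W - 1) of std_logic_vector(DATA_W-1 downto 0);
          signal ram : ram_t;
        begin
          process (clk)
          begin
            if rising_edge(clk) then
              if a_we = '1' then
                ram(to_integer(a_addr)) <= a_din;
              end if;
              a_dout <= ram(to_integer(a_addr));  -- registered reads on both ports
              b_dout <= ram(to_integer(b_addr));
            end if;
          end process;
        end architecture;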

    4. Steve Todd

      There's also Altera's version in the form of the Cyclone V SE. You can share memory or talk to the CPU via the AMBA interface.

      There's even a cheap(ish) Dev board in the form of the DE0 Nano SOC.

      http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&No=941

      1. mythicalduck

        There's also Altera's version in the form of the Cyclone V SE. You can share memory or talk to the CPU via the AMBA interface.

        There's even a cheap(ish) Dev board in the form of the DE0 Nano SOC

        Hmmm, I spent ages looking at this board, but the block diagram (link below) shows that the RAM is only available on the HPS side, and I couldn't find any information regarding accessing the RAM from the FPGA side. It was a while ago though.

        http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=163&No=941&PartNo=2

        I've just redownloaded the user manual, and it lists the DDR3 under "Peripherals connected to Hard Processor System", but what I clearly didn't notice last time is that the table also lists a bunch of FPGA pins, so that might actually be the board I'm looking for. Thanks :)

    5. Anonymous Coward
      Anonymous Coward

      The TE0722 might be what you're looking for - a low-end Zynq, still with a dual Cortex-A9.

      http://www.trenz-electronic.de/products/fpga-boards/trenz-electronic/te0722-zynq.html

      This doesn't have DDR, so there's not enough RAM to run Linux, but FreeRTOS should fit in the 256KB of on-chip RAM (which can be stretched by a couple of tricks, like putting FPGA block RAMs on the processor memory bus, and accessing code from flash via the 512KB cache).

      To run Linux, a Parallella board is still a nice low-cost option.

  5. John Smith 19 Gold badge
    Go

    What's really changed is the development tools

    From a time when most custom chips were laid out with a set of coloured pencils and graph paper.

    I'd suggest access to good tools was what made ARM doable by a very small team.

    What a same-sized team could do today would be much larger.

    But if this new hardware has its own instruction set you'll have to generate a code generator for your favorite tool chain (and languages) to support it.

    The fact you can do this (beyond knocking up some in house assembler) may be one of Unix's lasting contributions.

    1. A Non e-mouse Silver badge

      Re: What's really changed is the development tools

      Er, you have read the history of the ARM processor? From the Wikipedia page:

      "A visit to the Western Design Center in Phoenix, where the 6502 was being updated by what was effectively a single-person company, showed Acorn engineers Steve Furber and Sophie Wilson they did not need massive resources and state-of-the-art research and development facilities."

      1. Dazed and Confused

        Re: What's really changed is the development tools

        showed Acorn engineers Steve Furber and Sophie Wilson they did not need massive resources and state-of-the-art research and development facilities."

        But they certainly set about finding the best designers; they head-hunted much the best chip designer from where I worked at the time.

  6. Destroy All Monsters Silver badge
    Windows

    Lots of handwaving and metaphoring in this article

    Implementing functions using dedicated circuits means the functions can be computed polynomially faster than if they were done by a state machine: TRUE.

    Implementing dedicated state machines in hardware means the state machine can run polynomially faster than if a software-defined state machine were implemented by a generic hardware-defined state machine: TRUE

    These optimizations can now be done as tools and component libraries reach maturity and small fabrication runs become economically viable for a bespoke solution: TRUE.

    Go for it!

    (I remember having lots of fun doing a multiplier and various other circuits on an FPGA using a user-friendly graphical editor back in 1992 on a small host system for exercises. That was also when the first series of articles about "software-defined hardware" appeared. A prime number generator on a chip was being talked about in BYTE as I remember...)

    1. Anonymous Coward
      Anonymous Coward

      Re: Lots of handwaving and metaphoring in this article

      It isn't a new trend. Back in 1987 I was working on the functional design of an ASIC to be closely coupled to a SPARC processor, to offload arithmetic functions that were too time-consuming to do in software. Going back further, in 1982 I designed hardware that used a maths co-processor coupled to a Z80 processor which needed more computation horsepower to do curve fitting.

      Also I "think" the ARM core was available in ASIC libraries from at least the early nineties. Since then there have been numerous implementations of hardware assisted ARM based systems, so this really isn't anything new.

  7. short

    So, about Intel having bought Altera...

    One might have thought that Intel buying a massive FPGA company, giving them access to programmable hardware and a toolchain to drive it, might have been worth a mention in this article?

    Some of Intel's CPUs and some FPGA fabric on the same die (or maybe in the same package, along with some stacked RAM), all on Intel's spiffy process, should be interesting. Expensive, no doubt, but interesting.

  8. IJD

    ASIC vs FPGA vs CPU

    There's a continuous space of power vs. flexibility -- to do the same amount of processing and in the same process node, at one end a fixed-function ASIC is by far the lowest power and die size but inflexible, an FPGA is more flexible but higher power and die size (typically ~10x), a CPU is completely flexible but much higher power and die size again (typically ~100x).

    Cost would follow the same trend if volumes were similar, but they're not, given that the very high NRE cost of the latest process nodes tilts the costs towards FPGAs and CPUs unless your volumes are very high. Saying that doing the same job would cost $100 in a CPU or $10 in an FPGA or $1 in an ASIC is true if you want tens of millions of them -- bear in mind that the total NRE (design and mask) for even a small ASIC in the latest process nodes is at least tens of millions of dollars, or more than a hundred million for a more complex one, so you need to sell a lot of chips to get this back.

    So for most cases CPUs or FPGAs make more sense; advanced process node ASICs (or custom CPUs with custom hardware accelerators) only really make sense where the need to get lower power is absolutely imperative and the TAM justifies the cost. One interesting trend driven by this is that such designs move back into the companies who make the end product, like Apple (vertical integration), because getting lower power (or higher speed) by doing your own 10nm chip makes sense if you can clean up the market selling a $600 product with $300 gross margin, but not if you're selling a $60 chip with $30 gross margin at the same volume.

    1. Mage Silver badge

      Re: ASIC vs FPGA vs CPU

      FPGAs are used for low volume, or where per-unit cost or power consumption is irrelevant. ASICs are almost always modelled as an FPGA first.

      Your Verilog / VHDL can be run as a simulation with monitoring on a PC/workstation before downloading to the actual FPGA. With an FPGA you are describing hardware: not software-defined hardware, but re-configurable hardware modules and look-up tables that implement a hardware design. So of course it's massively parallel - it's not a CPU unless that's in your HW description/design.

      I've only used the Xilinx FPGAs and tools though, not Altera.

      CPUs and FPGAs/ASICs are complementary. Some things only need hardware, no CPU. Some things are more easily realised as programs on a CPU, hence CPU plus FPGA. A port or shared RAM can be used, or a hard CPU core on the FPGA, or the FPGA can be defined to create a simple CPU (6502, Z80, PIC), or an SoC with CPU cores, an FPGA area and conventional SoC I/O and GPU. This allows field upgrades, whereas the ASIC or conventional SoC can only have the CPU / GPU firmware/programs changed.
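      To give a flavour of the simulate-before-download workflow mentioned above, here is a purely illustrative testbench (it assumes the little parallel_demo entity sketched earlier in the thread has been compiled into the same work library). A testbench is just more VHDL that is never synthesised: it drives the design and checks results on the PC, long before anything touches the FPGA.

        -- illustrative testbench: runs in a simulator, never goes near the FPGA
        library ieee;
        use ieee.std_logic_1164.all;

        entity tb_parallel_demo is
        end entity;

        architecture sim of tb_parallel_demo is
          signal a, b, c : std_logic := '0';
          signal x, y    : std_logic;
        begin
          dut : entity work.parallel_demo
            port map (a => a, b => b, c => c, x => x, y => y);

          stimulus : process
          begin
            a <= '1'; b <= '1'; c <= '0';
            wait for 10 ns;
            assert x = '1' report "AND output wrong" severity error;
            assert y = '1' report "XOR output wrong" severity error;
            wait;  -- done; the simulator's waveform viewer provides the "monitoring"
          end process;
        end architecture;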

      1. Mage Silver badge

        Re: ASIC vs FPGA vs CPU Development

        Continued ...

        Verilog and VHDL are NOT programming languages; they are hardware description languages that are then translated into an FPGA configuration file or an ASIC specification. The same source can produce either, and can optionally include a CPU design (even if the FPGA has no CPU core), which is then separately provided with microcode, firmware, programs etc. The ASIC version would have a real CPU core, if one was defined.

    2. JeffyPoooh
      Pint

      Re: ASIC vs FPGA vs CPU

      Development Environment.

      Menu.

      Save As...

      ...Software

      ...Hardware

      We ran into this years ago. Different rules for hardware projects and software projects. I told them that it's a spectrum, and they blinked in mindless stupor.

  9. Bronek Kozicki

    The whole problem with the von Neumann machine

    ... is the power required for accessing the memory where program and data are stored. Compared to the power budget of the actual computations, it used to be small in the previous century. No more - currently it is orders of magnitude higher than the power used for actual computation. Additionally, the latency of getting data out of memory has not improved much in the past decades, compared to increasing CPU computing power. Even worse, since increasing parallelism has become the only viable route to increased software speed, the synchronization of data in memory (that is, completed memory writes and cache synchronization between cores) has become critical to computing performance. There is little that can be done while we are still saddled with inefficient DRAM. However, FPGAs or ASICs also need to read and store data somewhere - even if the program is hard-wired. Of course for small programs there is nothing wrong with small amounts of SRAM, but things are different if you look to deploy these devices into a wider environment, with large amounts of data flowing around. Which means they will hit the memory limit too (actually I am pretty certain they are hitting it already). When much faster and cheaper (both in terms of money and power budget) alternatives to DRAM become commercially available, the tables might turn again.

    Still, it pays (and will continue to pay) to know both the hardware and the software side of programming, so kudos for the article.

    1. Anonymous Coward
      Anonymous Coward

      Re: The whole problem with the von Neumann machine

      Part of the problem is simply the speed of electricity. Computing has gotten so fast that in a single CPU cycle a signal can only travel, say, a few inches. Not to mention those CPUs get pretty hot when they're at full throttle (again, a sheer physical thing that's architecture-agnostic at this point). Which means you have conflicting issues. You need to get the memory close to the CPU to reduce the travel time, but the heat means they can't be too close, either.
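      (Back-of-the-envelope, purely illustrative numbers: at 3 GHz a clock cycle is about 0.33 ns; light covers roughly 10 cm in that time, and signals in copper or silicon propagate at perhaps half to two-thirds of that, so a few inches per cycle is about right.)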

  10. StaudN
    WTF?

    I Call BS

    So future "Patch Tuesdays" will involve a global shipment of chips? ... Surely the whole point of general-purpose CPUs is that the minutiae of the actual behaviour of the systems they run are abstracted away from the hardware and can be updated/modified as required.

    I call BS on this whole concept: it's fine for the specialised stuff it's currently being used for, but will we ever see an "MS Word" chip? Hell no: that line of thought misses the entire point of having software.

    1. Steve Todd

      Re: I Call BS

      You're not getting it. Software will not be going away. What will be happening is that progressively more work will be offloaded to hardware, at least some of which can be soft-configured (which is the whole point of FPGAs). "Patch Tuesday" will contain updated soft configurations as well as traditional code. There's also the matter of the driver stack that connects the software to the hardware.

      1. Bronek Kozicki

        Re: I Call BS

        "Patch Tuesday" will contain updated soft configurations

        It already does. Also available for Linux.

      2. StaudN

        Re: I Call BS

        Well that's not "hardware eating software" then... That's just more software which happens to require a special chip to run. Meh.

      3. LionelB Silver badge

        Re: I Call BS

        You're not getting it. ...

        Absolutely. We've already got FPUs and GPUs. My work involves massive number crunching, generally via low-level machine-optimised libraries like the BLAS, FFTW, LAPACK, etc. While there are ongoing projects to port these to GPUs, I would love to see some of this implemented in hardware on dedicated co-processors.

      4. Anonymous Coward
        Anonymous Coward

        Re: I Call BS

        Well, new patches for AMD/Radeon and Nvidia these days are exactly this.

        Lots of work now gets offloaded to the massively parallel hardware in GPUs (which are several orders of magnitude larger than any current CPU by any metric - sadly power consumption and heat output included), with the express purpose of freeing the CPU to run other code, or of allowing processing at a level no modern CPU could handle in the same time-frame.

        And, for instance, video rendering and encoding were once entirely coded for CPUs, but now some graphics cards can handle the encoding several times faster on the way to storage, freeing the CPU.

  11. jason 7

    Nothing wrong with the chips.

    It's shitty lazy code that's the problem.

    Instead of throwing more hardware at the problem, maybe clean up and optimise the code a little?

    1. jzl

      Re: Nothing wrong with the chips.

      It's shitty lazy code that's the problem.

      No, it's not that simple. Code is a product. It is paid for with money.

      Modern code is produced - feature for feature - for a fraction of the price of code 30 years ago. The reason for this is that development tools have become unbelievably productive. There's a trade-off in terms of performance on the underlying hardware, sure, but the way to improve raw metal performance of the code would be to forgo some of the tools that make developers so productive.

      Besides, although it's widely said it's not completely true. Modern high FPS animated UIs are intrinsically compute intensive, as are many cloud based data workloads. Web browsers, too, are surprisingly compute heavy - layout and render of modern HTML is non-trivial, and that's even without taking Javascript into consideration.

      Not to mention that there's a continual drive to improve tooling, particularly at the language level. Look at Javascript: modern browsers execute it orders of magnitude more efficiently than the very first Javascript enabled browsers.

      1. Anonymous Coward
        Anonymous Coward

        Re: Nothing wrong with the chips.

        "to forgo some of the tools that make developers so productive."

        You mean make shit developers just about employable. All a proper dev needs to do his job is

        - Compiler/interpreter

        - Editor

        - Debugger/tracer

        - Profiler

        - Disassembler (optional)

        All of the above have been around since at least the 70s, most a lot earlier so they don't all need to be mashed together in an IDE with lots of cutesy graphics that requires 100 meg of memory just to boot either.

        1. Destroy All Monsters Silver badge
          Windows

          Re: Nothing wrong with the chips.

          The great thing is you actually forget about the most important thing:

          A language adapted to the problem space.

          > since at least the 70s

          Yeah no oldsy. Try debugging today's applications in 64 KB mainframe RAM.

          1. Anonymous Coward
            Anonymous Coward

            Re: Nothing wrong with the chips.

            "A language adapted to the problem sapce."

            Yes, if only people actually took that approach. 30 years ago C was the answer to everything, 20 years ago it was C++, 10 years ago it was Java. Now every spotty dev just out of college thinks all you need to learn is Javascript or Python.

            "Yeah no oldsy. Try debugging today's applications in 64 KB mainframe RAM."

            Whooooosh......

            Why do you think I mentioned the memory footprint of modern IDEs? *sigh*

          2. Bakana

            Re: Nothing wrong with the chips.

            Actually, try getting the same amount of Work out of today's "Modern" languages as that 64KB Mainframe used to perform.

            In point of fact, some of the applications developed for those 64KB Mainframes are still running today because A) they still work & B) no one wants to spend the Millions of dollars and years of effort that even Trying to replicate that software in a "Modern" language would cost.

            I've worked on a couple of those "Modernization" projects and the Performance of the resulting software was godawful. Even when it "Worked", it usually took at least twice as long to perform the same amount of work and had "issues" that the developers Promised would be Fixed "any day now".

            Then there were the "Enhancements" and "Extra Features" that the project managers just couldn't resist adding in. Suddenly, something that performed a critical job reasonably well did a whole Bunch of things, all of them Poorly.

            Of course, in one particular instance, the Poor Performance turned out to be Designed In.

            I didn't understand Why some of the design choices were made for the project until I discovered that the Consulting Firm doing the Development also had a contract to Run the finished application as a Service Bureau which would get paid for CPU time and Database Storage on a Cost Plus basis.

            Suddenly, the poor performance and lousy data design made a Lot more sense if you knew it would be driving the company's Profits.

        2. Steve Todd

          Re: Nothing wrong with the chips.

          @boltar - so you're coding exclusively in assembler and hitting the hardware directly are you? It's the compiler, hardware abstraction layers and library code that slow a modern program down compared to the days of yore. All of those are good things in terms of productivity.

          Even ripping those out you still have the basic problem that a CPU is designed to execute a stream of instructions, one at a time. There are assorted techniques used to make this as fast as possible, but it's still effectively a sequential process. Hardware is good at tasks that can be either pipelined or run in parallel (or both). If the workload is suitable then hardware can implement it thousands of times faster than the best written code.

          1. Anonymous Coward
            Anonymous Coward

            Re: Nothing wrong with the chips.

            "@boltar - so you're coding exclusively in assembler and hitting the hardware directly are you? "

            That's right, that's why I said compiler/interpreter on my THIRD line! Learn to read. And if the pavlovian trigger for you was "disassembler" and you don't know why you might need one for other languages, then I suggest you go and educate yourself as to why.

          2. yoganmahew

            Re: Nothing wrong with the chips.

            @Steve Todd

            "so you're coding exclusively in assembler and hitting the hardware directly are you? "

            Yes, yes I am. Even if boltar isn't...

            Even there, though, new instructions get added that perform at hardware/firmware level what used to be a routine, e.g. the checksum instruction.

            "the basic problem that a CPU is designed to execute a stream of instructions, one at a time"

            No, that's not how they're designed anymore.

        3. jzl

          Re: Nothing wrong with the chips.

          Tools like Node.js? Tools like Unity? Tools like NHibernate? Tools like ActiveX? Tools like jQuery? Tools like Entity Framework?

          And they may not need an IDE with cutesy graphics, but software development isn't a contest in theoretical purity, it's a race for productivity.

          A modern "cutesy" IDE contains many features which make development very much faster and more productive.

          I speak from direct, long standing and - if I may say so - very successful professional experience.

        4. Brewster's Angle Grinder Silver badge

          Re: Nothing wrong with the chips.

          All a "proper dev" needs is a switch. The machine code is inputted, bit by bit, by toggling the switch.

          1. Crisp

            Re: Nothing wrong with the chips.

            All a "proper dev" needs is a magnetized needle and a steady hand.

            (XKCD 378)

        5. JLV

          Re: Nothing wrong with the chips.

          >All a proper dev needs to do his job is

          Think a bit. My domains of competence are Python and SQL, along with some more proprietary stuff.

          Any time I want to write fast Python code, I will often think of using dictionaries, aka hashmaps. Now, they are not applicable 100% of the time, but they are very, very fast. The (slow) Python interpreter mostly gets out of the way and calls highly optimized C that's been tuned for years and years - the underlying implementation is a heavily tuned hash table. A big part of writing fast Python code is knowing which built-in data structures to use to tap into the fast C stuff underneath.

          Now, instead of being a lazy shit dev, I could take out my editor (I dislike IDEs, but disagree with you that that shows skill of some sort) and wrangle some C hashmap code myself. Even if I knew C well enough, would my code be as fast as that evolved over years by folk very much smarter than me? I think not.

          You're gonna say "Lazy turd, using a scripting language". News to you, modern system languages are evolving towards having more or less built-in maps - Rust, Go, Swift have them. I don't mind system languages, I loved a quick dip into C a while back. But there is a lot of value in providing building blocks on top of even system languages. Who wants to implement a linked list? Who wants to use code with gratuitously hand-written linked lists, unless there is a very very good reason?

          That's the power, and the costs, of abstractions. I know how, and when, to use a hashmap, but I have little idea of how it is put together. I do know that Python speeds can go up by orders of magnitude when you know the little tricks.

          Let's not even get into the abstractions involved in working on top of a relational database. A good SQL coder will write stuff that is infinitely faster than a noob's. Both have a limited idea of, or interest in, the rdbms internals, though the experienced coder will know about indices, NOT IN slowness, full table scans, etc...

          Every so often people bemoan that software engineering (hah!) is nothing like say mechanical engineering. Now, by your metrics, does that mean the automotive engineer needs to design his own subcomponents (bolts, drive belts, brakes), each and every time? That would do wonders for both car costs and car quality, would it not?

          No, we expect re-use there as well and don't call car engineers out for "just" assembling car components together.

      2. Doctor Syntax Silver badge

        Re: Nothing wrong with the chips.

        "The reason for this is that development tools have become unbelievably productive."

        Which enables features to be added easily. Making decisions as to which features should be included is extra work. If it's easier to just put them in anyway you get bloat and its associated performance costs.

        "Besides, although it's widely said it's not completely true. Modern high FPS animated UIs are intrinsically compute intensive, as are many cloud based data workloads. Web browsers, too, are surprisingly compute heavy - layout and render of modern HTML is non-trivial, and that's even without taking Javascript into consideration."

        In other words it's Shiny that's the problem.

        1. jzl

          Re: Nothing wrong with the chips.

          In other words it's Shiny that's the problem.

          I'm involved in a large scale financial enterprise system (in-house for a large investment bank). It consists of a user-configurable highly responsive UI that allows rapid drilldown of massive datasets, configurable side-by-side charting and customisable dashboards.

          It's fast, but it needs modern hardware.

          None of it is there for "shiny". I'm not paid for shiny. It's there to provide subtle, powerful analysis of complex data. The data visualisation available through modern UI capabilities is not something I could code by hand from scratch, and it's not something I could shove through a 486-DX.

          And it's certainly not something a team of our size (four developers) could write without access to some powerful but high level libraries.

      3. jason 7

        Re: Nothing wrong with the chips.

        "No, it's not that simple. Code is a product. It is paid for with money."

        So after all that...it's still the code that's the problem!

    2. ITnoob

      Re: Nothing wrong with the chips.

      Is that you Linus?

  12. Anonymous Coward
    Anonymous Coward

    The software is still there

    It's just called "firmware" or "microcode". Good luck implementing complex graphics algorithms using hard-wired TTL logic. You'd need a chip the size of a bus.

    1. 8Ace

      Re: The software is still there

      That's why the largest current FPGAs and VLSI chips have billions of transistors: you "connect them up". You are still using logic primitives in a lot of cases, but they are soft-configured within the device.

      1. Anonymous Coward
        Anonymous Coward

        Re: The software is still there

        "That's why the largest current FPGA's and VLSI chips have billions of transistors, you "connect them up"."

        And at the end of it they generally solve ONE problem. Now imagine a hardwired chip that had EVERY modern graphics algorithm built into it. Seeing my point?

        1. 8Ace

          Re: The software is still there

          "And at the end of it they generally solve ONE problem. Now imagine a hardwired chip that had EVERY modern graphics algorithm built into it. Seeing my point?"

          Jeez calm down, if you are a developer your output isn't going to be replaced by an FPGA anytime soon. The story is saying that hardware is more efficient in a lot of cases, and those cases will increase. Nobody is saying that hardware can replace software, any more than they are saying that software is possible without hardware.

          1. Anonymous Coward
            Anonymous Coward

            Re: The software is still there

            "Jeez calm down, if you are a developer your output isn't going to be replaced by an FPGA anytime soon."

            Calm down? Wtf, I was just making a point. Why are some people so wet they see any disagreement as some kind of confrontation? And my original point was that to reproduce the functions of a modern graphics card using hard wired logic (no microcode) would require a massive die.

    2. Steve Todd

      Re: The software is still there

      Talk to nVidia or AMD. Their GPUs are a mix of dedicated hardware and stream processors. No one said hardware could do everything, but there's a lot of performance to be had by offloading the right bits of a task to it.

      1. Anonymous Coward
        Anonymous Coward

        Re: The software is still there

        "Talk to nVidia or AMD. Their GPUs are a mix of dedicated hardware and stream processors."

        Any complex modern computer processor requires microcode to operate. Microcode is software.

        1. 8Ace

          Re: The software is still there

          "Any complex modern computer processor requires microcode to operate. Microcode is software."

          ARM designs are supplied to licensees with HDL descriptions of instruction decode; these are then implemented directly in hardware.

        2. John Savard

          Re: The software is still there

          The 386 used microcode, like a System/360 Model 65. The 486, though, was hardwired, like a System/360 Model 75.

          It is true, though, that even the latest x86 processors use microcode to handle a few complex instructions, as do the latest System z processors. Most RISC chips, though, eschew instructions so complex as to make microcode a necessity, even though they still do things like floating-point arithmetic that take multiple cycles.

          1. oldcoder

            Re: The software is still there

            The x86 uses a much more complex form of microcode.

            The x86 instructions are first translated into a RISC-like instruction stream and optimized...

            LOTS of microcode there.

        3. Long John Brass
          Alien

          Re: The software is still there

          @Boltar

          Any complex modern computer processor requires microcode to operate. Microcode is software.

          It kind of is and it kind of isn't

          Microcode is a series of bit patterns that enable/disable/connect the various chunks of logic blocks in the CPU "fabric", although fabric is probably not the right word.

          So while it's updateable, microcode is not really what you would consider a program or software.

          Caveat: I'm not a CPU designer but I play one on the internet :)

  13. Wommit

    @Boltar7

    Who pissed into your cornflakes this morning?

  14. kars1997

    This is not going to save Moore's law

    I don't buy the premise of this story.

    Yes, doing stuff in specialized hardware gives you a 200x, or a 1000x-boost over doing it in software on a general-purpose chip.

    But that's a one-time boost. At the end of the day, the performance of that hardware is still going to be limited by its process density.

    So all you're really doing is delaying the point at which you can no longer improve performance, even in hardware, at the cost of adding extra chippery for various functions.

    1. 8Ace

      Re: This is not going to save Moore's law

      No, process density is only one factor; there are others, such as the process itself (lithography etc.) and the device type being implemented. Currently the full density can't be exploited due to these other limitations; however, there are ways round this. For example, the FinFET devices now being used have advantages over previous devices, so the technology continues to move forward. All the way from bipolar to CMOS, SOI etc., the devices have been improved. The same applies to the process and the process density. Engineering is problem solving, after all.

  15. Alan J. Wylie

    The wheel of reincarnation

    http://www.catb.org/~esr/jargon/html/W/wheel-of-reincarnation.html

    [coined in a paper by T.H. Myer and I.E. Sutherland On the Design of Display Processors, Comm. ACM, Vol. 11, no. 6, June 1968)] Term used to refer to a well-known effect whereby function in a computing system family is migrated out to special-purpose peripheral hardware for speed, then the peripheral evolves toward more computing power as it does its job, then somebody notices that it is inefficient to support two asymmetrical processors in the architecture and folds the function back into the main CPU, at which point the cycle begins again.

    Several iterations of this cycle have been observed in graphics-processor design, and at least one or two in communications and floating-point processors. Also known as the Wheel of Life, the Wheel of Samsara, and other variations of the basic Hindu/Buddhist theological idea. See also blitter.

  16. inmypjs Silver badge

    "we’re seeing a migration away from software and into hardware"

    ", wringing every last bit of capacity out of the transistor."

    WTF are you talking about?

    A transistor in a circuit dedicated to video decompression for example sits doing nothing when you are not decompressing video. A transistor sitting idle most of the time is hardly squeezing every last bit of anything out of it.

    Dedicated circuits can be faster and use less energy, but they cost money to manufacture and development is expensive. High-volume applications (to amortise development costs) with low energy requirements, like smart phones, are an obvious candidate - especially high-end smart phones, where production cost is less of an obstacle.

    1. Charles 9

      Re: "we’re seeing a migration away from software and into hardware"

      "A transistor in a circuit dedicated to video decompression for example sits doing nothing when you are not decompressing video."

      But if the times when it's NOT decompressing video (or compositing a UI, or whatever task it is dedicated to perform) are few and far between, then odds are you get a net benefit from it. That's part of what's happening now. They're taking a look at what things CPUs have to do all the time and offloading them, so that the CPU has more time for more generalized workloads, much like having a specialist to handle particular jobs that happen to come up quite frequently.

  17. Anonymous Coward
    Anonymous Coward

    This is not a new trend...

    We didn't always have a math co-processor in the CPU. Intel and AMD both added extensions to their processors, including ones to increase multimedia performance. Outside of computers we have multiple ASICs in everything from DVD players and receivers to cars.

    One could even argue that software is driving a need for more logic in hardware.

    1. Charles 9

      Re: This is not a new trend...

      The argument being that you're starting to see similar kinds of software being used all the time. If you have a particular job being done again and again, it becomes practical to push this function into an ASIC to (a) speed up the turnaround on that process, and (b) offload work so that the CPU can concentrate on more generalized tasks. That's one reason SIMD/vector computing instructions were introduced: to better deal with common math functions that were used in programs of the day. It's also why recent Intel CPUs include AES-NI: an increased need for security has pushed the use of AES so much that we end up using it all over the place.

  18. ajny

    The price of the masks and the rest of the tool flow means this is a corporate endeavor. The marginal price of writing original software is only opportunity cost.

  19. Poncey McPonceface

    Nice one El Reg, made the front page of Hacker News!

    More discussion of the software/hardware divide over yonder.

    https://news.ycombinator.com/item?id=12555500

  20. Colin Tree

    Chuck Moore's Law

    They've discovered Chuck Moore's Law

    Less is More

    Software has layers of abstraction, so it doesn't matter what hardware you run an application on. Programmers want to focus on the high level task and not on the nuts and bolts.

    How wrong that paradigm is, and I hope it goes away.

    1. roger stillick
      Linux

      Re: Chuck Moore's Law

      Unfortunately the paradigm is exactly right, and it also applies to parts count on hardware..

      IMHO=less is always more if it actually works.. RS.

  21. roger stillick
    Linux

    Fast Internals Useless if it Can't TALK

    AT&T's old single PowerPC chip had communication channels internal to the chip to allow anything the processor did to be IO'd externally.. IBM's current Power 8 chip has 8 processors inside, wrapped around a non-blocking on-chip switching network connected to a wideband IO MXR..

    Haswell and newer Intel chips have the processors; alas, they have a blocking communications system forcing space/time/space MXing for the IO stuff.. like limited-instruction-set boxes, they might appear to be faster, but their data-crunching throughput is really not much faster..

    Hypervisor software and divided data streams allow these Intel chips to scream.. however the IBM Power 8 architecture CPUs run simply as fast w/o special sauce software to make them work faster, or at all (sort of what this article implies= Hardware + Special sauce gives Moore's Law traction).. Happy C-64 ?? (still faster than my Haswell laptop)..

    IMHO= a non-blocking network is needed on chip to take advantage of the many cores on a chip..RS.

  22. martinusher Silver badge

    The wheel has finally arrived

    When the PC turned up it was indeed neat to have a computer that was small and cheap enough to own. The price we paid for it was a return to a hardware architecture that was twenty years out of date. Increases in hardware performance masked this giant step backwards; we got used to code bloat being managed by plummeting memory prices and blazing fast processors. (You'd be amazed at just how fast a generic Intel processor is when it's not encumbered by the software we normally run on it.)

    We're finally moving into a world where we wanted to be in the 1980s, and it's possible because -- finally -- hardware and tools don't require multi-million dollar investments to build anything; the blocks are cheap, the tools are cheap and the techniques are well understood.

    It's unfortunate that our software technology is still pretty crude -- in fact modern applications programming looks awfully like "chuck a load of mud at the wall and see what sticks". This might be a practical solution to getting the job done with the available resources, but the size of modern programs for their functionality is embarrassing. (...and no, memory isn't cheap -- the parts are, but the time taken to load and unload the stuff mounts up)
