Hot iron: Knights Landing hits 100 gigaflops in plasma physics benchmark

Russian researchers working with an Intel supercomputer have put its Knights Landing hardware through its paces, and are pleased with what they've found. The boffins reckon the many-cored hot rod more than doubled their plasma simulation performance with nothing more than a simple recompile. Here's what they had to say at …

  1. Cuddles

    "The researchers explain that they chose a particular plasma simulation, “particle in cell” because it's a well-studied problem on many-core architectures."

    No they don't. Particle-in-cell isn't a problem at all, let alone a well-studied one; it's a category of solver. Just look at the very first line of the abstract:

    "particle-in-cell laser-plasma simulation is an important area of computational physics"

    There's a huge difference between a specific well-studied problem and a broad technique that is used to investigate both well-studied and never-before-studied problems.

  2. Anonymous Coward

    So as ever...

    ...decent code and design makes things run better.

  3. Anonymous Coward

    NASA has run out of names.

    "NASA's Endeavor supercomputer." Endeavour was a space shuttle.

    1. Anonymous Coward

      Re: NASA has run out of names.

      Hmm... I wonder if there was an Endeavour before that too?

  4. Mike Shepherd

    Plasma simulation

    So, finally, free electricity from fusion really is "just around the corner"? Hello, hello? Operator...?

  5. Tom 7

    100GFLops at what power?

    The Adapteva Parallella 64-core jobbie can do 102GFlops for 2W

    1. Mark Honman

      Re: 100GFLops at what power?

      Tom, sorry to say that those Adapteva numbers are "guaranteed never to exceed" ones; in this case even more so than usual, because Adapteva didn't get enough good 64-core Epiphany-IV chips to fulfil the Kickstarter orders.

      102GFlops corresponds to all cores doing solely fused multiply-add operations, and ignores the problem of where the data is coming from (i.e. nothing like any real application or even a benchmark). On Parallella the DDR is attached to a Zynq ARM+FPGA hybrid meaning about 300MB/s maximum RAM bandwidth, and the Zynq uses about as much power as the Epiphany.

      IIRC the 2W figure is for the 16-core Epiphany-III - but it is still a good GFlops/W figure.

      But Kudos to Adapteva for trying - I had high hopes for the Parallella when it came out, but to my surprise it led me into the world of FPGAs.
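The gap between the quoted peak and what the memory system can feed can be put into rough numbers. The sketch below is a simple roofline-style bound that uses only the two figures quoted in the reply above (102 GFlops peak, about 300MB/s RAM bandwidth). The function names and the sample intensities are hypothetical, and the model ignores on-chip memory and caches, so treat it as an illustration rather than a measurement.

```python
# Roofline-style upper bound, built only from the figures quoted above.
PEAK_GFLOPS = 102.0        # quoted peak: every core doing nothing but FMAs
DRAM_BANDWIDTH_GBS = 0.3   # "about 300MB/s" expressed in GB/s

def attainable_gflops(flops_per_byte, peak=PEAK_GFLOPS, bandwidth=DRAM_BANDWIDTH_GBS):
    """Upper bound on throughput for a kernel that performs
    `flops_per_byte` floating-point operations per byte fetched from RAM."""
    return min(peak, bandwidth * flops_per_byte)

def break_even_intensity(peak=PEAK_GFLOPS, bandwidth=DRAM_BANDWIDTH_GBS):
    """Flops per byte a kernel needs before the peak, rather than the
    RAM bandwidth, becomes the limit."""
    return peak / bandwidth

if __name__ == "__main__":
    # Sample intensities are illustrative assumptions, not benchmark data.
    for intensity in (1.0, 10.0, 100.0, 1000.0):
        bound = attainable_gflops(intensity)
        print(f"{intensity:7.1f} flops/byte -> at most {bound:6.1f} GFlops")
    print(f"break-even: about {break_even_intensity():.0f} flops/byte")
```

Under these assumptions a kernel has to do several hundred floating-point operations per byte of RAM traffic before the headline figure is even theoretically reachable, which is the point being made about "guaranteed never to exceed" numbers.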

  6. JustNiz

    100GFLops? Meh. The new Pascal-based Nvidia Titan X is doing 11 teraflops.

    1. CarbonLifeForm

      But you have to refactor your code before you have a prayer of achieving those numbers, which may be months of work.

      I'm not an Intel fanboy. But my experience has been that making applications run faster is rarely about making one piece of them go blazing fast.

      Most applications I've ever dealt with have been bandwidth limited, and the bandwidth limitation (once you push it out of the CPU or GPU) ends up in the memory, and once you push it out of the memory, it ends up in the mass storage.

      So you $pend untold hours refactoring everything to make it run on the GPU because you're seeing this tenfold increase in kernel performance, and then your application is only 15% faster because the real problem is the connection between the supercomputer and mass storage.

      Then you use one of these chips from Intel and maybe you only get 7% better, but you barely had to do any work in comparison, other than making sure that you're vectorizing and stuff by looking at what the silly compiler is telling you.

      And you notice the problem with the mass storage earlier and address it anyway!
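The "tenfold kernel, 15% application" scenario above is what Amdahl's law predicts when the accelerated kernel is only a small share of total run time. The sketch below works the arithmetic for those two figures. The function names are hypothetical, and it assumes everything outside the kernel (memory, storage, I/O) runs at exactly the same speed as before.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: whole-application speedup when `fraction` of the
    original run time is accelerated by a factor of `kernel_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

def accelerated_fraction(overall, kernel_speedup):
    """Invert Amdahl's law: the share of the original run time the kernel
    must have occupied to produce the observed overall speedup."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel_speedup)

if __name__ == "__main__":
    # Figures from the comment: 10x faster kernel, application 15% faster.
    share = accelerated_fraction(overall=1.15, kernel_speedup=10.0)
    print(f"kernel share of original run time: {share:.1%}")
    # Even an infinitely fast kernel cannot beat 1 / (1 - share).
    print(f"ceiling with an infinitely fast kernel: {1.0 / (1.0 - share):.2f}x")
```

Read this way, the kernel accounted for roughly a seventh of the original run time, so no amount of further GPU tuning would move the application much. The remaining time is where the bandwidth and storage work pays off.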
