AMD, Nvidia, HPE tapped to triple the speed of US weather super with $35m upgrade

HPE will upgrade the US National Center for Atmospheric Research's (NCAR) supercomputer using AMD and Nvidia's latest CPUs and GPUs, creating a machine roughly three times as powerful as its current Intel-based beast. That system running today, code-named Cheyenne, is four years old, and now its government-funded lab wants a …

  1. MrReynolds2U

    Any experts in the house?

    "It’ll have a total RAM capacity of a whopping 692TB. The connectivity between the nodes is powered by HPE’s Slingshot interconnect architecture, boasting a bandwidth of 200GB per second."

    I'm not great at bandwidth <=> storage relationships but does the bandwidth sound like a limiting factor to those in the know? Obviously it depends upon unit of work though.

    1. diodesign (Written by Reg staff) Silver badge

      Bandwidth

      Er, yeah, that 200Gb/s figure is in no way the entire internal bandwidth of the system. It's basically the base speed per port.

      As the linked-to article, this paper [PDF] and HPE's own bumf all say, that's the link speed of the interconnect. You can have, eg, a 64-port switch with each port doing 200Gb/s per direction (four lanes of 50Gb/s).

      NCAR also said the "HPE Slingshot bandwidth is 200 Gb/sec per port per direction."

      C.
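To put the per-port figure in context, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this thread (four lanes of 50Gb/s, the example 64-port switch, 692TB of RAM). The function names are made up for illustration, and the transfer time is idealised: decimal units, no protocol overhead, no contention. It is not a model of how the real machine moves data.

```python
# Figures quoted in the thread; the helper names are illustrative only.
LANES_PER_PORT = 4
GBIT_PER_LANE = 50       # Gb/s per lane, per direction
PORTS_PER_SWITCH = 64    # the example switch size from the comment
TOTAL_RAM_TB = 692       # total RAM quoted from the article


def port_gbit_per_direction(lanes=LANES_PER_PORT, gbit_per_lane=GBIT_PER_LANE):
    """Link speed of one port in one direction, in Gb/s."""
    return lanes * gbit_per_lane


def switch_gbit_per_direction(ports=PORTS_PER_SWITCH):
    """Sum of all port speeds on one switch, in Gb/s per direction."""
    return ports * port_gbit_per_direction()


def seconds_to_move(terabytes, gbit_per_s):
    """Idealised time to push `terabytes` through a `gbit_per_s` pipe."""
    bits = terabytes * 1e12 * 8
    return bits / (gbit_per_s * 1e9)


print(port_gbit_per_direction())      # Gb/s for one port
print(switch_gbit_per_direction())    # Gb/s summed over a 64-port switch
print(seconds_to_move(TOTAL_RAM_TB, port_gbit_per_direction()))
```

Under those assumptions a single port would need roughly 7.7 hours to stream all 692TB, which is why the per-port number says little on its own: the aggregate across many ports and switches is what matters.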

      1. MrReynolds2U

        Re: Bandwidth

        Thanks @diodesign, I'll take a look

  2. Sceptic Tank

    Start up the Corona

    I propose calling it the Corona Crysis .... that is to say if it can run Crysis.

    1. Anonymous Coward

      Re: Start up the Corona

      Mention Doom 3 is dark too, that's another really up-to-the-minute, bleeding edge gamer joke

  3. This post has been deleted by its author

  4. Julian 8

    Just mine a few bitcoins to pay for itself

  5. MGyrFalcon

    Kernel architecture

    I'm not up on supercomputer architecture, so I have a question that's been tickling the back of my mind for a while. Do setups like this have a monolithic NUMA kernel, or are they set up more like a Beowulf cluster with an orchestration node passing out tasks to all of the compute nodes? Seems to me both would have their advantages.

    1. Ima Ballsy

      Re: Kernel architecture

      As someone who works in those environments, here are some primers:

      https://www.netapp.com/data-storage/high-performance-computing/what-is-hpc/

      https://www.weka.io/learn/what-is-an-hpc-cluster/

      Hope this helps

      1. MGyrFalcon

        Re: Kernel architecture

        Thanks, that does help. They are Beowulf-ish
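The "orchestration node passing out tasks" idea from this sub-thread can be sketched in a few lines of Python. This is a toy, single-machine analogy only: threads stand in for compute nodes, and `compute_node`, `split` and `run_job` are hypothetical names. Real clusters typically run a separate OS image per node and exchange data with a message-passing library such as MPI under a batch scheduler, none of which is modelled here.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node(chunk):
    """Stand-in for one compute node: it works only on the chunk it was sent."""
    return sum(x * x for x in chunk)


def split(data, n_nodes):
    """Divide the work into at most n_nodes contiguous chunks."""
    size = max(1, -(-len(data) // n_nodes))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_job(data, n_nodes=4):
    """Coordinator: scatter chunks, let nodes work independently, gather results."""
    chunks = split(data, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)


print(run_job(list(range(10)), n_nodes=3))  # sum of squares of 0..9
```

The point of the shape is that each worker receives its data explicitly and returns a partial result, rather than all workers reading one shared address space as they would under a single NUMA kernel.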

  6. Chris the bean counter

    Obsolete before delivery?

    Sounds like they got lazy and bought it to be able to run their current CPU orientated software.

    May have been better spending some of the money on software updates and buying a computer with more GPU and a lot less CPU.

    (Although as an AMD shareholder I am delighted at their choice).

    1. A random security guy

      Re: Obsolete before delivery?

      In theory, I agree. However, software rewrites take more time. This could be the one where lazy gets you a 3.5x speed increase this time. And then the rewrite gives you a bit more. And in 2-3 years they will order the next system where they will have many more GPUs ...

      Wouldn’t be surprised if they have already given the specs to the teams working on it. And Intel is desperately trying to get into the game.

      Live competition.

    2. hoola Silver badge

      Re: Obsolete before delivery?

      The GPU and CPU workloads will be different. They may have a mix of both, and going all out on GPUs would then compromise the basic compute too much.

      Cool things like shared memory so you can stop and change parameters on a model part way through a cycle without having to reload the data set will make it more productive.

  7. Anonymous Coward

    Windows 10 ?

    Can one assume it won't be running Windows 10?

    1. Anonymous Coward

      Re: Windows 10 ?

      It's a government system - so Windows XP!

  8. Nate Amsden Silver badge

    Same network speed as current system?

    This article says the network links are 200 Gigabits on the new system.

    However apparently their current system has 25 GB/s:

    https://www2.cisl.ucar.edu/resources/computational-systems/cheyenne

    "Partial 9D Enhanced Hypercube single-plane interconnect topology Bandwidth: 25 GBps bidirectional per link"

    25 Gigabytes * 8 = 200 Gigabits.

    Would have thought things would be faster on the newer system. That bidirectional statement may imply the current system is just 100Gbit in each direction and the new system is perhaps 200Gbit in each direction? That would be a good boost.
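The unit juggling in this comment is easy to get wrong, so here it is spelled out as a small Python sketch. It assumes, as the commenter only speculates, that "25 GBps bidirectional" means the sum of both directions; the helper names are invented for illustration.

```python
def gbytes_to_gbits(gbytes_per_s):
    """Convert gigabytes per second to gigabits per second (8 bits per byte)."""
    return gbytes_per_s * 8


def per_direction(bidirectional_total):
    """Split a bidirectional total evenly into one direction (an assumption)."""
    return bidirectional_total / 2


# Current system: 25 GBps bidirectional per link, as quoted from the NCAR page.
cheyenne_total_gbit = gbytes_to_gbits(25)
cheyenne_each_way_gbit = per_direction(cheyenne_total_gbit)

# New system: 200 Gb/s per port per direction, as quoted from NCAR in the thread.
new_each_way_gbit = 200

print(cheyenne_total_gbit)                         # Gb/s, both directions combined
print(cheyenne_each_way_gbit)                      # Gb/s, one direction
print(new_each_way_gbit / cheyenne_each_way_gbit)  # per-direction speed-up
```

If that reading of "bidirectional" is right, the per-link speed doubles rather than staying the same.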
