????
Didn't I read this yesterday? Or has someone just reconfigured the Matrix . . .
HPE's Machine research project has debuted an ARM-powered, 160TB monster memory system prototype, calling it the world's largest single-memory computer. Back in November, HPE built a proof-of-concept prototype of The Machine and said it no longer intended to sell an actual product version but would integrate its component …
HPE's Machine research project has debuted an ARM-powered, 160TB monster memory system prototype, calling it the world's largest single-memory computer.
And I call it bullshit. This is a cluster with an interconnect supporting RDMA access - a standard feature of every recent HPC interconnect I can think of. For example, by popping over to https://www.top500.org/statistics/sublist/ and searching for Architecture=Cluster (so that we don't get exotic accelerator-based solutions, which may not have uniform RDMA access to all memory), we get:
#1 = Tianhe-2 (MilkyWay-2), with 1,024,000 GB RAM. The interconnect is listed as "TH Express-2"; by popping over to http://dl.acm.org/citation.cfm?id=2632731, we see that it indeed supports RDMA.
Continuing down the list, we have:
#4 = K computer, with 1,410,048 GB and the Tofu interconnect. A quick search on Tofu gives us http://ieeexplore.ieee.org/document/6041538/, which confirms that Tofu supports RDMA as well.
We can continue to numbers 10 and 13 (both featuring FDR InfiniBand, with RDMA access and more than 160TB of RAM), and further down the list, but I think the point is made.
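For anyone who hasn't met one-sided transfers before, here is a minimal sketch (mine, not HPE's or the article's) of what RDMA-style access to another node's memory looks like from the programmer's side, using standard MPI-3 RMA calls that MPI libraries map onto the fabric's RDMA engine on InfiniBand, Tofu, TH Express-2 and friends. The ranks, buffer size and two-node setup are made up for illustration.

/* Build: mpicc rma_read.c -o rma_read    Run: mpirun -np 2 ./rma_read */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Every rank exposes a window of 1024 doubles to the whole job. */
    enum { N = 1024 };
    double *local;
    MPI_Win win;
    MPI_Win_allocate(N * sizeof(double), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &local, &win);
    for (int i = 0; i < N; i++)
        local[i] = rank;                 /* something recognisable */
    MPI_Barrier(MPI_COMM_WORLD);         /* all windows are now filled */

    /* Rank 0 pulls rank 1's buffer directly; rank 1's CPU is not involved. */
    if (rank == 0 && nranks > 1) {
        double remote[N];
        MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
        MPI_Get(remote, N, MPI_DOUBLE,
                1 /* target rank */, 0 /* displacement */,
                N, MPI_DOUBLE, win);
        MPI_Win_unlock(1, win);          /* transfer is complete here */
        printf("first value fetched from rank 1: %g\n", remote[0]);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

The point being that rank 1 never executes anything for the transfer - the interconnect does the read.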
Unfortunately, even before HP fissioned into its present form, it had successfully managed to rid itself of any real R&D capability. All that is left is the ability to rebadge hardware made by others and to bullshit about the next breakthrough coming real soon now.
Errr... remember that HPE recently merged with SGI, so they have access to the SGI Numalink interconnect, which is used to construct really, really large proper NUMA systems.
It looks to me like HPE have used the Numalink chips on ARM boards.
I'm speaking as someone who has managed Origin 200, Origin 2000, Altix 4000 and Ultraviolet systems over the years, so I might just have a clue here.
... SGI Numalink interconnect, which is used to construct really, really large proper NUMA systems ...
Numalink is indeed a very impressive technology. Unfortunately, its global memory coherence magic comes at a substantial cost: you need to maintain memory directories and to devote a potentially significant fraction of your fabric to the cache-coherence traffic.
In the end, the speed of light still gets you: once the ratio of local to remote memory latency is large enough, you frequently have to rewrite your code in terms of message-passing anyway - and for that, the benefits of globally coherent memory over RDMA are marginal, and for large configurations possibly negative.
At least for some large O2k and early UV systems I've seen [SGI UV moved outside of my price range some time ago], it was not unusual to fire up the whole system once for the benchmark/bragging rights, and then separate it into a cluster of smaller systems (each with better worst-case latency and bisection bandwidth) for production.
There are some very real physical reasons why we have developed storage-level hierarchies; a uniformly-addressable flat address space is not coming back, as appealing as it may sound in theory.
But until you have memristor / crossbar technology at scale... an all-memory computer like this would still crush your Hadoop cluster, even one running a poor man's HPC setup.
The point is that you can keep your relevant data sets in 'memory' and still perform number crunching using them.
To be fair... I don't see a lot of use cases where this is really needed.
All about the number of bits and hardware like this.
The current 64-bit Linux kernel is limited to 64TB of physical RAM and 128TB of virtual address space (see the RHEL limits and the Debian port pages). Current x86_64 CPUs (i.e. what we have in the PC) have a virtual address limit of 2^48 = 256TB, because only 48 of the 64 address bits are actually used (the upper bits of a page-table entry carry flags like ReadOnly, Writable, ExecuteDisable, PagedToDisc etc.), but the specification leaves room for a true 64-bit address mode reaching the maximum of 2^64 = 16EB (exabytes).

However, the motherboard and the CPU die do not have enough pins to deliver even all 48 bits of the memory address to the RAM chips over the address bus, so the limit for physical RAM is lower still (and depends on the manufacturer). The virtual address space, by its nature, can be much larger than the amount of RAM you could ever fit on the motherboard, up to the virtual memory limit mentioned above.
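To put those powers of two side by side, a back-of-the-envelope sketch - my own arithmetic, nothing vendor-specific (build with gcc addr.c -lm):

/* Back-of-the-envelope check of the figures above (illustration only). */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double TB = ldexp(1.0, 40);    /* 2^40 bytes = 1 TiB */
    const double EB = ldexp(1.0, 60);    /* 2^60 bytes = 1 EiB */

    printf("2^48 bytes = %.0f TB of virtual address space\n",
           ldexp(1.0, 48) / TB);                      /* 256 TB */
    printf("2^64 bytes = %.0f EB, the architectural ceiling\n",
           ldexp(1.0, 64) / EB);                      /* 16 EB  */

    /* The Linux x86_64 limits quoted above are 64 TB physical and
     * 128 TB virtual - note that HPE's 160 TB already exceeds both. */
    return 0;
}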
They arrive at the 160TB (40 × 4TB) like this:
Each of the 40 nodes consists of two connected boards: a Fabric Attached Memory Board and a compute board. Each Fabric-Attached Memory board consists of four Memory Fabric Controllers, with 4TB of memory per node, and Fabric-Attached Memory. Each compute board consists of a node processor System-on-a-Chip, almost three terabytes per second of aggregate bandwidth, and a local fabric switch.
https://community.hpe.com/t5/Behind-the-scenes-Labs/Making-the-mission-to-Mars-compute/ba-p/6964700#.WRsY4GgrJPY
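So the headline number is just nodes times memory-per-node; a trivial sanity check, assuming the 4TB and the four controllers quoted above are per node:

/* The headline figure is just nodes x memory-per-node (illustration only). */
#include <stdio.h>

int main(void)
{
    const int nodes = 40;                /* forty two-board nodes            */
    const int tb_per_node = 4;           /* 4 TB fabric-attached memory each */
    const int controllers_per_node = 4;  /* four Memory Fabric Controllers   */

    printf("total memory:       %d TB\n", nodes * tb_per_node);         /* 160 */
    printf("fabric controllers: %d\n",   nodes * controllers_per_node); /* 160 */
    return 0;
}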
... However, the motherboard and the CPU die do not have enough pins to deliver even all 48 bits of the memory address to the RAM chips over the address bus ...
It is getting pretty close, though - according to the LGA 2011 datasheet, the current generation of E7 Xeons supports up to 46 bits of physical address space (which, somewhat unsurprisingly, matches the 64TB Linux limit you mention above). The previous-generation socket (LGA 1366) supported "only" 40 physical address bits, so one should expect the next Xeon socket update to increase the physical address space again, possibly beyond 48 bits.
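If you want to see what your own CPU implements, Linux reports both widths in /proc/cpuinfo; a quick sketch (a one-line grep -m1 'address sizes' /proc/cpuinfo does the same job):

/* Print the address widths the CPU actually implements. Typical output on
 * a current Xeon: "address sizes : 46 bits physical, 48 bits virtual". */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) { perror("/proc/cpuinfo"); return 1; }

    char line[256];
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "address sizes", 13) == 0) {
            fputs(line, stdout);
            break;                       /* the line repeats per core */
        }
    }
    fclose(f);
    return 0;
}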
HPE CEO Meg Whitman went all misty-eyed on Big Data analytics. "The secrets to the next great scientific breakthrough, industry-changing innovation, or life-altering technology hide in plain sight behind the mountains of data we create every day. To realize this promise, we can't rely on the technologies of the past, we need a computer built for the Big Data era."
42.
HPE CEO Meg Whitman went all misty-eyed on Big Data analytics. "The secrets to the next great scientific breakthrough ... hide in plain sight behind the mountains of data we create every day.
From my limited first-hand experience with scientific breakthroughs, this statement appears to be false. Most, though obviously not all, scientific developments come from two things coming into conjunction: new ideas, born in the brains of scientists, and new technological capabilities - things like lasers, magnets, centrifuges, vacuum pumps, gene sequencers, rockets, reactors, submarines, high-pressure anvil cells, clocks, balances, and, yes, sometimes computers.
At the same time, computers are often the biggest obstacle to scientific development: as often as not, they encourage scientists to be lazy and sloppy in their thinking and planning, using big computers and big data to make up for a deficit of understanding. A good fraction of the computational studies crossing my desk evaluate, at great expense, things which a couple of hours of thinking would show to be either zero or outright meaningless. Not having a honking big computer would have been an advantage in those cases ...