Hopping the flash stepping stones to DIMM future

Up until very recently the main thrust of data access and storage technology development was to make the media faster, hence the move from disk to flash, and flash technology developments to increase capacity. But the opportunity to make further advances is closing and attention is turning to removing the final obstacle to …

  1. Pascal Monett Silver badge

    "it will be here in a flash"

    I can believe that; unlike any claims about carbon-nanotube whatevers, which have been trumpeted for the past decade at least with still nothing on the horizon.

    I'm sure that there are just as many billions involved and the potential is just as enormous, but somehow the entire storage industry seems to have a rocket strapped to its R&D, whereas energy is puttering along in a golf cart.

    1. Aghios Vasilis tou Stalingrad

      Re: "it will be here in a flash"

  2. Dave Pickles

    So the NV storage gets mapped directly into the processor's address space, and DRAM becomes another layer of cache. Neat, although we'll eventually need 128-bit addressing on the CPU to keep up with the storage.
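    The idea of storage appearing directly in the address space can be sketched with Python's `mmap` on an ordinary file (a stand-in only: a real DAX setup would map a device like `/dev/dax0.0` or a file on a DAX-mounted filesystem, and the file name here is invented for illustration):

```python
# Minimal sketch of byte-addressable, memory-mapped storage.
# An ordinary file stands in for the persistent-memory device.
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pmem.img")
with open(path, "wb") as f:
    f.truncate(4096)  # carve out one page of pretend persistent memory

with open(path, "r+b") as f:
    m = mmap.mmap(f.fileno(), 4096)  # the storage now lives in the address space
    m[0:5] = b"hello"                # a plain store -- no read()/write() syscalls
    m.flush()                        # analogous to flushing CPU caches to media
    m.close()

with open(path, "rb") as f:
    assert f.read(5) == b"hello"     # the bytes landed on the backing store
```

    The point of the sketch is the middle block: once mapped, access is load/store, and it is the caching layers above the media (here the page cache, in the NVDIMM world the DRAM) that need explicit flushing.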

    1. Stoke the atom furnaces

      Move from 64 to 128bit addressing?

      I would say that 16 exbibytes of memory (2^64 bytes) is enough for anybody :-)

      1. Steve Chalmers

        64 bits isn't close to enough

        Today's high capacity SSD is about 16TB (2^44 bytes). Assuming perfect, dense use of address space with no set asides, a million of those drives would completely fill that address space. For a single system, again assuming perfect dense use of address space, I tend to agree that 64 bits will last a long time.

        However, for shared memory semantic storage, it is not unreasonable to consider a modern Google-class data center as having 100,000 servers and at least 200,000 disk drives. That means in a shared environment it is entirely possible to exhaust a 2^64 address space in a data center built by 2020. I haven't done the analysis myself, but would tend to point to at least a factor of 2^10 more (i.e. a 74-bit-ish address space) in any chip or fabric design intended for the 2020s.
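        The back-of-envelope arithmetic behind those figures can be checked in a few lines of Python (the drive capacity and drive count are the ones assumed in the comment above):

```python
import math

# Figures assumed above: 16 TB (~2^44 bytes) per SSD, 200,000 drives per data center.
BYTES_PER_DRIVE = 2**44
DRIVES = 200_000

total_bytes = BYTES_PER_DRIVE * DRIVES
bits_needed_dense = math.ceil(math.log2(total_bytes))  # perfectly dense packing

# Real address maps are sparse (set-asides, alignment, growth headroom);
# adding the suggested 2^10 margin lands in the low 70s of address bits.
bits_with_margin = bits_needed_dense + 10

print(bits_needed_dense)  # 62
print(bits_with_margin)   # 72
```

        So densely packed, today's figures sit just under 64 bits; it is the sparse, shared, growing nature of a fabric-wide address map that pushes the requirement into 70-odd bits.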

        Note that for fault containment reasons I doubt the entire storage fabric space will be mapped into any one CPU's address space: this address is not the size of a pointer or index register used by applications, it's the size of the address coming out of the MMU or similar mapping table between the CPU and the memory semantic fabric. (Some supercomputer app will prove me wrong here, but I still think there's no rush to re-engineer CPUs the way we did from 32 to 64 bit pointers some decades ago.)


  3. Anonymous Coward

    Great article

    Being so far removed from the tech side of things these days, this was an enjoyable and informative article, thanks!

  4. Steve Chalmers

    Control plane for shared (networked) DAX storage?

    The limit, as the stack approaches zero instructions executed per persistent read or write, when the byte-addressable persistent memory is shared by many applications running on many servers, is that we will have what network people would call a "control plane" spanning server, memory-semantic network, and storage system.

    The "control plane" (the drivers) would set up protection and mapping tables to give specific (user space) processes read and/or write access to specific regions of SCM in the storage system. (The "data plane" as a network person would call it is thus read and write operations being performed directly by the application (more likely its libraries) to the SCM itself, with no intermediaries executing lines of code -- not in the server, not in the network interface, not in the switch, and not in the storage system.)

    There is an example of how such tables would work in the Gen-Z spec, but I would expect an industrywide control plane to work equally well with PCIe based fabrics, as well as memory windowing and its descendants on InfiniBand and Omni-Path, and other similar technologies reused from supercomputer fabrics.

    So who's working on a piece of code that can be the start of this control plane for storage? Hint: if done right, very efficient container storage falls out: storage access permissions aren't controlled per LUN, per server as we did in Fibre Channel; they're controlled by memory range (more likely page table) and process, as is sharing of memory between processes within a single server. What we do not need as an industry is 100 venture-funded startups each coming up with their own proprietary way to do this...


  5. Stoke the atom furnaces

    Thanks for the Memory (article)

    Great article!

    I guess at some point larger on-chip CPU SRAM caches and low-latency, high-bandwidth, Flash-based non-volatile storage will combine to make DRAM redundant for many low-end systems.

    I still wonder though why the DIMMs fitted to current high end machines have not yet dumped DRAM in favour of SRAM. For applications that are not so sensitive to price and power consumption (a high end gaming rig for example) the lower latency of the SRAM would give a useful performance boost.

    1. Steve Chalmers

      Re: Thanks for the Memory (article)

      The SRAM is now on the CPU die, as L3 cache. The path from the CPU's memory pipeline, out the pins to the DRAMs and back has been so hyper-optimized around the way DRAMs work that the latency savings from using SRAM instead of DRAM at this point would be insignificant. That was absolutely not the case when I designed with SRAMs and DRAMs in the 1980s. Oh, and the static power consumption per bit in real SRAM designs of that era would simply melt a chip at today's densities, so it's not exactly SRAM...

      It will be interesting to see what happens over the next decade as the various storage class memories emerge, first as storage, and ultimately (possibly) to displace DRAM. That will require a change in the interface between CPU and memory -- Gen-Z (which I worked on) is an example of a different interface. It will be interesting to see if latency-optimized (rather than density-optimized) memory devices using some emerging SCM technology, combined with a new interface (think hybrid memory cube stacked on the CPU die itself, with aggressive cooling) accomplish what you have in mind in, say, 2020 or 2023.

  6. Rik Myslewski


    Chris, thanks for an excellent, readable, thorough, and eminently understandable article. Having left El Reg over two years ago, and having since devoted all my tech energies towards climate-change science rather than computing, it was pure pleasure to catch up on what many of us in the tech world have seen for decades as the Holy Grail — well, one Grail, at least ("He's already got one! It's very nice ...") — of HPC: the eventual merging of mass storage and direct CPU-addressable memory, preferably in multi-server fabrics (is that still a reasonable bit of descriptive prose? [I'm frightfully out of the loop ...]).

    Two quick questions, though: you mention phase-change memory — is that still undead? And how's Crossbar and their ReRAM doing, financially?

  7. Tom 64

    Nice round up, thanks

    This NVDIMM enabled brave new world looks great, and I have but one question:

    - Will I be able to run Crysis on it?

    ... Mine's the one with the cloaking device.
