Samsung testing memory with built-in processing for AI-centric servers

Samsung has advanced its plans to relieve devices of the tedious chore that is moving data out of memory and into a processor – by putting more processing power into memory. It's already running in servers and should become a standard of sorts next year. The Korean giant's efforts use its very fast Aquabolt high-bandwidth …

  1. Julz Silver badge

    What

    Goes around, comes around. CAFS, meet AXDIMM, your spiritual successor. Not that CAFS was the first, nor unique. These 'innovations' are the inevitable consequence of the gap between the speed of fast processing and the slow speed of memory access. Not sure what would happen if memory was way faster than processing.

    https://www.cdpa.co.uk/CAFS/

    1. Warm Braw Silver badge

      Re: What

      Not sure what would happen if memory was way faster than processing

      In a way, that was the case with CAFS: the storage (in this case the disk and its controller rather than RAM and its controller) was faster than the CPU, which is why the task could be offloaded.

      In principle, you can address the "faster than" and "slower than" mismatch simply by having more processors. If the memory is sufficiently fast, you can attach multiple processors to the same bus. If it's sufficiently slow, you can couple chunks of it up to equally slow individual processors that then communicate between themselves to share the results of their parallel computations.

      So why did we end up with fast processors with increasingly complex cache hierarchies to optimise their use of slow memory? It's partly because the kind of tasks we habitually give to CPUs demand single-thread performance and partly because wiring up systems with lots of processors with their own memory is more expensive than dumping some cache on a CPU die.

      Now that we have more parallelisable workloads, single-thread performance isn't necessarily such a benefit. Integrating processing capability into the memory solves the wiring problem.
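
      A minimal sketch of that arrangement in plain C, with threads standing in for memory-coupled processors (the names and layout are my own invention, nothing to do with any vendor's actual interface): each worker owns one chunk of memory, computes entirely locally, and only the small per-chunk result travels back to be combined.

        /* Sketch: each worker owns a chunk and computes locally; only
           the tiny partial results cross between workers. Build with
           cc -pthread. All names here are illustrative. */
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define CHUNKS 4
        #define CHUNK_LEN (1u << 20)

        typedef struct {
            const double *chunk;  /* this worker's private slice of memory */
            double partial;       /* small result sent back, not the data */
        } Worker;

        static void *sum_chunk(void *arg) {
            Worker *w = arg;
            double s = 0.0;
            for (size_t i = 0; i < CHUNK_LEN; i++)
                s += w->chunk[i]; /* all heavy traffic stays within the chunk */
            w->partial = s;
            return NULL;
        }

        int main(void) {
            double *data = malloc(sizeof(double) * CHUNKS * CHUNK_LEN);
            if (!data) return 1;
            for (size_t i = 0; i < (size_t)CHUNKS * CHUNK_LEN; i++)
                data[i] = 1.0;

            pthread_t tid[CHUNKS];
            Worker w[CHUNKS];
            for (int c = 0; c < CHUNKS; c++) {
                w[c].chunk = data + (size_t)c * CHUNK_LEN;
                pthread_create(&tid[c], NULL, sum_chunk, &w[c]);
            }

            double total = 0.0;
            for (int c = 0; c < CHUNKS; c++) { /* combine the cheap partials */
                pthread_join(tid[c], NULL);
                total += w[c].partial;
            }
            printf("total = %.0f\n", total);
            free(data);
            return 0;
        }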

      1. katrinab Silver badge
        Paris Hilton

        Re: What

        Why not make the processor cache bigger? Say, tens or hundreds of GB, so you can have everything stored there. Maybe too expensive for regular computers at the moment, but I guess not so much of an issue for High Performance Computing workloads.

        1. Julz Silver badge

          Re: What

          There is always a balance involving such things as cache size, number of cache lines, cache coherency overhead, physical or virtual addressing, sharing caches, moving between caches, cache flush overhead, the cost of a cache miss, etc. Bigger is not always better ;)
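
          One back-of-the-envelope way to see the trade-off is the standard average-memory-access-time formula, AMAT = hit time + miss rate × miss penalty. With hypothetical (made-up) latency and miss-rate numbers, a huge cache with slower hits can lose to a small fast one:

            /* AMAT = hit_time + miss_rate * miss_penalty.
               The cycle counts and miss rates below are invented for
               illustration only. */
            #include <stdio.h>

            int main(void) {
                /* small, fast cache: 4-cycle hit, 5% misses, 200-cycle DRAM */
                double small_amat = 4.0 + 0.05 * 200.0;  /* 14 cycles */
                /* huge, slower cache: 30-cycle hit, 1% misses, same DRAM */
                double huge_amat  = 30.0 + 0.01 * 200.0; /* 32 cycles */
                printf("small cache AMAT: %.1f cycles\n", small_amat);
                printf("huge cache  AMAT: %.1f cycles\n", huge_amat);
                return 0;
            }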

  2. AMBxx Silver badge
    Childcatcher

    New problem

    New buffer overflow bug approaching.

  3. Binraider Silver badge

    It's not so far removed a concept from a Beowulf-like cluster.

    Moving processing to RAM makes some degree of sense to reduce travel time, although for processes where the required data exists on more than one RAM chip, the limiting factor will be the interconnection between elements. The potential for data races could be high in a decentralised processing model (see the toy sketch at the end of this comment).

    In something like this, does the role of the "traditional CPU" become one of task scheduler?

    Interesting to see what Samsung does with this, though the PC industry being what it is, I'd much rather see a standards committee along the lines of PCI-SIG create the framework.
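
    On the data-race point, a toy illustration in C (the setup is invented purely to show the hazard): two unsynchronised workers, standing in for memory-side processors, update a shared counter, and increments get lost.

      /* Two workers hammer a shared counter with no synchronisation.
         The increment is a read-modify-write, so updates collide and
         the final count usually falls short of 2,000,000.
         Build with cc -pthread. */
      #include <pthread.h>
      #include <stdio.h>

      static long shared_total = 0;  /* conceptually "on another chip" */

      static void *racy_worker(void *arg) {
          (void)arg;
          for (int i = 0; i < 1000000; i++)
              shared_total++;        /* not atomic: the race */
          return NULL;
      }

      int main(void) {
          pthread_t a, b;
          pthread_create(&a, NULL, racy_worker, NULL);
          pthread_create(&b, NULL, racy_worker, NULL);
          pthread_join(a, NULL);
          pthread_join(b, NULL);
          printf("shared_total = %ld (expected 2000000)\n", shared_total);
          return 0;
      }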
