I'm not really seeing the innovation here...
I'm all for bigger, better, faster and more... but this seems like a really long way to describe the usual hierarchy: CPU registers, then L1, L2, L3, L4 cache, then RAM. The only difference is that instead of then traversing the IO stack to the network, SAS or PCIe, you just extend the address space to NAND interfaces if available, and go to disk or NAS only if there is a cache miss.
Sounds like the OS and hardware vendors should just bake that into the system as a tier, no need to hack or trick anything. The capability didn't exist before, so it wasn't integrated. Now it does, so let's just extend the protocols and standards and keep on trucking...
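To make the "just another tier" point concrete, here's a rough toy model of what I mean. It is only a sketch: the `Tier` class, the `read` helper and the latency numbers are all made up for illustration and don't correspond to any real OS or hardware interface.

```python
from typing import Dict, List, Optional, Tuple


class Tier:
    """One level of a storage hierarchy (hypothetical model, not a real API)."""

    def __init__(self, name: str, latency_ns: int) -> None:
        self.name = name
        self.latency_ns = latency_ns  # illustrative figure, not a measurement
        self.data: Dict[int, bytes] = {}


def read(tiers: List[Tier], addr: int) -> Tuple[Optional[bytes], str, int]:
    """Walk the tiers fastest-first and return (value, tier that hit, total cost).

    On a hit, the value is copied into every faster tier so the next
    read of the same address is served higher up.
    """
    cost = 0
    for i, tier in enumerate(tiers):
        cost += tier.latency_ns
        if addr in tier.data:
            value = tier.data[addr]
            for faster in tiers[:i]:
                faster.data[addr] = value
            return value, tier.name, cost
    return None, "miss", cost


# NAND sits between RAM and disk as one more entry in the list.
tiers = [Tier("RAM", 100), Tier("NAND", 20_000), Tier("disk", 5_000_000)]
tiers[2].data[0x10] = b"cold block"

first = read(tiers, 0x10)   # falls all the way through to disk
second = read(tiers, 0x10)  # now served from RAM
```

Note that adding NAND is one more entry in the list; the lookup logic doesn't change at all, which is pretty much my whole argument.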