Relative to What?
"For example, NVMe has more than 200 μs lower latency than 12 Gbit/s SAS."
Relative to what? Is it 50 μs vs 250 μs or 800 μs vs 1000 μs or what?
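To put numbers on why that matters (purely illustrative figures, taken from the ranges above): a 200 μs saving on a 250 μs SAS round trip is an 80% cut, roughly 5x faster; the same 200 μs off a 1,000 μs round trip is only a 20% improvement. An absolute delta means very little without the baseline.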
NVMe fabric technology is a form of block-access storage networking that gets rid of network latency delays, magically making external flash arrays as fast as internal, directly-attached, NVMe flash drives. How does it manage this trick? EMC DSSD VP for software engineering, Mike Shapiro, defines NVMe fabrics as: "the new NVM …
" Chelsio tells us that a RoCE network must have PAUSE enabled in all switches and end-stations, which effectively limits the deployment scale of RoCE to single hop. RoCE does not operate beyond a subnet and its operations are limited to a few hundred metres, not a constraint in the environment (servers linked to external storage inside a data centre) for which it is being considered."
First I would worry about security, however... if limited to a single hop, then you could essentially turn a shared-nothing cluster (Hadoop/Spark) into an MPP HPC supercomputer.
It's interesting because Spark is now poised to take advantage of this, and we could see faster throughput in the big data space.
RDMA security relies on network-level security *and* on the local interface only enabling access to memory explicitly granted by the user. Both the InfiniBand and iWARP APIs require this; the RDMA device does not get access to any arbitrary portion of physical memory. I would be surprised if someone designed a new API that did not match this security model.

Applications can of course make application-layer security mistakes. An RDMA interface is only more vulnerable to the extent that eliminating bottlenecks lets applications do their work, and make their mistakes, faster.
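For anyone who hasn't poked at the verbs API, here is a rough sketch (using libibverbs, and assuming a protection domain has already been created with ibv_alloc_pd) of the registration step being described: the NIC can only touch buffers the application has explicitly registered, and only in the ways named by the access flags.

/* Sketch only: the explicit memory-registration step described above.
 * Assumes a protection domain "pd" already exists (ibv_alloc_pd).
 * Error handling trimmed for brevity. */
#include <stdlib.h>
#include <infiniband/verbs.h>

struct ibv_mr *register_rdma_buffer(struct ibv_pd *pd, size_t len)
{
    void *buf = malloc(len);
    if (buf == NULL)
        return NULL;

    /* Only this buffer, and only these operations (local write plus
     * remote read/write), are exposed to the RDMA device. Memory that
     * is never registered, or registered without the remote flags,
     * simply is not addressable by the peer. */
    return ibv_reg_mr(pd, buf, len,
                      IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_READ |
                      IBV_ACCESS_REMOTE_WRITE);
}

The returned memory region carries the rkey a remote peer needs; without that key and those access flags, remote reads and writes are refused by the adapter.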
"An NVMeF deployment involves an all-flash array, adapters and cabling to link it to a bunch of servers, each with their own adapters and NVMeF drivers. The applications running in these servers require 20-30 microsecond access to data and there is vastly more data than can fit affordably in these servers' collective DRAM.
These characteristics could suit large scale OLTP, low-latency database access for web commerce, real-time data warehousing and analytics and any other application where multiple servers are in an IO-bound state waiting for data from stores that are too big (and costly) to put in memory but which could be put in flash if the value to the enterprise is high enough."
-=-
Most OLTP databases are very small in relative terms.
You could do this to build out a reservation system and have a very small cluster of small boxes serving a lot of clients. But it would take a lot of custom code and a team of people who actually know what they are doing.
You could then build distributed clusters to serve the customers as well as provide redundancy, with cluster-to-cluster transfers taking normal time/latency.
In terms of cost... more than spinning rust, but not that much more than a rack of high-end machines with tonnes of memory.
Please, for the love of $deity, find a friend who actually does understand storage and have them peer-review your pieces before you publish any more of this nonsense. You lure us in with interesting titles only for us to find that the dining table has a better grasp of the subject.
Sorry Chris, I'm not trying to be mean, but this stuff is ruining your reputation!
First... is your alias a nod to the sci-fi author Zelazny? ;-)
To your point... The NVMe drives are already in the home lab.
The fabric and networking cards would be next, but you're starting to look at some serious $$$ for kit.
Where this comes into play is in proving out some of the advances in big data tech.
But a 48-port or smaller ToR switch, the cards (Mellanox, Solarflare) and the cables aren't cheap.
And they have very little value unless you're working for a startup and are doing this as a small 5-10 node cluster...
Seriously, didn't we learn from iSCSI?
If you take a good protocol and force it over an IP network, then offer it for a fraction of the price of the equipment serving the lower-level protocols actually *built* for this, at least two things will happen:
1) Everyone will buy your cheap solution. Well done, have a cookie and enjoy your holiday in Barbados.
2) No one will like it because it runs like half a dog. The protocol will die out and we will have another 20 years of SCSI sense-code troubleshooting, like our fathers before us.