Re: NVMeoF
I disagree. The NetApp EF570 has ~100µs latency; that's significantly lower than a non-NVMe front end. I can't link to the prior reply on this from John Martin of NetApp, so here's a cut & paste. Long, but a good read.
----
The importance of end to end NVMe to media is a tad overinflated with NAND
The difference between the NVMe and SAS protocols is about 20µs, and media access on flash drives is still about 80µs whether the drive is NVMe attached or SAS attached. Hence adding NVMe attached NAND media might give you about 20µs of better latency, which is good, but not really worth the hype that seems to be poured all over the NVMe discussion.
With NAND, anything offering latency lower than 100µs is going to be accessing the majority of its data from DRAM or NVDIMM or something else which isn't NAND flash .. e.g. a 30µs - 50µs write is going to NVRAM .. I don't have the numbers for an EF write, but I'm pretty sure it's in the same ballpark [it's ~100µs, as I said above].
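[To make the arithmetic concrete, here's a quick sketch using the figures above .. the per-protocol overheads are illustrative assumptions, not measurements:]

```python
# Rough latency budget using the figures quoted above (illustrative, not measured).
NAND_MEDIA_ACCESS_US = 80   # NAND media access, roughly the same on SAS and NVMe drives
SAS_PROTOCOL_US = 25        # assumed per-I/O protocol overhead for SAS (hypothetical figure)
NVME_PROTOCOL_US = 5        # assumed per-I/O protocol overhead for NVMe (hypothetical figure)

sas_read_us = NAND_MEDIA_ACCESS_US + SAS_PROTOCOL_US    # ~105 µs
nvme_read_us = NAND_MEDIA_ACCESS_US + NVME_PROTOCOL_US  # ~85 µs

print(f"SAS-attached NAND read : ~{sas_read_us} µs")
print(f"NVMe-attached NAND read: ~{nvme_read_us} µs")
print(f"Difference             : ~{sas_read_us - nvme_read_us} µs (media access dominates)")

# Anything well under 100 µs (e.g. a 30-50 µs write ack) implies the data is being
# served from DRAM/NVRAM/NVDIMM rather than from the NAND itself.
```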
The other advantages NVMe has are more PCIe lanes per drive (typically 4, vs 2 for an enterprise SAS drive) and much deeper queues, which don't get used in practice and don't seem to make a difference in the vast majority of workloads. I blogged about this here https://www.linkedin.com/pulse/how-cool-nvme-throughput-john-martin/ and here https://www.linkedin.com/pulse/how-cool-nvme-part-3-waiting-queues-john-martin/
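[A rough Little's law sketch of why the deeper NVMe queues rarely get used .. the queue-depth limits are nominal spec values and the workload numbers are illustrative:]

```python
# Little's law: outstanding I/Os = throughput * latency.
# For scale: SATA NCQ is 32 deep, a SAS device queue is typically ~254, and an NVMe
# queue can hold up to 64K commands (with up to 64K queues) -- nominal spec values.

def outstanding_ios(iops: float, latency_us: float) -> float:
    """Concurrency a workload actually keeps in flight at a given rate and latency."""
    return iops * (latency_us / 1_000_000)

# A busy device doing 200k IOPS at ~100 µs only keeps ~20 I/Os in flight.
print(outstanding_ios(200_000, 100))    # -> 20.0

# Even 1M IOPS at 100 µs only needs ~100 outstanding -- far below NVMe's 64K-deep
# queues, which is why the extra depth rarely shows up in real workloads.
print(outstanding_ios(1_000_000, 100))  # -> 100.0
```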
The big benefit of NVMe on the client/host side is that it requires way less CPU to process I/O, so for environments like HPC where you are running the CPUs really hot, giving back a bunch of CPU cores while processing millions of IOPS is a worthwhile thing. It also helps on the storage array, because processing NVMe on the target requires less CPU too, and CPU is often the gating factor for performance on the controller. But the CPU cost of doing SCSI I/O is a relatively small portion of the overall CPU budget on an array (vs erasure coding, replication, snapshots, read-ahead algorithms, T10 checksums, etc.), so reducing the I/O CPU budget gives a useful, but hardly revolutionary, improvement in controller utilisation, and scale-out architectures are a better long-term way of addressing the CPU vs performance issue for storage controllers.
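[A back-of-envelope version of the CPU argument .. the cycles-per-I/O figures are purely illustrative assumptions:]

```python
# How many cores get burned just issuing/completing I/O at a given rate.
CLOCK_HZ = 2.5e9             # assumed 2.5 GHz core
SCSI_CYCLES_PER_IO = 30_000  # assumed cost of an I/O through a SCSI/SAS stack (hypothetical)
NVME_CYCLES_PER_IO = 10_000  # assumed cost of an NVMe I/O, shorter path (hypothetical)

def cores_needed(iops: float, cycles_per_io: float, clock_hz: float = CLOCK_HZ) -> float:
    """Cores consumed purely by the I/O path at a given IOPS rate."""
    return iops * cycles_per_io / clock_hz

iops = 1_000_000
print(f"SCSI: ~{cores_needed(iops, SCSI_CYCLES_PER_IO):.0f} cores at {iops:,} IOPS")
print(f"NVMe: ~{cores_needed(iops, NVME_CYCLES_PER_IO):.0f} cores at {iops:,} IOPS")
# ~12 vs ~4 cores with these assumptions: a real saving on a hot HPC host, but on an
# array controller it's only one slice of the total CPU budget (erasure coding,
# replication, snapshots, checksums, etc.), hence useful rather than revolutionary.
```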
As for the apparent fixation on needing NVMe media to get the best out of flash, even ONTAP running inline compression and dedupe, with FCP at the front end and SAS at the back end, is able to achieve significantly better latency than a certain high-profile all-flash array that purely uses NVMe media. Getting significant performance improvements will be more about software stacks and media types than about whether you can save 20 microseconds by moving from SAS to NVMe.
So, can things go faster by using NVMe end to end? Yes. Will it be a revolutionary jump? No, unless you're using something way faster than typical NAND .. but if you're going to do that, you're going to want to optimise the entire I/O stack, including the drivers, volume and filesystem layers, which is where something like PlexiStor comes in.