We pick a storage CTO's brains on Linux-heads, big vendors – and should all the admins NVMe?

NVMe storage networking (NVMe-oF, or NVMe over Fabrics) promises to radically cut the latency of access to external block storage compared with iSCSI and Fibre Channel. It works by carrying NVMe commands across the network, typically as RDMA (Remote Direct Memory Access) requests to an external storage drive, bypassing much of the traditional host and array I/O stack …

  1. muliby

    NVMe/TCP is coming

    NVMe over TCP/IP, or NVMe/TCP as we like to fondly call it, is thundering down the standardization route at NVMe.org and will be with us soon. All the benefits of storage disaggregation (moving direct-attached SSDs out of compute boxes and into storage-optimized boxes) over your standard data center network. No RDMA or FC required -- just plain ol' Ethernet and TCP/IP. IOPS, throughput, average latencies, and tail latencies so close to RDMA-based NVMe-oF that your applications simply won't know the difference.
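    As a crude way to sanity-check those latency claims on a namespace you have already attached (fio is the usual tool, but a quick probe works too), here is a minimal sketch. The device path is a hypothetical NVMe/TCP-attached namespace, it needs root on Linux, and the read count is illustrative:

        # Rough read-latency probe for a block device: p50/p99/p99.9 of 4 KiB
        # random reads. /dev/nvme1n1 is an assumed NVMe/TCP-attached namespace;
        # run as root. O_DIRECT bypasses the page cache so we time the device
        # and the fabric, not RAM. Linux-only.
        import mmap, os, random, time

        DEV = "/dev/nvme1n1"   # assumption: namespace already connected via nvme-cli
        READS = 10_000
        BLOCK = 4096

        fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
        dev_bytes = os.lseek(fd, 0, os.SEEK_END)
        buf = mmap.mmap(-1, BLOCK)          # page-aligned buffer, as O_DIRECT requires

        samples = []
        for _ in range(READS):
            offset = random.randrange(dev_bytes // BLOCK) * BLOCK
            t0 = time.perf_counter()
            os.preadv(fd, [buf], offset)
            samples.append((time.perf_counter() - t0) * 1e6)   # microseconds
        os.close(fd)

        samples.sort()
        for label, q in (("p50", 0.50), ("p99", 0.99), ("p99.9", 0.999)):
            print(f"{label:6s} {samples[int(q * (READS - 1))]:8.1f} us")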

    For more information and a demo, talk to us at Lightbits Labs, http://www.lightbitslabs.com.

    Muli Ben-Yehuda, Lightbits Labs Co-Founder and CTO

  2. curtbeckmann

    The NVMe over FC cookbook you mentioned... plus iNVMe...

    Good comments from Greg. The end of the article mentions the need for a cookbook for the transition to FC-NVMe... There is such a book, though not much cooking is required, since recent SANs and HBAs offer concurrent support for both SCSI and NVMe over Fibre Channel. Check out "NVMe over Fibre Channel for Dummies", in ebook and print form. Full disclosure: I work at Brocade/Broadcom, and I'm one of the authors. There are several editions, some co-branded with storage OEMs along with Brocade.

    The book went to print before the NVMe-over-TCP spec effort got much attention. I agree with Muli that it will be a popular standard (when finalized, probably later this year) because, like FC-NVMe, it'll work on your existing infrastructure. Just as FC-SCSI and iSCSI coexisted for 15 years or so, you can expect FC-NVMe and "iNVMe" to coexist. That is, I see NVMe-over-TCP as exactly parallel to iSCSI, hence my nickname "iNVMe" for the new protocol, and I expect it'll be far more popular than the RDMA-based Ethernet options. Note that, as with iSCSI, there's still the gap around name services that Greg mentioned.

  3. Anonymous Coward

    What is the business case? Without one, this will go the route of other high-speed blah blah - low adoption == high price == EOA

    1. JohnMartin

      The business case is pretty good.

      The main benefits of moving to NVMe on the host (as opposed to just shoving NVMe drives into an array while still using SCSI protocols over FC or iSCSI) are:

      1. Lower latency

      2. Lower CPU consumption on the host

      3. No need to manage queue depths, because the queues are effectively unlimited (rough sketch at the end of this comment)

      None of that will make much difference if you're using disk, are happy with 1-2 millisecond access times, or are only doing about 10,000 IOPS per host. But if you're doing some heavy-duty random access, like using your array to run training workloads for deep learning on a farm of NVIDIA DGX boxes, then those things make a big difference.

      Plus, more performance and lower overhead from a straightforward software upgrade (which is what moving from FC to NVMe over FC should be) is a nice win.

      I wrote some of this up in detail here https://www.linkedin.com/pulse/how-cool-nvme-part-4-cpu-software-efficiency-john-martin/
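      As a rough sketch of the queue-depth point (number 3 above): Little's law caps the IOPS a single bounded queue can sustain at depth divided by service time, so NVMe's much deeper queues remove that ceiling. The depths and the 150 microsecond service time below are illustrative assumptions, not measurements.

          # Little's law sketch: a single queue of depth D at service time T
          # sustains at most D / T requests per second. Numbers are illustrative.
          def max_iops(queue_depth: int, service_latency_us: float) -> float:
              return queue_depth / (service_latency_us / 1_000_000)

          for depth, label in [(32, "shallow per-LUN SCSI-era queue"),
                               (1024, "deep NVMe submission queue")]:
              print(f"{label:32s} depth={depth:5d} -> "
                    f"{max_iops(depth, 150):>12,.0f} IOPS ceiling at 150 us")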

      1. Anonymous Coward

        Re: The business case is pretty good.

        1. Lower latency? Go HCI then. Can't get much lower latency than local PCIe.

        2. Lower CPU consumption on the host? Throw one extra host at the problem and you've got enough CPU to run HCI, plus all its other benefits.

        3. No need to manage queue depths because the queues are effectively unlimited? No business benefit there I can think of.

        4. Heavy-duty random access, like deep-learning training on a farm of NVIDIA DGX boxes, makes a big difference? So for a few customers then, not everyone.

        5. Which is what moving from FC to NVMe over FC should be? Isn't everyone moving away from FC? InfiniBand didn't win, so why will this?

        1. Fazal Majid

          Re: The business case is pretty good.

          Indeed. The whole point of NVMe is to reduce latency by getting rid of legacy SCSI command bloat. Adding the latency of FC or Ethernet would be a huge step backwards, which is why it makes no sense to anyone other than storage networking vendors in denial about their irrelevance in an era of microsecond latency.

        2. Anonymous Coward

          Re: The business case is pretty good.

          1) You don't get lower latency from HCI compared with SAN, in my experience. In fact, for writes it is often noticeably worse (writes are mirrored/replicated to at least one more node over an Ethernet network in HCI).

          2) Throw another host at it and licence it for various bits of software (HCI/VMware/application), instead of reducing the CPU overhead by no longer pretending you're writing to individual spinning disks. Smart move.....

          3) Latency can be induced by requests being serialised by limited queues. Applications see latency. We run applications for business benefit. There's the linkage you were looking for.

          4) Any latency sensitive application that has fast storage would benefit. Transit latency didn't matter when it was 40 microseconds out of 5000 microseconds (aka 5 milliseconds) - it was a tiny rounding error.

          When the storage response time is 150 microseconds, perhaps you should start paying attention, as it's a decent percentage of overall response time (rough arithmetic at the end of this comment).

          5) Not everyone is moving away from FC. Moving to InfiniBand means purchasing new switches and dealing with stringent distance requirements. Existing FC users starting to use FC-NVMe for latency-sensitive workloads just need to confirm that the OS, HBAs, and storage can talk FC-NVMe. They have NO SWITCH HARDWARE COST if they have Gen 5 or Gen 6 switches (which is pretty much anything purchased in the past ~6 years).

          End of lesson.
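          The arithmetic behind point 4, using the round figures quoted above (treat them as illustrative, not measurements):

              # Share of total response time taken by the same 40 us of transit,
              # against a ~5 ms disk response and a ~150 us flash response.
              TRANSIT_US = 40.0

              for media, total_us in [("spinning disk (~5 ms)", 5000.0),
                                      ("fast flash (~150 us)", 150.0)]:
                  share = TRANSIT_US / total_us * 100
                  print(f"{media:22s} transit = {share:5.1f}% of response time")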

          1. Anonymous Coward

            Re: The business case is pretty good.

            1) In HCI, writes are mirrored across the network and reads are local. Now compare that to reads and writes being sent across the network to an array, which then needs to mirror writes between controllers over a network. Which sounds quicker to you?

            2) It's a cost analysis. You work out the hellish cost of your array and compare it to the cost of a card or two in each server, then to the cost of an extra server plus card.

            3) And if I put that storage inside the same box as the compute, I don't even need to worry about that latency. I'm not convinced the business angle has been covered.

            4) See 3.

            5) Not everyone, but lots. Will any come back? Back to business benefits: what real-world, large-scale customer workloads actually need this, and will they find it worth the cost and the lock-in that goes with it?

            1. Anonymous Coward

              Re: The business case is pretty good.

              1) Writes between controllers are normally mirrored over PCIe - does that sound quicker than Ethernet? Do you know what the fabric latency is in a normal FC environment (hint: it's not a lot)? There's a rough write-path sketch at the end of this comment.

              Just because something sounds quicker, doesn't mean it is quicker, which reminded me of this quote:

              "We all know that light travels faster than sound. That's why certain people appear bright until you hear them speak."

              2) Cost analysis - for sure: if you have a tiny number of hosts, shared storage is hard to justify. "Hellish" suggests you've been reading too many vSAN marketing slides, where they push the BS line that SSDs in storage arrays cost 22.48 times as much as SSDs in x86 servers. If you had to use a number for that illustration, 1x is a lot closer to the truth.....

              3) If you put all the storage in the box with the CPU - cool idea. How do I make sure I don't lose transactions, or cope with server downtime and workload migration? Sounds very enterprise. Fancy a job at TSB doing their infrastructure?

              4) See point 4 from my earlier reply:

              Any latency sensitive application that has fast storage would benefit. Transit latency didn't matter when it was 40 microseconds out of 5000 microseconds (aka 5 milliseconds) - it was a tiny rounding error.

              When the storage response time is 150 microseconds, perhaps you should start paying attention as it's a decent percentage of overall response time.

              5) Cost for existing FC users to try FC-NVMe is pretty near zero in a lot of cases with extra performance being the reward. What's the lock-in again?

              Let's look at the quote from the article again:

              "FC-NVMe is ideal for existing enterprises that already employ Fibre Channel, where NVMe-oF over RDMA Ethernet is best suited for green-field "Linux" environments."

              What's your problem with the above, and do you have a view on how come Greg hasn't been found out earlier in his 40-year career?
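              One way to frame point 1 is as a write-path latency budget. Every figure below is a placeholder assumption (hop counts and per-hop times vary wildly by product and network); the sketch only shows how the hops add up, so substitute measured numbers before drawing any conclusions.

                  # Toy write-path latency budget for the two designs argued about
                  # above. All per-hop figures are placeholder assumptions in us.
                  ARRAY_PATH = [
                      ("host to array over FC fabric, round trip", 20.0),  # assumed
                      ("controller NVRAM mirror over PCIe",          5.0),  # assumed
                  ]
                  HCI_PATH = [
                      ("local NVMe write",                          15.0),  # assumed
                      ("replicate to peer node over Ethernet, RTT", 30.0),  # assumed
                  ]

                  for name, path in (("external array", ARRAY_PATH), ("HCI", HCI_PATH)):
                      total = sum(us for _, us in path)
                      print(f"{name}: {total:.0f} us to acknowledge a write")
                      for step, us in path:
                          print(f"  {step:44s} {us:5.1f} us")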
