100+ microseconds seems way too slow
100 microseconds is a ridiculously large overhead in the world of solid-state media. The protocol overhead of NVMe is about 5 microseconds, the wire latency of electricity is about a nanosecond per 30 cm, and the switch latency in Ethernet is about 200 nanoseconds.
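To put those numbers side by side, here's a rough back-of-the-envelope sketch. The three constants are the figures quoted above; the function name `fabric_overhead_us` and the topology (6 m of cabling, one switch) are made up purely for illustration, and each element is counted once rather than modelling a full round trip.

```python
# Figures quoted in the comment above (approximate).
NVME_PROTOCOL_OVERHEAD_US = 5.0   # NVMe protocol overhead, microseconds
WIRE_NS_PER_30CM = 1.0            # propagation delay, nanoseconds per 30 cm
SWITCH_LATENCY_NS = 200.0         # per Ethernet switch hop, nanoseconds


def fabric_overhead_us(cable_m: float, switch_hops: int) -> float:
    """Sum protocol, wire, and switch latency, in microseconds.

    cable_m and switch_hops describe a hypothetical topology; they are
    assumptions for illustration, not values from any benchmark.
    """
    wire_ns = (cable_m * 100 / 30) * WIRE_NS_PER_30CM
    switch_ns = switch_hops * SWITCH_LATENCY_NS
    return NVME_PROTOCOL_OVERHEAD_US + (wire_ns + switch_ns) / 1000.0


# Hypothetical setup: 6 m of total cabling through a single switch.
total = fabric_overhead_us(cable_m=6.0, switch_hops=1)
print(f"{total:.2f} us")
```

Even if you stretch the assumed topology to a handful of switch hops and a long cable run, the wire and switch terms stay in the low single-digit microseconds, nowhere near 100.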
If you look at the OLD benchmarking from Chelsio in the SNIA presentation here https://www.snia.org/sites/default/files/SDC15_presentations/networking/WaelNoureddine_Implementing_%20NVMe_revision.pdf you'd see that there should only be about 8 microseconds of difference between NVMe inside a server on PCIe and the same I/O over a network. End-to-end latency for a 4K I/O should be in the vicinity of 20 microseconds.
If you really want to geek out on this stuff, check out http://sc16.supercomputing.org/sc-archive/tech_poster/poster_files/post149s2-file3.pdf which shows that the actual latency difference between running RDMA traffic over Layer-2 vs TCP for a 4K I/O size should be about 5 to 10 microseconds if you're just measuring protocol-level differences.
As a benchmark of NVMe over Fabrics using RoCEv1 or v2 vs using TCP, it's kind of uninspiring on all levels.