Reply to post: Digging for use cases?

Pavilion compares RocE and TCP NVMe over Fabrics performance


Digging for use cases?

Ok, let’s kill the use case already.

MongoDB... you scale this out, not up, MongoDB’s performance will always be better when run with local disk instead of centralized.

Then, let’s talk how MongoDB is deployed.

It’s done through Kubernetes... not as a VM, but as a container. If you need more storage per node, you probably need a new DB admin who actually has a clue.

Then there’s development environment. When you deploy a development environment, you run minikube and deploy. Done. No point in spinning up a whole VM. It’s just wasteful and locks the developer into a desktop.

Of course there’s also cloud instances of MongoDB if you really need something online to be shared.

And for tests... you would never use a production database cluster for tests. You wouldn’t spin up a new database cluster on a SAN or central storage. You’d run it on minikube or in the cloud on Appveyor or something similar.

If latency is really an issue for your storage, instead of a few narrow 25Gbe pipes to an oversubscribed PCIe ASIC for switching and an FPGA for block lookups, you would instead use more small scale nodes, map/reduce and spread the work-load with tiered storage.

A 25GbE network or RoCE network in general would cost a massive fortune to compensate for a poorly designed database. Instead, it’s better to use 1GbE or even 100MbE to scale the compute workload into more small nodes. 99% of the time, 100 $500 nodes connected by $30 a port networking will use less power, cost considerably less to operate and perform substantially better than 9 $25,000 nodes.

Also, with a proper map/reduce design, the vast majority of operations become RAM based which will drastically reduce latency compared to even the most impressive NVMe architectures based on obsessive scrubbing. Go the extra mile and make indexes that are actually well formed and use views and/or eventing to mutate records and NVMe is a really useless idea.

Now, a common problem I’ve encountered is in HPC... this is an area where propagating data sets for map reduce can consume hours of time given the right data set. There are times where processes don’t justify 2 extra months of optimization. In this case, NVMe is still a bad idea because RAM caching in an RDMA environment is much smarter.

I just don’t see a market for all flash NVMe except in legacy networks.

That said, I just designed a data center network for a legacy VMware installation earlier today. I threw about $120,000 of switches at the problem. Of course, if we had worked on downscaling the data center and moving to K8s, we probably could have saved the company $2 million over the next 3 years.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Biting the hand that feeds IT © 1998–2022