Marketing Gibberish
Didn't we just call that a "server" in the old days?
Datrium is providing separate scaling of CPU power and storage capacity in its disaggregated converged storage arrays. These DVX arrays feature controller functionality running in the accessing servers (compute nodes), which link to 24-slot drive JBODs (disk-based data nodes). The compute nodes run application software making …
This post has been deleted by its author
** Disclaimer: Datrium employee **
Thanks, Chris. Glad you liked my quote. On the question of what happens if all DVX Compute Nodes go down at once, it's simple. Datrium Compute Nodes are stateless; they use flash locally for speed, but persistence is off-host. If all hosts go down and then one comes back up, the restarted host just starts using data again, and no rebuild is required. This contrasts with HCI, where a rebuild is required if a server goes down, and multiple servers down can stop data access for the others.
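To make that contrast concrete, here is a minimal sketch of the stateless idea; the class names and structure are mine for illustration, not DVX code:

```python
# Illustrative sketch only: local flash is acceleration, persistence lives off-host.
class DataNode:
    """Durable drive pool: the single source of persistence."""
    def __init__(self):
        self.blocks = {}
    def write(self, block_id, data):
        self.blocks[block_id] = data
    def read(self, block_id):
        return self.blocks[block_id]

class StatelessComputeNode:
    """Host-side node: flash cache is never the sole copy of any block."""
    def __init__(self, data_node):
        self.data_node = data_node
        self.flash_cache = {}

    def write(self, block_id, data):
        self.data_node.write(block_id, data)   # persist off-host first
        self.flash_cache[block_id] = data      # then cache locally for speed

    def restart(self):
        # Losing local flash loses no data, so no rebuild is triggered.
        self.flash_cache = {}

    def read(self, block_id):
        if block_id in self.flash_cache:
            return self.flash_cache[block_id]
        data = self.data_node.read(block_id)   # cache miss: fetch from Data Node
        self.flash_cache[block_id] = data      # cache re-warms as a side effect of use
        return data
```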
@buzzki11, we call it a Compute Node if you buy it from Datrium because it includes our software for data services; none of that runs in the Data Node. Even erasure coding and rebuilds for the Data Node drives are performed by hosts running DVX software. We just use the terms this way to try to be clear. If DVX is deployed on third-party servers, we'd load that software onto them.
More discussion here on DVX 3.0 features from Andre Leibovici: https://goo.gl/kas7ZH
Is this true? Because if your performance is contingent entirely on read cache and you have nodes go down, don't you have to re-warm that cache (or do a full cache integrity check)? ZFS nodes with L2ARC back in the day dumped it on loss, and in many environments backed with slow 7.2K drives for capacity, that meant performance sucked for hours until the cache could repopulate.
Read cache may not be "state", but if you lose it and lose 98% of your read performance for 2 hours, life sucks and you are effectively "down." Boot volumes and app binaries that are normally only read cold (they sit in RAM all day) will also have to be re-warmed, even if you can manage to recover the read cache. I've seen hybrid systems cause cold reboots to take 4 hours because of this. You can't cheat physics.
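As a rough back-of-envelope (assumed numbers, not measurements), repopulating a large read cache from 7.2K drives really does land in the hours range:

```python
# Assumed figures to show the order of magnitude, not a benchmark of any product.
cache_size_gb  = 800          # assumed read-cache size to re-warm
read_size_b    = 8 * 1024     # assumed average random read size (8 KiB)
drive_iops     = 120          # typical random-read IOPS for one 7.2K SATA drive
drives         = 12           # drives servicing the misses

reads_needed = cache_size_gb * 2**30 / read_size_b
seconds      = reads_needed / (drive_iops * drives)
print(f"~{seconds / 3600:.1f} hours to repopulate the cache from cold")
# Roughly 20 hours with these assumptions; even generous numbers land in hours, not minutes.
```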
@Anonymous Coward, you raise a good question.
*** Andre from Datrium ***
In DVX, we hold all data in use by a host on that host's flash. Moreover, we guide customers to size host flash to hold all data for the VMDKs. With always-on dedupe/compression for host flash as well, this is totally feasible: with just 2TB of flash on each host and 3X-5X data reduction you can have 6-10TB of effective flash. (DVX supports up to 16TB of raw flash per host.) Experience shows this is in fact what our customers do: by and large, they configure sufficient flash on the host and get close to a 100% hit rate on host flash.
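The arithmetic behind that sizing is simple; the reduction ratios below are just the 3X-5X range quoted above:

```python
raw_flash_tb = 2.0
for reduction in (3.0, 5.0):
    print(f"{raw_flash_tb:.0f} TB raw x {reduction:.0f}x dedupe/compression "
          f"= {raw_flash_tb * reduction:.0f} TB effective host flash")
# 2 TB raw => 6-10 TB effective; at the 16 TB raw maximum, 48-80 TB effective.
```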
With any array, you have to traverse the network. However, with any modern SSD, the network latency can be an order of magnitude higher than the device access latency. Flash really does belong in the host, especially if you are talking about NVMe drives with sub-50usec latency.
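To put rough numbers on that (assumed, order-of-magnitude figures, not benchmarks):

```python
nvme_read_us  = 50    # local NVMe read, per the figure above
array_trip_us = 450   # assumed: NIC + switch hops + array software stack, one round trip

remote_us = array_trip_us + nvme_read_us   # the same media, reached over the network
print(f"local flash read: ~{nvme_read_us} us")
print(f"same read via an array: ~{remote_us} us ({remote_us / nvme_read_us:.0f}x slower)")
```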
What about a host failure? Because all data is fingerprinted and globally deduplicated, when a VM moves between servers there is a very high likelihood that most data blocks for similar VMs (Windows/Linux OS, SQL Server, etc.) are already present on the destination server, and data movement will not be necessary for those blocks.
DVX uses a technology we call F2F (flash-to-flash): the destination host will fetch data from other hosts' flash and move it over if necessary. DVX can read data from host RAM, host flash, or Data Node drives (or, during failures, from Data Node NVRAM), and it optimizes reads to retrieve data from the fastest media. You lose data locality for the period during which this move happens, but it is restored reasonably quickly.
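A minimal sketch of that read preference order; the function and tiers below are illustrative, not DVX internals:

```python
def read_block(block_id, ram, local_flash, peer_flash, data_node):
    """Return (data, source), preferring the fastest media that holds the block."""
    tiers = (("host RAM", ram),
             ("host flash", local_flash),
             ("peer flash (F2F)", peer_flash),
             ("Data Node", data_node))
    for source, tier in tiers:
        data = tier.get(block_id)
        if data is not None:
            if source == "peer flash (F2F)":
                local_flash[block_id] = data   # restore locality on this host
            return data, source
    raise KeyError(block_id)

# e.g. a block held only on another host's flash is served via F2F and copied
# into local flash, so the next read of it is local:
# read_block("blk-42", ram={}, local_flash={}, peer_flash={"blk-42": b"..."}, data_node={})
```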
In the uncommon case, i.e., after a VM restart on a new host (or after a vMotion) where the data is not available in any host's flash, DVX performance will be more like an array's (or like other HCI systems without data locality). DVX optimizes the storage format in the Data Node drive pool for precisely this situation. VMs tend to read vDisks in large consecutive clumps, usually reading data that was written together, so large clumps of the most current version of the vDisk are stored together. These are fetched as a contiguous stream when any individual block is requested, providing a significant degree of read-ahead as a vDisk is accessed. Subsequent reads of the same blocks, of course, will be served from local flash rather than from the Data Node.
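The read-ahead effect of that clump layout can be sketched like this; the clump size and structure here are invented for illustration:

```python
CLUMP_BLOCKS = 256   # assumed number of blocks laid out together in one clump

def on_cache_miss(block_id, data_node_blocks, local_flash):
    """Fetch the whole clump containing block_id, not just that one block."""
    start = (block_id // CLUMP_BLOCKS) * CLUMP_BLOCKS
    for bid in range(start, start + CLUMP_BLOCKS):
        if bid in data_node_blocks:
            local_flash[bid] = data_node_blocks[bid]   # implicit read-ahead
    return local_flash[block_id]
```

One cold miss pulls its neighbours into host flash, so a restarted VM streaming through its vDisk quickly returns to local-flash reads.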
That is, the DVX worst case is someone else’s best case.
This post has been deleted by its author
This is very impressive technology but completely misses the mark. The vast majority of applications do not benefit from microsecond latency. Those that do, which are mostly databases, have their own mechanisms for high-speed read caching. As a matter of fact, DBAs are violently opposed to having the storage subsystem manage caching and prefer their own built-in mechanisms; SQL Server's buffer pool extension comes to mind.
This reminds me of Tesla's Ludicrous Mode, which, besides being a nice party trick, has no real application in everyday life.
This post has been deleted by its author
This post has been deleted by its author
OMG people...
When will we learn from others' mistakes?
How long will we keep finding dollars for this dying technology?
We just had an example today.
Someone puts storage at the other end of the network, someone puts it in the server, someone attaches it to the server, someone even puts it outside their network. WTF guys, are we still calling this innovation?