DataCore's benchmarks for SANsymphony-V hit a record high note

DataCore's SANsymphony-V software with a parallel IO feature has produced a radical advance in price/performance on the SPC-1 benchmark. SPC-1 is an objective test that measures the data access performance of a storage subsystem, enabling meaningful performance comparisons between different vendors' systems. The …

  1. Anonymous Coward

    Wrong Benchmark?

    A single server only, with over half a terabyte of RAM, tested against 2.6 TiB of data on local storage.

    How can this be classed as a shared storage array, and how is it in any way comparable to pretty much any other SPC-1 result out there? They even had a single boot and swap drive on that single-point-of-failure server in order to keep the cost down.

    It might have been an even better result had they removed the overhead of DataCore altogether, since it seems it added almost zero value in terms of data services to the result.

  2. Bronek Kozicki

    Tiers?

    OK, so DataCore supports many tiers; however, in this case the configuration (if you look at page 18 of the report) is rather funny: 10 x 480GB SSDs as tier 1, 6 x 480GB SSDs as tier 2 and (surprise) only 8 x 300GB HDDs as tier 3.

    A more realistic configuration would presumably use a smaller tier 1, but perhaps NVMe flash rather than SSD (the 10 SSDs in question cost ~$3,300 [*] for a total of 4.8TB capacity, while a 2TB Intel P3700 would cost ~$4,300), followed by tier 2 on SSD, followed by a large tier 3 on HDD.

    I do realize that there are lies, statistics and benchmarks, and I do appreciate the transparency of the reporting, but realistically who would put only 2.4TB of HDD on the slowest tier when the SSD tiers total 7.7TB? (The arithmetic is spelled out below.)

    [*] Funnily enough, the report gives (on page 16) $459 as the list price for these SSDs, while Samsung's own SRP is $330.
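
    For reference, a throwaway Python check of the capacities quoted above; nothing here beyond the figures already given in this comment and the report page it cites:

        # Tier capacities as quoted above from page 18 of the report (illustrative only)
        tier1_ssd_gb = 10 * 480   # 4800 GB of tier-1 SSD
        tier2_ssd_gb = 6 * 480    # 2880 GB of tier-2 SSD
        tier3_hdd_gb = 8 * 300    # 2400 GB of tier-3 HDD

        ssd_total_gb = tier1_ssd_gb + tier2_ssd_gb
        print(ssd_total_gb, tier3_hdd_gb)   # 7680 2400 -> the ~7.7TB of SSD vs 2.4TB of HDD mentioned above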

  3. <shakes head>
    FAIL

    3% of the price does not convert to 97% more expensive.

    3% of the cost converts to about 33 times more expensive, or roughly 3300% (quick check below).

    Either I or the Author has failed.
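
    For what it's worth, the ratio works out like this; a back-of-the-envelope check in Python, nothing more:

        # If system A costs 3% of system B's price, how much more expensive is B?
        a_fraction = 0.03                 # A's price as a fraction of B's
        ratio = 1 / a_fraction            # B costs ~33.3x as much as A
        percent_more = (ratio - 1) * 100
        print(round(ratio, 1), round(percent_more))   # 33.3 3233 -> roughly "3300% more expensive"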

  4. LarryF

    Apples and Oranges

    Agree with AC, this is a dubious comparison. Any reasonable enterprise server with a dozen SSDs and loaded up with DRAM will show "amazing" SPC-1 numbers. As I recall, SPC was created primarily to make a realistic comparison of performance metrics between enterprise storage arrays that could scale out to hundreds of TBs.

    Larry@NetApp

  5. Anonymous Coward

    Nonsense...check your facts and stop whining.

    SPC-1 measures the performance, latency, and cost/performance of storage, whether it is "Software Defined Storage" or "Hardware Defined Storage".

    This is not the first time SPC-1 has been used to test the performance of an SDS stack; there is plenty of precedent for this, going all the way back to 2002.

    For one example, here is a DataCore result from 2003:

    http://www.storageperformance.org/results/a00015_DataCore_SPC1_executive_summary_a.pdf

    If I recall correctly, they blew everybody away then, also...and I DO recall correctly:

    http://www.businesswire.com/news/home/20030813005562/en/DataCore-Shatters-PricePerformance-Record-Storage

    Other examples include IBM (SVC aka "Spectrum Virtualize SDS") and I'm sure many others...

    Kudos to DataCore for (once again) being the first SDS vendor to subject their software stack to this most rigorous of all storage benchmarks, and for blowing away all the hardware-defined competition...

    In the midst of the SDS and hyperconverged hype cycle we are in, the real question is why no other SDS vendors are stepping up to the plate.

    1. Anonymous Coward

      Re: Nonsense...check your facts and stop whining.

      Yes, but IBM SVC was a fully redundant solution that virtualized multiple external arrays, not a single server with local SSDs that just happened to be running a software stack.

      Most SPC-1 results are gamed in some way by the vendors (just look at EMC's recent VNX result), but this isn't even in the same class. The whole idea of DataCore is adding high availability, shared access and data services to commodity servers and dumb storage, and yet in this test it did none of those, so what value is DataCore adding here?

      If you were attempting to run a single system, as is the case here (which isn't comparable to a SAN or even SDS, more like a local volume manager), then why even bother with DataCore? Just go with a server and internal DAS storage. DataCore doesn't seem to have added anything here; ultimately they're just leveraging the Windows I/O stack anyway.

  6. dikrek
    Boffin

    It helps if one understands how SPC-1 works

    Hi all, Dimitris from NetApp here.

    The "hot" data in SPC-1 is about 7% of the total capacity used. Having a lot of cache really helps when only a tiny amount of the capacity is actually being hit.

    Ideally, a large enough data set needs to be used to make this realistic. Problem is, this isn't really enforced...
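
    To put rough numbers on that, using the figures already quoted in this thread (a ~2.6 TiB data set against over half a terabyte of RAM); purely illustrative:

        # Rough working-set arithmetic from the numbers quoted in this thread (illustrative)
        TIB = 2**40
        TB = 10**12

        dataset = 2.6 * TIB           # data set size used in the test
        hot_set = 0.07 * dataset      # ~7% "hot" data -> roughly 186 GiB
        ram = 0.5 * TB                # over half a terabyte of server RAM -> roughly 465 GiB

        print(hot_set / 2**30, ram / 2**30, hot_set < ram)   # the hot set fits comfortably in RAM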

    In addition, this was a SINGLE server. There's no true controller failover. Making something able to fail over under high load is one of the hardest things in storage land. A single server with a ton of RAM performing fast is not really hard to do. No special software needed. A vanilla OS plus really fast SSD and lots of RAM is all you need.

    Failover implies some sort of nonvolatile write cache mirrored to other nodes, and THAT is what takes a lot of the potential performance away from enterprise arrays. The tradeoff is reliability.
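
    As a minimal sketch of that trade-off (hypothetical code, not anyone's actual implementation): a standalone node can acknowledge a write as soon as it lands in its own cache, while an HA pair also has to wait for the partner copy.

        # Hypothetical sketch of why mirrored write caching costs latency
        def ack_write_single_node(write, local_cache):
            local_cache.append(write)     # buffered locally only
            return "ack"                  # no partner round trip, but no failover either

        def ack_write_ha_pair(write, local_cache, send_to_partner):
            local_cache.append(write)
            send_to_partner(write)        # synchronous copy to the partner's cache adds
                                          # at least one network round trip per write
            return "ack"                  # only now is the write safe if this node dies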

    For some instruction on how to interpret SPC-1 numbers, check here:

    http://recoverymonkey.org/2015/04/22/netapp-posts-spc-1-top-ten-performance-results-for-its-high-end-systems-tier-1-meets-high-functionality-and-high-performance/

    http://recoverymonkey.org/2015/01/27/netapp-posts-top-ten-spc-1-price-performance-results-for-the-new-ef560-all-flash-array/

    Ignore the pro-NetApp message if you like, but there's actual math behind all this. You can use the math to do your own comparisons.

    But for me, the fact there's no controller failover makes this not really comparable to any other result in the list.

    Thx

    D

  7. Anonymous Coward

    Where are they getting the pricing? Is this just MSRP? If so, then this is meaningless.

  8. CheesyTheClown

    Too little too late

    I've been testing DataCore, NetApp and quite a few others for storage. The first and most important thing I learned is that the best way to decrease storage costs is to just stop using SCSI, which means there's a real need to get away from VMware (as I struggle to get a VMware data center up right now).

    SCSI IS NOT a network protocol, and it's really, really bad when used as one. There are 9-12 full protocol decodes/encodes and translations between an IO request from a virtual machine and the storage, which adds a ridiculous amount of latency. There is also an insane amount of overhead in block-based SCSI for handling small reads/writes, which are a fact of life since developers generally tend to use the language-provided streaming classes/functions for file I/O.

    So, that brings us to NFS and SMB. NFS is ok-ish... it's a protocol with far too much legacy and way too much gunk in it, as it tries to be everything to everyone. At the same time, all these years later, there's still no standard for handling operations like VAAI NAS as part of NFS, which is just plain silly since NFS is an RPC-based protocol and remote procedure calls should be second nature to it. As a result, using NFS is out of the question for day-to-day virtualization on VMware, since those guys make it impossible for anyone other than companies willing to spend $5000 and sign contracts to get hold of their API for VAAI NAS, which is just stupid. So, for VAAI NAS with Linux storage servers, I had to install the Oracle VAAI NAS driver, override their certificates, decode their REST API and reimplement it on Node.js to make it tolerable.

    Then there's SMB v3, which is nearly a ground-up rewrite aimed squarely at virtualization storage. To use it you need Hyper-V, which won't have nested-hypervisor support until the next release, and that's something I'm personally extremely dependent on.

    So, performance-wise... DataCore is SCSI, and their management system has all kinds of odd bugs and quirks and is damn near impossible to implement properly in an application-centric data center. There just really isn't much value in their products other than acting as an FC boot server for blades which don't like iSCSI (think HP, Cisco, etc.).

    NetApp has terrible performance. Because of the overwhelming, sheer stupidity of using block protocols, the NetApp has no idea what it's filing and dies a slow, painful death in hash-calculation hell. Heaven forbid you have two frequently accessed blocks with a hash collision; it'll suck up nearly the entire system. Let's talk controllers and ports: NetApp controllers and ports cost so much there's no point even talking about them. Then there's the half-baked API for PowerShell, the barely functional REST API, and the disastrous support for System Center and/or vCloud Orchestrator. Add the ONTAP web GUI, which is so bad there's no point even trying to run it... which generally you can't anyway, because their installer can't even set up the proper icon and it's usually blocked by the JVM.

    I have a nifty saying about NetApp and DataCore... if I wrote code that bad, I'd be unemployed.

    These days there are a lot of options for storage... too bad most of them aren't that good. I'm moving almost entirely to Windows Storage Server with SoFS and Replica, because I'm able to get a fairly stable 2-3 MIOPS per storage cluster, and I've been building that on $500 used servers with an additional eight 10-gig ports, consumer SSDs and NAS drives.

    1. dikrek
      Boffin

      Re: Too little too late

      @Crusty - your post was hilarious. Especially this nugget:

      " the NetApp has no idea what it's filing and dies a slow painful death in hash calculation hell. Heaven forbid you have two blocks which are accessed often which have a hash collision"

      That's right, that's _exactly_ the reason NetApp gear is chosen for Exabyte-class deployments. All the hash collisions help tremendously with large scale installations... :)

      http://recoverymonkey.org/2015/10/01/proper-testing-vs-real-world-testing/

      Thx

      D

  9. Anonymous Coward

    I think you guys are missing some key points

    After researching this a little more, the first thing that jumps out is that this was a hyperconverged run at the test (specifically, the SPC-1 workload was running on the same hardware that was serving the storage). This would be like saying to EMC or NetApp: let me run my SQL or Oracle workload directly on the VNX/VMAX or FAS storage controller.

    Second, while this was a single node, it was an internally mirrored configuration. Also, there is nothing stopping you from deploying more nodes for high availability. From what I can find about DataCore, you can run 64 nodes together in a single group.

    Third, the biggest thing that stands out to me in this result is the latency. Latency is everything; I stopped worrying about IOPS a long time ago. I worked for many years in a very large and demanding VMware-based environment, and latency directly translates into user experience. The IOPS are impressive here, but for multi-tenant environments where the conditions change radically, the latency matters. I'm not an expert on SPC-1, but from what I can tell this is a brutally challenging benchmark. If you can post good numbers on it, that is saying something.

    Lastly, regarding the comment on their stack: I would be careful about making claims without knowing what they are actually doing in the Windows stack. I recommend calling them up and asking; otherwise this is just a bunch of FUD.

    Well done, DataCore! I am very interested in seeing what comes next. It would also be interesting to see some other players in the SDS space post numbers for comparison.

    1. Anonymous Coward

      Re: I think you guys are missing some key points

      Yes, but you're ignoring the fact that a single node has no HA overheads such as cache coherency, mirroring, etc., so a result for a single server doesn't necessarily translate linearly to multiple nodes. Besides, the use of RAID 10 doesn't really provide much insurance if you only have a single node in the first place.

      1. Anonymous Coward

        Re: I think you guys are missing some key points

        Especially when that one node only has a single boot disc.

        1. Anonymous Coward

          Re: I think you guys are missing some key points

          If DataCore can get an SPC-1 run on a two-node config then I think we'll be singing from the same hymn sheet, i.e. HA, resilience, etc. That I would love to see.

  10. ntevanza

    Icons needed

    We need a Crusty the Clown or an Eeyore icon, just for poor Cheesy. Don't worry if you can't do one, he's disappointed already.

  11. Guidom

    We've been using DataCore in production for over six years now (a stretched cluster of mirrored RAID 10 disk pools). However, we use FC instead of iSCSI and can easily do 200k IOPS in a single VMware VM with negligible latency. We do use semi-enterprise SSDs (Samsung DC Pro) in the DataCore nodes, but it's still way cheaper than any hardware SAN out there, and thanks to the shared-nothing stretched cluster I would say it's an "enterprise" SAN. What's missing, however, is dedupe and, maybe even more important, compression.
