Some of us have been saying that for years
Unlike Gartner, who change their mind about it every six months or so.
Flash chip bits cost eight times more than spinning rust, and SSDs aren't going to get cheap enough to kill off disk entirely. The cost/TB of flash storage has been dropping for several years as chip capacity has increased, first by shrinking cells and then by layering them with 3D NAND. It means you can get more chips from a NAND …
Hi all, Dimitris from HPE Nimble here (recoverymonkey.org).
Disk is indeed here to stay, but this doesn't mean ignoring AFAs is wise. What Infinidat is doing is akin to the saying "to the man with a hammer, everything looks like a nail". We get it - they have a decent hybrid.
Nimble (and others) have incredibly efficient and scalable hybrid systems, but we also have AFAs for the people who can't tolerate any latency variation when data isn't in cache. A 90% cache hit rate means the other 10% of I/Os have to come from 7,200 RPM disks, with the latency that implies.
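To put rough numbers on that, here's a quick Python sketch (the latencies are illustrative assumptions, not Nimble specs):

```python
# Effective average read latency of a hybrid array vs cache hit rate.
# Latency figures below are illustrative assumptions, not vendor specs.
flash_ms = 0.2   # assumed read latency from the flash cache
disk_ms = 8.0    # assumed read latency from a 7,200 RPM disk

for hit_rate in (0.90, 0.99, 0.999):
    avg = hit_rate * flash_ms + (1 - hit_rate) * disk_ms
    print(f"{hit_rate:.1%} hits -> {avg:.2f} ms average read")
```

Even at 90% the average looks fine on paper; the trouble is that one read in ten is roughly 40x slower, and that tail is exactly the latency variation some applications can't tolerate.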
We can replicate from AFAs to hybrids to save costs, so users aren't required to have all the same type of storage. Or they can have mixed hybrid + AFA clusters, with non-disruptive data movement between cluster members.
And with the Nimble ability to inline dedupe and compress for hybrid and AFAs (dedupe is now available for hybrid with Nimble OS 5), the price delta between hybrid and AFA becomes wider, not narrower. Yes, disk is here to stay.
And future very large capacity drives (both spinning and SSD) are one of the reasons Nimble RAID is Triple+ (5x more resilient than plain triple parity, and many orders of magnitude more resilient than other RAID types).
Still, I admit Nimble made a mistake to delay coming to market with an AFA. Our rationale was that the hybrid was SO GOOD that people didn't need an AFA. Sound familiar? :)
Yes, when one can cluster together 4 Nimble hybrid engines (each with 2 controllers) and enjoy huge capacity and performance in a single pool, it's easy to think an AFA is not needed. After all, that configuration could be set up with 400TB of cache, surely that's enough? :)
Ultimately, it was a mistake and the stock market punished us heavily for it.
I've presented in the same room as Brian from Infinidat; his entire thesis was that AFAs are unnecessary.
If we are having these arguments we are not discussing things customers actually care about.
Thx
D
This type of comparison is too shallow. SSDs are so much faster that you can compress data in real time, reducing the cost/TB by around 5X. I suppose a tiered SSD/HDD hybrid might get that advantage, but then we need to factor in the reduced number of appliance boxes we need with 2.5" SSDs or the new ruler drives, versus what will always be 3.5" HDDs. I figure this gives SSDs a reduction of $0.10/GB.
Next, the massive performance difference reduces server count dramatically, perhaps by 3x on average. That yields another $0.20/GB, and operating cost savings will cut $0.10/GB or more.
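Stacking those up in a back-of-envelope Python sketch (the raw media prices are illustrative assumptions based on the 8:1 figure elsewhere in the thread; the per-GB adjustments are my estimates above):

```python
# Stacking the claimed SSD savings, per GB stored. All inputs are rough.
hdd_media = 0.025            # assumed raw HDD $/GB ("spinning rust")
ssd_media = 8 * hdd_media    # "eight times more", per the article

ssd_after_compression = ssd_media / 5   # real-time 5X compression
system_savings = 0.10 + 0.20 + 0.10     # boxes + servers + operating costs

ssd_net = ssd_after_compression - system_savings
print(f"HDD media: ${hdd_media:.3f}/GB")
print(f"SSD after compression: ${ssd_after_compression:.3f}/GB, "
      f"net of system-level savings: ${ssd_net:.3f}/GB")
# A negative net simply means the claimed system-level savings exceed the
# remaining media premium at these assumed prices.
```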
As 3D NAND ramps in 2018, with QLC coming to the bulk secondary storage market we are comparing, the numbers get a lot closer, especially with the new fabs Samsung and others are opening. HDD capacity is stuck until HAMR arrives at the end of 2019, and HAMR will increase $/GB.
Bottom line: the comparison is loaded to support HDDs.
Remember there is a lot of data out there that is already compressed. A lot of the world's new data by volume seems to be video, pictures and the like. I think a couple of years ago I read YouTube was adding a petabyte or more of storage per day. Pretty safe bet that is not on SSD.
Deduping such stuff is not easy either. Sure, you can dedupe identical blocks, but there will be tons of versions of the same song or video that are different enough that they won't dedupe - not without much more sophisticated content-aware dedupe. I think the music upload matching services try to do this, and from what I've read it's far from perfect.
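A toy illustration of why, with fixed-size blocks: insert a single byte near the start of a file and every later block boundary shifts, so two near-identical copies share essentially nothing:

```python
import hashlib, os

BLOCK = 4096
original = os.urandom(16 * BLOCK)                 # stand-in "media" file
edited = original[:10] + b"\x00" + original[10:]  # same file, one byte inserted

def block_hashes(data):
    # Hash each fixed-size block, the way simple block-level dedupe would.
    return {hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)}

shared = block_hashes(original) & block_hashes(edited)
print(f"identical blocks shared: {len(shared)} of {len(original) // BLOCK}")
# Prints 0 of 16: nothing dedupes without content-aware chunking.
```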
My own Splunk instances are the same story internally: the data is already compressed, and obviously won't dedupe for crap either.
Don't forget tape too - capacity-wise I keep seeing it shipping more and more, year over year. Myself, I have been trying to get budget for a tape library + LTFS + NFS gateway (no fancy backup software, just shell scripts, similar to what I use today with HP StoreOnce NFS) for offline backups for a couple of years now; maybe this year will be the year.
Exactly my thoughts and experience: compression and dedupe gain very little on our data. Also, how is the speed of SSDs going to help with compression? It helps with dedupe, for the dedupe tables that won't fit in RAM, but it has nothing to do with compression. Compression works fine with spinning disk (even more so because the latency added by inline compression is tiny compared to the disk's overall latency), so that factor should be removed from the price comparison that was made.
Because hard drives are slower, compressing and deduplicating the data is easier with them than with SSDs - the fewer MB/sec passing through, the less computation there is to do. And in a nearline environment the number of I/Os is very small anyway - it isn't "nearline" if drives are serving multiple megabytes per second.
At some point the per-TB cost difference will become small enough that power costs will dominate. In a nearline environment you hope to keep most drives spun down to save power. Spinning up a drive, performing the I/O and spinning down again will use far more power than sending that request to a sleeping SSD that can be woken up in a millisecond or two and immediately go back to sleep.
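Back-of-envelope, with all figures as illustrative assumptions rather than measured specs:

```python
# Rough energy per request for a mostly-idle nearline tier.
hdd_spinup_w, hdd_spinup_s = 24.0, 10.0   # assumed spin-up draw and duration
hdd_active_w, hdd_io_s     = 8.0, 1.0     # assumed draw while servicing the I/O
ssd_active_w, ssd_io_s     = 3.0, 0.01    # SSD wakes in ~1-2 ms, sleeps again

hdd_joules = hdd_spinup_w * hdd_spinup_s + hdd_active_w * hdd_io_s
ssd_joules = ssd_active_w * ssd_io_s
print(f"HDD: ~{hdd_joules:.0f} J per cold request, SSD: ~{ssd_joules:.2f} J")
```

Several orders of magnitude apart per cold request, which is the point.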
For storage, cost per GB matters; $/IOPS is not the right figure. Servers use IOPS... storage stores GBs... that is why there is no penetration of SSDs in nearline.
Yes it is golf cart vs BMW... and no one has a BMW on the golf course because that would be stupid.
And it's Honda vs BMW... and Honda outsells BMW by a large margin, because most people won't ever pay for a BMW that isn't needed or worth the money.
As long as any HDD tech maintains a lower $/GB price tag, then it'll still see use over SSD for performance-agnostic workloads.
And no, the power cost difference won't matter as long as the difference in up-front $/GB remains over 20:1 or so - in other words, SSD prices would need to fall to more than 5 times lower than what's predicted for 2021.
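A rough sanity check on that, with assumed (not measured) figures:

```python
# Up-front $/GB premium at 20:1 vs the lifetime power-cost difference.
hdd_price, ssd_price = 0.025, 0.50        # $/GB, a 20:1 ratio
hdd_w_per_tb, ssd_w_per_tb = 0.6, 0.1     # assumed idle-heavy draw per TB
years, dollars_per_kwh = 5, 0.10

def power_cost_per_gb(watts_per_tb):
    kwh_per_tb = watts_per_tb * 24 * 365 * years / 1000
    return kwh_per_tb * dollars_per_kwh / 1000   # spread across 1000 GB

premium = ssd_price - hdd_price
saving = power_cost_per_gb(hdd_w_per_tb) - power_cost_per_gb(ssd_w_per_tb)
print(f"up-front premium: ${premium:.3f}/GB, "
      f"{years}-yr power saving: ${saving:.4f}/GB")
```

At these numbers the five-year power saving is a fraction of a cent per GB against a premium of nearly fifty cents, so the ratio has to collapse long before power dominates.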
One thing the article doesn't mention is that once you get to these low hardware costs per GB, Wh/TB increasingly becomes another important component over the life of the disk.
Spinning rust hasn't made much progress there in the past few years, with most gains coming from moderately larger platters. In the end it relies on a motor spinning a mass and a servo moving an arm; there is some small potential for efficiency gains, but that's it. For NAND, by contrast, both process shrinks and new substrates have pushed power needs down at a much faster pace.
Since Wh/TB is also a proxy for density, it should give you a potential cost advantage in power consumption per TB and also in operating cost for cooling and floor space in the data center. Once we get to that low-cost zone Gartner describes, those operating costs become a bigger percentage of TCO relative to hardware acquisition cost... I wonder whether anyone has done projections on that? Obviously the degree of impact is location dependent, based on power, land and cooling costs...
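A toy projection with an assumed opex figure (purely illustrative):

```python
# As acquisition $/TB falls, power/cooling/floor space become a bigger
# slice of TCO. The opex figure is an assumption, not a measurement.
opex_per_tb_year = 3.0   # assumed power + cooling + space, $/TB/year
life_years = 5

for acq_per_tb in (100, 50, 25, 10):
    opex = opex_per_tb_year * life_years
    share = opex / (acq_per_tb + opex)
    print(f"${acq_per_tb}/TB hardware -> opex is {share:.0%} of {life_years}-yr TCO")
```

Even with a fixed opex rate, the opex share of TCO climbs from roughly 13% to 60% as hardware drops from $100/TB to $10/TB, and it's worse in expensive-power locations.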
I'm no storage expert, but I have been following and doing storage things for about 12 years now... what is Wh/TB?
Samsung has a TB-WH thin client, but I'd wager that is not it. Web searches on Bing and Google don't turn up any references to this string of characters that I can find.
Do you mean Watts/TB ?
With the cost of flash so high, I'd imagine it's going to be a rare situation where the power savings of flash outweigh the savings on the storage medium itself in bulk storage scenarios.
Now, watts per IOP... certainly SSDs rip hard disks apart there. No more needing racks and racks of short-stroked 10k or 15k RPM disks, obviously.
Wh/TB = watt-hours per TB. It's the total amount of energy consumed per terabyte of stored information over a given period of time - pretty much a proxy for the post-acquisition TCO of a particular amount of stored data. Figuring that out for Infinidat's gear would be an interesting data point as well.
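Worked example for a single nearline drive, with assumed figures:

```python
# Wh/TB for one drive over a year. Power and capacity are assumptions.
drive_capacity_tb = 10
avg_power_w = 6.0               # assumed average draw for a nearline HDD
hours_per_year = 24 * 365

wh_per_tb_year = avg_power_w * hours_per_year / drive_capacity_tb
kwh = wh_per_tb_year / 1000
print(f"{wh_per_tb_year:.0f} Wh/TB/year = {kwh:.2f} kWh/TB/year "
      f"(~${kwh * 0.10:.2f}/TB/year at $0.10/kWh)")
```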
Personally, my heart lies with Infinidat's approach even if in the real world I do not get to make the choice of gear required. Not always, anyway.
I have many interesting conversations with non-tech people about storage costs: "I can go to PC World and buy a 1TB disk for £30, why do you charge me £4200 per year?"
Good question. Well, there is the primary storage, that costs £1500; then there are the backups, that's £600; then there is the DR and the offsite copy of the backups, which doubles it.
"But why does it cost £1500 for 1TB of primary?"
Ah, software and all the controllers and stuff to move all the bits around... software to manage it...
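The arithmetic behind the £4200, spelled out:

```python
# How £30 of raw disk becomes £4200 a year of enterprise storage.
primary = 1500            # 1TB of primary (controllers, software, support)
backups = 600             # the backup copy
one_site = primary + backups
total = one_site * 2      # DR plus the offsite copy of the backups doubles it
print(f"£{total} per year")   # £4200
```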
Cost per GB of individual disks is quite different, but no one buying storage buys a disk; they buy a storage system, and the cost of flash becomes a small component of the overall cost. Software, replication functionality, support costs etc. make the small difference in the cost of SSD around 10-20% of the total, which tips the balance in favour of SSD given its responsiveness and general speed improvements.
Storage is due another disruptor. Nutanix is trying its hardest, and Pure, Tintri and Violin (amongst others) have all had a go with varying degrees of success and impending doom, but it has all been different form factors which the big boys don't feel comfortable with and which don't work for every scenario (unstructured data, anyone?). Ultimately it needs something to break the monopoly of the big boys (EMC, HP, IBM, NetApp, Hitachi) and their "value added/should be standard" extras - which may actually see spinning disk regain ground on a cost basis as other components are commoditised and included as standard - but no one has really gone after them in their own space with a proper purpose-built SSD-based system, rather than the "stick some SSDs in an old CLARiiON" mentality.
Agree with you that the difference between disk sizes and costs from a pure media standpoint can be a tricky discussion, and even those who are "educated" sometimes suffer from "can't see the forest for the trees" issues.
However, "disruption" has happened. I'm not sure I would include Nutanix in that mix - not that they aren't "disrupting" things, but their storage effort is a consequence of another disruption that they (pretty much) started, which is HCI. The point being that no one is purchasing Nutanix, SimpliVity, etc. as storage-ONLY solutions. Tintri and Violin also didn't really "disrupt" storage; they just offered it in neat or "different" packages. Nimble, Pure, even Infinidat truly have achieved disruption in the sense that they moved away from traditional RAID, relied on compute for performance, and delivered enterprise reliability on commodity hardware. I would also note that most of the "true" disrupters include all of those "value added/enterprise features" as part of the initial product - no more licenses to buy or features to unlock.
Suffice it to say, if a storage platform still requires RAID 5/6/10, hot spares, tiering, upgrade downtime, performance degradation from data reduction or snapshots, or replication outside of the storage platform... it is still legacy storage, and it isn't "disrupting" anything but your checkbook!
Today's data center requires more than just storage, compute, and features. It requires intelligence that lets administrators and managers work outside those traditional technology silos and consume, administer, support, and tune the entire technology stack from a single, minimal GUI. Tools like HPE Nimble's InfoSight give extra legs to an already disruptive storage experience, letting customers leverage not only the performance but also the stability of a next-generation data center!
Has anyone assessed the reliability improvements in solid state drives?
It used to be that repeated writes to the same sector caused rapid degradation, measurable in weeks of heavy use. Firmware algorithms were developed to relocate sectors, spreading the wear across the device.
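A minimal sketch of that wear-levelling idea (a toy, nowhere near a real flash translation layer):

```python
# Redirect each write to the least-worn physical block, so no single
# group of cells wears out first. Real FTLs also track live data,
# garbage-collect, etc.; this only shows the levelling itself.
class WearLeveller:
    def __init__(self, num_blocks):
        self.wear = [0] * num_blocks   # erase/write counts per physical block
        self.mapping = {}              # logical sector -> physical block

    def write(self, logical_sector):
        physical = min(range(len(self.wear)), key=lambda b: self.wear[b])
        self.wear[physical] += 1
        self.mapping[logical_sector] = physical
        return physical

ftl = WearLeveller(num_blocks=4)
for _ in range(8):
    ftl.write(logical_sector=0)        # hammer one logical "sector"
print(ftl.wear)                         # [2, 2, 2, 2] - wear spread evenly
```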
So when will SSD reliability overtake spinning drive reliability?
"Has anyone assessed the reliability improvements in solid state drives?"
Firstly, the vast majority of data access is write once, read mostly (or even read occasionally), with some bursts of "read many".
SSDs faced with this kind of usage pattern do just fine thanks - MUCH better than HDDs do.
This means the driving factors for SSDs moving into the bulk storage market become longevity (reliability) and power consumption (Wh/TB as one commenter put it).
If I can stick my big arrays into MAID mode with a sub-5-minute "spindown" time, know that when poked they'll respond in less than 3 seconds, be immune to rack vibration - AND live a decade or more - then they're going to cost me a lot less than spinning rust over the same period just in drive replacements, and probably 1/10 as much in power consumption.
This is the part that Gartner failed to take into account.
"When taking IOPS into account the price per GB goes up massively with spinning rust."
IOPS are not relevant in all storage cases, just as $-per-GB is not relevant in all storage cases. You don't put your most intensive IO apps on spinning rust, but you also don't put archival workloads on SSD unless you are criminally bad at budget management.
Some spinning rust is dead: there is no reason to waste money on 10k and 15k drives now, unless a vendor is just giving them away. Slow-spinning capacity drives (5.4k or 7.2k RPM at 8+ TB) are going to be around for a while longer. We need a new NAND chip manufacturer to saturate the market to change that, though. We've been waiting for 60-100TB SSDs to land for a couple of years now, and they're still vaporware. Once those SSD capacities hit the market cheap, then and only then will spinning rust finally die.