
Surely these kinds of capacities will leave spinning rust a thing of the past?
Toshiba has introduced its first 64-layer 3D NAND device that doubles the capacity of its 256Gb product to 512Gb using a TLC (3 bits/cell) design. That means a die holds 64GB. Sample shipping started this month and mass production is slated for the second half of the year. Both enterprise and consumer SSD products are expected …
It wouldn't have to be comparable, just low enough that its premium can be justified to the buyer since solid-state drives do provide tremendous benefits. It's just that the premium at this point is still too high for most. I would say once it gets to double (or less) the price/capacity of rust, especially at large capacities (pretty much rust's last stand), then the sun will set for rust.
Except I would think the price point for 30TB of ANY storage is probably going to limit it (at least for the short term) to enterprises that can actually afford it. After all, 6-8TB of rust runs about $200 externally, and those drives are likely shingled, so they're best for read-heavy jobs (a niche that tech like QLC would be able to fill). So if they can do, say, 10TB for about $500 using more general-purpose tech that lasts longer, then they'll be in a position to assault rust from the capacity end of the spectrum.
Though I should note that by that point, it would also be nice to have more-affordable access to some kind of longer-term backup tech on the consumer front, since at those capacities transfer glitches are more likely to crop up.
Sure, they are faster (especially for random access), but that only matters where performance is a criterion. If it's storing rarely used data, or for backups, you generally don't care much about performance and thus won't pay to increase it. There is a ton of storage sold for these purposes, and it grows every year.
Power consumption matters, but the few watts required to keep rust spinning don't add up to much (about $20 total over a five-year lifetime at 10 cents/kWh), and if you can let the drives spin down even that advantage is lost.
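For what it's worth, here's a quick back-of-the-envelope check of that figure in Python; the 5 W average draw is my own assumption, not a number from the post:

    # Rough five-year electricity cost of keeping one hard drive spinning.
    # Assumed figures: ~5 W average draw, $0.10/kWh.
    watts = 5
    hours = 24 * 365 * 5               # five years of continuous spin
    kwh = watts * hours / 1000         # ~219 kWh
    print(f"{kwh:.0f} kWh over five years ~= ${kwh * 0.10:.2f}")

That lands at roughly $22, so the "about $20" figure holds up.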
Reliability is a wash: both typically die from controller failure, which is equally likely on either.
So no, hard drives won't die until SSDs can match them on price per bit, period.
It of course depends on the use case.
In laptops any speed increase is a bonus (which in price terms may more than pay for any difference), every watt of power saved is a bonus, and reliability gives way to endurance - SSDs are much better at surviving being dropped off tables than spinning disks.
I just see this as a potential point where costs draw level for a lot of use cases.
I think we already passed the point where SSDs make more sense than hard drives in laptops about five years ago, personally.
The question wasn't whether replacing the single hard drive in a PC or laptop makes sense, but whether ALL hard drives will go away just because of the speed advantage. All those hard drives at Facebook, Amazon and Apple storing cold data won't be replaced by SSDs until SSDs are cheaper per bit, because no one cares about a few seconds of spin-up delay to open that photo from 2007 that no one has looked at in eight years.
Our current protection and recovery technologies are geared for (maybe) a max of 4TB per device.
One little doodad on this large a drive fails and potentially 30TB goes pfffft. Even with effective RAID or erasure codes, that's a helluva big failure domain / long time to recover.
That's not how erasure coding works. This is exactly the issue erasure coding is designed to fix, and it does fix it. RAID protects a set of whole disks; erasure codes encode pieces of data. The failure domain, therefore, is whatever size the blocks being EC'd are. EC is also designed to allow scalable resilience, so you can tolerate 30 drive failures if you really want to.
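A toy Python sketch of what "encoding pieces of data" means, with a big caveat: real systems use Reed-Solomon codes over Galois fields with k data blocks plus m parity blocks (surviving any m losses), while this simplified XOR version tolerates only one missing block. It does show, though, that the unit of protection is the block, not the whole disk:

    from functools import reduce

    def xor_blocks(blocks):
        """XOR a list of equal-length byte blocks together."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def encode(data_blocks):
        """Toy single-parity erasure code: k data blocks + 1 parity block."""
        return data_blocks + [xor_blocks(data_blocks)]

    def recover(stripe, missing):
        """Rebuild any one lost block by XOR-ing the survivors."""
        return xor_blocks([b for i, b in enumerate(stripe) if i != missing])

    stripe = encode([b"AAAA", b"BBBB", b"CCCC"])   # three data blocks + parity
    print(recover(stripe, 1))                      # b'BBBB', rebuilt from the rest

Losing a block (or the device holding it) only means re-deriving that block from the surviving ones, which is exactly why the failure domain shrinks to the block size.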
With erasure coding you have smaller units of protection, but when a whole device fails you have multiple units compromised. While it offers advantages over traditional RAID, it doesn't change the fact that when a multi-TB device fails there is a lot of data that needs to get back to full protection. Frankly, anyone using RAID 5 or RAID 10 (which tolerate only a single failure) with large devices is asking not to get their data back, and even RAID 6 looks like a high risk of loss with large data sets/devices. Large devices and large data sets these days should be able to withstand three storage device failures without loss of data. Most storage vendors don't offer this; IMO they should. Come on Dell, HP, IBM, HDS, Pure et al, time to up your game. Rule #1 of storage: don't lose the data. It should be in every storage RFI/RFP.
Erasure coding has its place for large devices because larger transfers (inherent with larger disks) raise the risk of glitches: silent corruptions like double-bit-flips that manage to still pass on-the-fly checks like parity checking. With erasure codes in place, you can correct for those glitches.
Now, for whole-device (i.e. controller) failures, yes, you need redundancy, but also recall that reconstruction takes time, and one thing SSDs have in spades over rust is transfer rate, especially over a PCI Express x4 link. This greatly reduces the reconstruction time, which in turn reduces the risk of a failure during the vulnerable reconstruction window. Perhaps because of these faster rebuilds you can get away with just two copies where you would've needed three with rust. Besides, at some point you have to say enough is enough, because if a major event nails, say, FOUR of your devices at once (AND maybe even all your backups, including the offsite copy - think a major earthquake), you're into Act of God (aka Crap Happens) territory where all you can do is pray.
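As a rough illustration of the rebuild-time point, with throughput figures that are my assumptions rather than anything from the article (call it ~200 MB/s sustained for a hard drive and ~3 GB/s for an NVMe SSD on a PCIe x4 link):

    # Best-case (sequential, uncontended) time to read back a 10 TB device.
    capacity_mb = 10 * 1_000_000
    for name, mb_per_s in [("HDD", 200), ("NVMe SSD", 3000)]:
        hours = capacity_mb / mb_per_s / 3600
        print(f"{name}: about {hours:.1f} hours")   # ~13.9 h vs ~0.9 h

Real rebuilds are slower than this best case on both sides, but the order-of-magnitude gap in the exposure window is the point.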
That's why I use BOTH strategies, though on a smaller scale (because the data I'm backing up is less critical): two copies of everything, each complete with PAR2 sets. The PAR2 files provide erasure codes to deal with glitches, while the second copy (normally kept offline to reduce wear; the two are rotated periodically) provides a failsafe in case one goes kaput.
@AC the issue with RAID isn't whole-device failure, it's the probability of recovery. Following one drive failure on RAID 5, you'd be unable to recover the set if you get a single non-recoverable error on another drive. You lose all data in this scenario and need to go to backups. Given current NRE rates, the chances are better that you'll lose everything on a 10TB volume than that it will rebuild properly with no further read errors.
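To put rough numbers on that, assuming the commonly quoted consumer-drive spec of one unrecoverable error per 10^14 bits read (your drives may be rated better or worse):

    # Probability of hitting at least one non-recoverable read error (NRE)
    # while reading 10 TB back during a RAID 5 rebuild.
    nre_per_bit = 1e-14                  # assumed: 1 NRE per 1e14 bits read
    bits_to_read = 10 * 8e12             # 10 TB expressed in bits
    p_clean = (1 - nre_per_bit) ** bits_to_read
    print(f"P(at least one NRE) ~= {1 - p_clean:.0%}")   # roughly 55%

So on that assumed error rate, a clean 10TB RAID 5 rebuild really is closer to a coin flip than a sure thing.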
RAID 6 is less risky because it usually uses different striping patterns as well as having more parity. It's still true, though, that after two drive failures a single dead block can kill the whole volume. Current error rates give good odds up to about 25TB for RAID 6, although why risk it?
RAID 10 will silently copy the error blocks, causing spreading corruption as more errors are cloned when disks fail. Yes, I am overstating this issue, but it is an issue. The OS can often recover or fail at the file level, though, so RAID 10 will rarely lose a volume.
EC will fail only the parts of the data related to the error block, meaning you only need to recover one file from backup, not a multi-TB file system. Because the error would be flagged, you could even identify the affected file and proactively repair the file system. Yes, the whole system would need to rebuild after a drive failure, but the probability is high that you'll get most of your data back.