Breaking news - hold the front page
Salesman wishes to sell you the most expensive option with little regard for the practicalities of your situation.
More at 11.
What's EMC's attitude to tape? Who better to ask than William "BJ" Jenkins, the bigwig running the storage leviathan's backup and recovery systems business. So we did. Last year EMC said loud and proud that digital tape sucks, and featured the world's biggest ball of tape at marketing events. This year there is no such …
If you think that modern LTO tapes are like the C15s that you used on your Trash80, then you've got no right to comment on this story.
Massive tape libraries with well-designed data management systems are fine for backing up or archiving large amounts of data. The only problem is that enterprise-grade tape media is still too expensive (but still cheaper TB for TB than disk). Put in a data-management system that keeps recent data in modest-sized disk pools and migrates it to the massive tape pools as it ages. Index the data so that you know which tape it is on, and you can retrieve it remarkably rapidly, and with relatively little effort.
And if the data falls into a category where it no longer needs to be accessed quickly, you can actually remove the tapes from the library to make space for more. And you could easily store a replica of your data in an offsite store in case of disaster (try pricing petabytes of data in 'the cloud').
No, tape is still useful. Just be careful that you keep the drives to read it working!
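The "index the data" point above is the crux of this kind of tiered archive. A minimal sketch of such a catalogue, assuming nothing beyond a mapping from file path to tape barcode and block offset (real products like TSM keep this in a proper database; all names here are illustrative):

```python
# Minimal sketch of a tape catalogue for an HSM-style archive.
# Field names and barcode formats are illustrative, not any real
# product's schema.

class TapeCatalogue:
    def __init__(self):
        # path -> (tape_barcode, block_offset)
        self._index = {}

    def migrate(self, path, barcode, offset):
        """Record that a file now lives on a given tape at a given block."""
        self._index[path] = (barcode, offset)

    def locate(self, path):
        """Return (barcode, offset) so the robot knows which tape to mount."""
        return self._index.get(path)

    def tapes_for(self, paths):
        """Group a restore request by tape so each cartridge is mounted once."""
        by_tape = {}
        for p in paths:
            barcode, offset = self._index[p]
            by_tape.setdefault(barcode, []).append((offset, p))
        # read each tape in offset order to avoid shoe-shining
        return {t: sorted(v) for t, v in by_tape.items()}
```

Grouping a restore by cartridge and sorting by offset is what makes retrieval "remarkably rapid": the expensive operations are mounts and seeks, not the reads themselves.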
> Cheaper TB for TB? Really?
DLT-S4 cartridge, 800GB (1.6TB compressed), £40
2TB disk drive, £80 at inflated post-Thai-flood prices; £50 expected again soon-ish. 4TB drives are on the market but so far at "premium" prices.
I make that a tie, depending on whether your data is already compressed or hugely inefficient. Now, if the argument was whether you'd be able to read it after twenty years' offline storage, tape might make the better case. Certainly, the results of the accelerated ageing tests for tape don't need to be taken with quite as much salt. But if the data really matters, you'll probably be reading or even re-copying every couple of years to be on the safe side of entropy.
A tie? If you use more current tape technology than DLT-S4 (6 years old) such as LTO5 (a couple of years old) then the figures look a lot better.
LTO5 (1.5TB worst case) = $70 or $46/TB
2TB SATA=$120 or $60/TB (though anyone using consumer grade HDD for long term data retention needs a slap)
You can also shove tape in the back of a cupboard and access it a few years later. If you do the same with a hard drive then the chances of it not spinning up properly are higher than I'd be comfortable with.
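For what it's worth, the $/TB figures quoted above do check out (prices are the poster's, long out of date, and the tape figure omits the cost of the drive itself, which matters at small scale):

```python
# Cost-per-TB check using the prices quoted above (2012-era, illustrative).
lto5_cost, lto5_tb = 70, 1.5    # $70 cartridge, 1.5 TB native
sata_cost, sata_tb = 120, 2.0   # $120 drive, 2 TB

print(f"LTO-5: ${lto5_cost / lto5_tb:.0f}/TB")   # ~$47/TB
print(f"SATA:  ${sata_cost / sata_tb:.0f}/TB")   # $60/TB
```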
Tapes are always cheaper, due to:
- no electricity (except minimum environmental control) to keep the info
- lower price per TB
- no mechanical parts in the cartridge itself, so no support contract needed for the media
- software can accommodate tape ageing (TSM does this, as do some archiving tools) at a much lower price (just that of the software) than it costs to manage disk ageing
This will remain the case until passive memory technology kills tape for good (five years' time?).
Of course, the likes of NetApp and EMC are fighting this to keep their margins, but that only works with clueless CIOs. If there were any valid point to it, the tape market would have collapsed years ago.
...when machines were purpose-built and not just for the masses. Whole floors of computing devices just for specific number crunching. Things were invented because there was a need for them. Now we rely on hardware made for the masses: storage devices with the same failure rate as the ones built into a household PC.
Is it too much to ask for a storage device built specifically for storing large amounts of data for a long time? Big companies should invest in developing new technologies instead of spending on old-fashioned and failing ones.
So, for long-term storage your choice is:
- Tape
- Optical or magneto-optical disk
- Magnetic disk (expensive)
- Magnetic disk (cheap)
Tape uses the least power, the cartridges cost the least, optical is on the way out, expensive good disk is just that and cheapo disk may fail too much. Disk spins all the time and thus needs cooling and power all of the time. Tape doesn't.
You could use something like a Centera CAS array if you need fast access to your archive; these use cheap disks arranged in such a way that you need about three or four disks to fail in your primary installation before you can lose data. You also replicate them. However, they're fairly costly to run in terms of power and cooling.
If you need large long term, medium to slow recovery speed, it's got to be tape.
How else do you suggest the data are stored? Tape is specifically designed for long term storage. There is nothing else even on the horizon.
No reason to keep your archive disks powered and spinning. If access is infrequent, the system should spin them down between accesses. If even less frequent, they could be unplugged completely (robotic hot-swap SATA storage? Or just some custom electronics to cut the power completely on drives that won't be accessed again for weeks).
Arguments have raged for decades on whether drives last longer spinning or powered down. I don't expect a resolution any time soon. Whenever someone has a statistically valid answer, the drives for which it's an answer are many years obsolete.
While I accept that a short-term backup solution can spin disks up and down (the sort of system designed to hold, say, a month's worth of backups before the long-term stuff is moved onto tape), I would not trust a disk that has been off for a year or more to come back, and we are talking about backups or archival data which may be retained for years or decades. Disks also cost more than tapes; there is no reason to put long-term storage on something with data density and cost per byte as poor as a disk's.
> I would not trust a disk to be off for a year or more and expect it to come back
Why distrust that, any more than you distrust a disk that's powered up, wearing out some mechanical or electronic component with an in-service life considerably less than its shelf life?
Anyway, for archive you are surely using RAID techniques with multiply redundant disks, so a single failure or even a double failure won't hurt.
"New" disks may have been in the pipeline from manufacturer's testing to your installation for many months. True, new drives fail rather more often in their first few weeks and occasionally are even DOA. Most, though, are just fine. So as long as you have RAID with two or more redundant disks you should be OK.
Because when a disk is spun up, you notice if it's failed. When a disk is on a shelf, there is no notice, unless you spin them up every so often to check them, in which case you rapidly start to lessen any possible advantage. Also, like I said before: data density on tape is far greater than data density on disk, and there is far less to go wrong. Cost per byte of tape is far lower also.
Disk is inherently a short term technology which isn't suited to long term archiving.
> Because when a disk is spun up, you notice if it's failed. When a disk is on a shelf, there is no notice, unless you spin them up every so often to check them, in which case you rapidly start to lessen any possible advantage.
I'm not convinced. Many office PCs are powered up and down daily. Many are configured to save electricity by sleeping with the disk spun down, several times per day. They're acceptably reliable. I look after ~100 such machines and see 2-3 HD failures per annum. With about half there's advance warning of incipient failure (SMART error counts).
For an archive, the disks would be in RAID sets (multiple parity disks, at least two). Have the automation spin the disks up once a week if that hasn't happened in normal operations. Fail? Same as any other failure in a RAID system. (Automatically) replace the failed disk, reconstruct, set the bright red failed light on the failed disk canister so the dumb tired human knows which drive he's supposed to unplug and toss. Disk life expectancy would be close to shelf life if they were powered up only for a couple of hours, one day in seven. MTBF ten years? Close to office PC disk life, if active two hours every day and spun up and down a few times per day. Certainly no worse than three years.
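The life-expectancy hunch above can at least be bounded with datasheet arithmetic. A sketch with assumed numbers (fleet size and MTBF are illustrative, and MTBF says nothing about spin-up stress or shelf degradation):

```python
# Rough annualised failure estimate for archive disks powered up
# for ~2 hours once a week. All inputs are assumptions.
disks = 100
powered_hours_per_year = 2 * 52          # two hours, once a week
mtbf_hours = 600_000                     # a typical datasheet figure

expected_failures_per_year = disks * powered_hours_per_year / mtbf_hours
print(f"{expected_failures_per_year:.3f} failures/year")  # ~0.017
```

Even if spin-up wear inflates that by an order of magnitude, drive replacement stays a small running cost next to the media itself, which is roughly the argument being made.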
The main problem I see with a big disk archive is the running cost of replacing failed disk drives. Then again, how long are tapes safe before you are advised to copy them to new media? Cost per gigabyte is pretty much the same. In a few years' time, will tape have caught up with 10TB disks? And if BPM/HAMR tech gets out of the labs, we'll probably have 100TB disks by 2022.
Anyone know why they can't / don't do write-once optical tape? Holes burned in a stable dye layer between polymer films? 2400 feet of 1-inch tape is the area of about 1500 DVD-Rs, or about 7TB at the same data density.
BTW I have no trouble reading CD-Rs burned over a decade ago, that have been kept in sleeves in the dark at office temperature. Ditto DVD-Rs going back slightly less time. People who say they don't last are probably letting sunlight get to them.
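The area arithmetic above holds up as an order-of-magnitude claim. A sketch, counting only the DVD's recordable annulus (the 24-58 mm radii are the standard data-zone figures; the exact disc count depends on how much of the disc you include):

```python
import math

# Area comparison: 2400 ft of 1-inch tape vs the recordable annulus of a DVD-R.
tape_area_m2 = (2400 * 0.3048) * 0.0254          # length x width, in metres
dvd_area_m2 = math.pi * (0.058**2 - 0.024**2)    # data zone, 24-58 mm radii

discs = tape_area_m2 / dvd_area_m2
capacity_tb = discs * 4.7 / 1000                  # 4.7 GB per disc
print(f"{discs:.0f} discs, ~{capacity_tb:.0f} TB")  # ~2121 discs, ~10 TB
```

That comes out a bit above the ~1500 discs quoted, but either way it is the same ballpark: several terabytes per reel at DVD-era areal density.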
You would need a base reflective layer (à la CD/DVD etc.). I would imagine there are problems with delamination between any reflective and dye layers when the tape is stressed by unwinding and rewinding. This would also apply if, for some reason, you could do without the reflective layer.
Also, this would require another type of drive; the current WORM tapes don't.
As for CD-Rs: yes, I have some of about the same age that I can read; I also have many I can't, which is the more important finding. As the head of backup for a large financial company, I spent lots of time making sure that end users didn't have access to writeable optical drives. A 10p writeable disc on a user's desk is no match for a £20 tape in a locked robotic library inside a secure facility, in terms of reliability, security, speed of access, longevity, pretty much any metric you could want to measure it by.
Yes, it's very important to match the reliability of the storage to the importance of being able to retrieve it down the line. My CD archive is of physics data most of which will never be read again, and of which a statistical sampling (say 80% of the files) will be almost as good as 100%.
Wonder if anyone would be interested in software to create RAID-6 sets of DVDs? (Not that I have time to write it, but someone might like to run with the idea, especially if their data archiving needs are a few tens of GB per run.)
No, you don't need the full number of physical drives in the RAID set - you need a program to simply make multiple write passes over the source data set, creating one volume of the RAID set in each pass. If you have a lot of working storage (i.e., disk) then you could actually create all the RAID datasets in one pass of the source data, creating temporary RAID volumes in working storage that can then be sequentially written to your optical drive. This is almost COMP SCI 101 stuff...not that hard to figure out.
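The multi-pass scheme is indeed workable. A minimal sketch of the parity side, treating short byte strings as stand-ins for DVD-sized volume images (the GF(2^8) arithmetic is the standard RAID-6 construction; function names are mine):

```python
# Sketch of RAID-6 style P/Q parity across "volumes" (byte strings standing
# in for DVD-sized images). P is plain XOR; Q is a Reed-Solomon syndrome
# over GF(2^8), which is what lets real RAID-6 survive two lost volumes.

GF_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, the usual RAID-6 polynomial

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return r

def gf_pow(a, e):
    """Repeated GF(2^8) multiplication."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def pq_parity(volumes):
    """Compute the P (XOR) and Q (Reed-Solomon) parity volumes."""
    p, q = bytearray(len(volumes[0])), bytearray(len(volumes[0]))
    for i, vol in enumerate(volumes):
        g = gf_pow(2, i)               # generator weight for this volume
        for j, byte in enumerate(vol):
            p[j] ^= byte
            q[j] ^= gf_mul(g, byte)
    return bytes(p), bytes(q)

def rebuild_from_p(volumes, missing, p):
    """Rebuild one lost volume from P alone (two losses need Q as well)."""
    out = bytearray(p)
    for i, vol in enumerate(volumes):
        if i != missing:
            for j, byte in enumerate(vol):
                out[j] ^= byte
    return bytes(out)
```

Burn each data volume plus the P and Q volumes to separate discs and any two unreadable discs in the set are recoverable, which beats the "a statistical sampling is almost as good as 100%" posture with actual guarantees.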
>Wonder if anyone would be interested in software to create RAID-6 sets of DVDs?
CDs and DVDs are the line that I have yet to cross... I created a RAID 1 from four floppy drives once, though; talk about speedy! I forget the actual transfer rate, but it was around 3.5 times what a normal floppy drive could do. I've done a software RAID 10 on about six flash drives, too, but the overhead was enough to offset most of the gain.
> You would need a base reflective layer (a la CD/DVD etc ) (on write-once optical tape)
Not necessarily; you could put the read sensor on the other side of the tape and shine the laser through it. The tape is linear, so the heads won't be moving and there won't be any alignment issues.
"Is it too much to ask for a storage device built specifically for storing large amount of data for a long time?"
Here's a wacky idea: continuous-roll microfilm! It's got plenty of resolution (http://www.kenrockwell.com/tech/film-resolution.htm), and processed film lasts for decades.
Just make sure the teacher doesn't thread the projector wrong -- *snap* *flap-flap-flap-flap*.
Thanks for the many replies but I feel you are missing my point (boy am I getting down voted for saying that!)
One day a solar flare WILL hit us hard enough to erase magnetically stored data. A quantum computer generated virus WILL disrupt and destroy precious data.
There have been floods, earthquakes and fires, but ancient data still exists in the form of hieroglyphics and wall paintings. Granted, we have a hard time deciphering some of it, but it is still there. Why can't we follow in their footsteps?
Here are some ideas I have for data storage and recovery:
1. Light/radio waves: send data in packets on different light/radio frequencies out into space. Packets are picked up by a station somewhere in space, sent back to a relay and re-transmitted to said station. After the signal has been sent it no longer needs to be stored locally (it is travelling through space), so local storage on both ends can be kept to a minimum.
2. Etching: photosensitive etching paper made of non-corrosive materials, of a WORM (write once, read many) type. Etching can be done with current printing technologies. Yes, I mean like punch cards...
3. Holographic storage; I can already hear you sighing 'No more!'. That's okay, but like I said, big companies have to get together and invest into this new technology that WILL eventually work. It is just a matter of time. It's the electric car vs the petrol car all over again. *sigh*
4. Whatever the dreamers can come up with.
> Light/radio/waves (in space)
I see where you're going, but radio and light in space are far too easy to destroy. A fraction of a degree off on one of the antennas and you're not pointing at the same planet let alone another tiny antenna.
Light is even worse, a stray rock drifts into the beam and your data is toast.
> Whatever the dreamers can come up with
How about storing data at the molecular level in a man-made lump of granite? It's been done with organic cells already.
Pretty naive summary of why tape is cheaper. Did I say naive? Actually I meant to say piss-poor.
Tape prices are no longer so compelling when you do the price per MB because:
- Tapes require more management, hence people time
- Tapes require more hardware maintenance
- Tapes can actually store substantially less than disk, because a whole tape is often not fully utilised and you need to keep duplicating data on tape where you don't need to on disk (unless you want ridiculous restore times).
- Tape backups are less reliable
Now, I'm not writing tape off. The fact that it's cold storage is still a big deal, although spin-down disk systems can provide similar benefits. Where you have long retentions and massive data sets, tape is still the cheapest way of getting offsite backups.
10 years ago there was a 100x-200x price difference between disk and tape medium. Now they are darn close.
So tapes are difficult to use.
Apart from the obvious downside, there is an advantage: it's also not easy to accidentally wipe backups from tapes (physically) from within an application or command line.
OK, perhaps drying the water-damaged tapes in the microwave wasn't a good idea.
Yep, EMC have always been like this with tapes. I was told to my face in 2000 by a high-up EMC guy that "Tape is dead". So I asked him how he gets back something that was lost a year ago, long outside the retention of their RAID snapshots (BCVs or whatever they called them), and he mumbled something about using tapes lol.
One thing is certain, though: tape is no longer the primary backup device. Disk has overtaken tape everywhere except in super-high-volume or legacy setups. But of course you still need some offsite backups. Disk can be used for this too, but it really depends on your change rate. Disk-to-disk over WAN works if your change rate isn't huge. Removable disks which are robust (like RDX) are only suitable for small volumes. This leaves you with good old tape.
"Disk-to-disk over WAN works if your change rate isn't huge. Removable disks which are robust (like RDX) are only suitable for small volumes. This leaves you with good old tape."
I am reminded of a Usenet sig line many years ago
"Nothing beats the bandwidth of a truckload of tapes".
That might not be as true any more, but tapes are pretty robust.
You can drop them and they'll probably work.
They don't have pin connectors which can get bent either in transit or inserting them into a disk array.
There's also the server room footprint of a tape robot versus a disk array to consider.
I still like tapes for long term archives, but please give me a backup program which has CRC error correction built in. That can make all the difference with older tapes.
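The old sig line quoted above still survives arithmetic. A sketch with assumed figures (the tape count and drive time are made up for illustration):

```python
# "Nothing beats the bandwidth of a truckload of tapes" -- a sanity check.
# All figures are illustrative assumptions.
tapes = 10_000                       # one truck
tb_per_tape = 1.5                    # LTO-5 native capacity
drive_hours = 24                     # an overnight long-haul drive

payload_bits = tapes * tb_per_tape * 1e12 * 8
gbit_per_s = payload_bits / (drive_hours * 3600) / 1e9
print(f"~{gbit_per_s:.0f} Gbit/s sustained")  # ~1389 Gbit/s
```

Over a terabit per second of sustained throughput, with the usual caveat that the latency is terrible.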
Dropping tapes is often pretty terminal. The problem, though, is that you're unlikely to know until you try to restore a specific bit of data.
Don't know where you got the concept of "bent pins" as a real-world problem, but as people generally don't transport disks for offsite media (offsite disk storage is achieved by sending the data offsite, not by moving the disks), who knows...
Tapes actually have LOADS of error correction built in, and have done since at least the 8mm helical formats. DLT and LTO have massive amounts of not just CRC checking but error correction. Consequently there is absolutely no point in putting that stuff in software, as the hardware does it already in a fraction of the time.
Indeed, if you want to verify the integrity of an older backup, just check whether you can read the tape. If you can, it means it has passed the CRC checks/correction.
"Don't know where you got the concept of "bent pins" as a real-world problem..."
Well, er, that was the idea behind the Windows Server 2008 backup scheme. I assume that someone somewhere is doing that.
"DLT and LTO have massive amounts of not just CRC checking but error correction. Consequently there is absolutely no point in putting that stuff in software, as the hardware does it already in a fraction of the time."
We went through this argument 15 years ago and again a decade ago. CRC in backup software covers the entire path from disk to tape. It depends how paranoid you are. Disabling CRC in software was a performance gain before the mid-90s, but in my experience by the late 90s it wasn't an issue. YMMV.
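The whole-path point can be sketched in a few lines: checksum the stream as it is read from disk, store the digest in the backup catalogue, and recompute it on restore. zlib's CRC-32 is used here purely as an example; real backup software may use stronger hashes:

```python
import zlib

def crc32_stream(chunks):
    """Running CRC-32 over a stream of chunks, as a backup job would
    compute it while reading from disk and again on restore."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

# At backup time: record the digest alongside the tape location.
source = [b"payroll ", b"database ", b"dump"]
recorded = crc32_stream(source)

# At restore time: recompute over what actually came back.
restored = [b"payroll ", b"database ", b"dump"]
assert crc32_stream(restored) == recorded  # mismatch => path corrupted data
```

The drive's own ECC only proves the tape gave back what the drive was asked to write; a software checksum also catches corruption introduced by the HBA, driver, or network on the way to the drive, which is exactly the paranoia being argued about here.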
Biting the hand that feeds IT © 1998–2022