Re: Eliminating tape
Disclosure: I work for Oracle Tape.
You didn't search hard enough, or you'd be aware that what you actually need is a library supporting SDLT or SDLT320 drives (assuming the user had DLTtape IV; God help you if they had DLTtape III or earlier, but it's still doable).
In Oracle's case, that would be an SL500 or an L180/L700 -- both are end of life, but recent enough that you can still find drives in good condition. Those first- and second-generation SDLT drives are usually in very good working order, and assuming you're migrating the data to disk or to new media, even tens of thousands of tapes isn't a scary prospect, since DLTtape IV held at most 40 GB natively per cartridge.
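To put that in perspective, here's a back-of-the-envelope sketch of such a migration. The 40 GB native figure is DLTtape IV's actual capacity; the 20,000-cartridge fleet and the 8.5 TB target cartridge are illustrative assumptions:

```python
import math

# Back-of-the-envelope sizing for a DLTtape IV migration.
# Only the per-cartridge capacities are real specs; the
# 20,000-cartridge fleet is a made-up illustration.

DLT_IV_NATIVE_GB = 40      # native (uncompressed) capacity per cartridge
MODERN_GB = 8500           # e.g. an 8.5 TB modern cartridge, native

fleet = 20_000             # hypothetical cartridge count
total_tb = fleet * DLT_IV_NATIVE_GB / 1000
modern_cartridges = math.ceil(fleet * DLT_IV_NATIVE_GB / MODERN_GB)

print(f"worst case: {total_tb:.0f} TB across {fleet} cartridges")
print(f"fits on about {modern_cartridges} modern cartridges")
```

Even the worst case -- every cartridge completely full -- is well under a petabyte, which is a routine migration job for a modern library.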
I don't know which DLT generation your user had, but even if you found a DLT-based library, you'd probably have trouble finding HVD SCSI HBAs to attach the drives to. The real reason you couldn't find a library sporting DLT drives is that DLT has been end of life for so long that it's obsolete by every modern standard and 99% of customers have moved on.
And even if getting a StorageTek library for your case were a problem, I'm fairly sure Quantum would jump at the chance.
About the retention periods -- you seriously think using disk drives is going to solve this? Suppose you put the data on a MAID array today using state-of-the-art 16 Gbps Fibre Channel, 40 Gbps Ethernet or third-generation SAS. Are you sure you'll still be able to access that array in 15 years?
- It's impossible to access first-generation 1 Gbps Fibre Channel arrays with 8 Gbps HBAs and switches. That obsolescence took just 12 years, and it was impossible to find new disk drives to replace failing ones within 7-10 years of those arrays' introduction.
- It's not possible to connect 10 Mbps Ethernet to some 1 Gbps switches, and to no 10 Gbps switch at all -- never mind the old coax standards. It's probably easier here to find legacy consumer gear and step down through switches supporting lower speeds, but if you proposed that as your plan for future access to the array, you'd be laughed out of the data centre.
- Like Fibre Channel, SAS only negotiates a link down to two generations back. The next SAS generation will not negotiate a link with first-generation SAS.
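That two-generations-back rule is simple enough to state as code -- this is just a sketch of the compatibility rule described above, not any real driver logic:

```python
# Sketch of the SAS/FC negotiation rule: a link only trains down
# at most two generations. Generation numbers as in SAS-1..SAS-4.

def can_negotiate(host_gen: int, device_gen: int) -> bool:
    return abs(host_gen - device_gen) <= 2

assert can_negotiate(3, 1)       # a SAS-3 HBA still links with SAS-1
assert not can_negotiate(4, 1)   # the next generation (SAS-4) will not
```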
And now let me go over your points:
1. We still support 9840 tape drives, originally introduced in 1998, in new tape libraries (SL8500 and SL3000). Heck, we still support 9490 drives, introduced 20 years ago (although the libraries they're used in are end of life). New T10000D drives can still read cartridges written by the T10000A drives introduced in 2006.
2. That's completely irrelevant. How is that an issue with tape? It's exactly the same regardless of whether you use tape, disk, flash or anything else today.
3. That hasn't been a problem in basically forever. With 9840, you can access over 50% of the blocks on a tape within 8 seconds of mount, and any block within 20 seconds. The file-mark positions are stored in the media information region, so if you know which file mark you're looking for, the drive can space straight to it. The same applies to all modern tape formats, which take at most about 90 seconds to spool the whole tape if the data you're looking for turns out to be at the end of media. Serpentine writing also means the data is spread more evenly across the tape.
With LTFS, it's even easier, since the tape is effectively presented as a filesystem to the OS -- there are two partitions, one holding the index (the file layout), the other the actual data.
True, it's still impossible to read data backwards, so if a file spans the entire length of the tape but starts at the far end, reading it will still incur up to 90 seconds of overhead.
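So the worst case for point 3 is easy to bound. A toy model, using the positioning figures quoted above and a hypothetical 15-second robot mount/load time:

```python
# Worst-case time-to-first-byte for a file on tape: mount/load plus
# positioning. The 20 s (9840 maximum) and 90 s (modern full spool)
# figures are from the text; the 15 s mount time is an assumption.

def worst_case_s(mount_s: int, position_s: int) -> int:
    return mount_s + position_s

print(worst_case_s(15, 20))   # 9840 worst case: 35 s
print(worst_case_s(15, 90))   # modern drive, data at end of media: 105 s
```

Half a minute to two minutes to the first byte of *any* file in the archive -- hardly the hours-long retrievals tape's critics imagine.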
4. It's called StorageTek Tape Analytics, and it's meant to do exactly what you describe -- mount a tape at preset intervals, read the media information region, and either do a full tape read or read random bits to verify the media isn't degrading too much.
Re-writing will occur if the margins are getting too thin.
And there's now Xcopy to seamlessly move data from one cartridge to another without host involvement. There's a lot of exciting stuff happening that you're completely unaware of.
How about efficient physical delete on a disk drive? Oh, not possible? Again, how is that a problem specific to tape?
Efficient physical delete on tape? A few seconds in a degausser does the trick: the tape comes out completely blank and unreadable, servo tracks included.
And with hardware-based encryption, there's really no reason you should worry about logical deletes.
5. Again, this isn't a problem specific to tape. If employee attrition, shifting priorities and plain laziness allow things to get out of control and processes to be ignored in your organization, you have much bigger issues at hand than tape obsolescence.
6. So disk drives don't deteriorate, huh? They do, and much faster than tape, since their magnetic domains are much smaller. Seriously, if you write to tape only once (as should happen in a proper archive), the retention period is well beyond the guaranteed 30 years.
7. Disk drives don't dedupe, either. So what? There are three approaches to deduplication on tape:
- Don't dedupe. Retain integrity in every object/file you store. That prevents any problems with being unable to read from tape in the future.
- Write raw data from your deduplicating arrays to tape. It's the most efficient method, but only if your array supports that and you're sure the manufacturer will be around when you need to restore the data. It probably makes sense for short-term backups when you don't lose track of data and would need to restore specific portions of your storage, but definitely not for long-term archives.
- If you have a lot of similar files (ones that dedupe well), offload them to tape in a single compressed image -- or in multiple images, with the deduped blocks stored in line with the rest of the files. It's a compromise: it requires keeping the capability to read that format in the future, but it can work if your archive assumes you'd only ever restore most of the files at once, or if it's done well enough that a restore never references more than one tape.
Anyway, deduplication is a foolish idea for a long-term archive. If you deduped, you'd quickly end up in a situation where restoring a single file meant reading bits and pieces from anywhere between one tape per file and one tape per deduped block. And if you somehow lost the unique copy of a block common to all files in your storage system (as happens with improperly configured deduplicating solutions), you'd lose all the data.
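That shared-block failure mode is easy to demonstrate with a toy content-addressed store -- a deliberately simplified sketch, nothing like a real dedup appliance:

```python
import hashlib

# Toy block-level dedup store: each unique block is kept exactly once,
# and files are just ordered lists of block hashes. The 4-byte block
# size and sample data are purely for illustration.

BLOCK = 4
store = {}      # block hash -> payload
catalog = {}    # file name -> ordered list of block hashes

def ingest(name: str, data: bytes) -> None:
    hashes = []
    for i in range(0, len(data), BLOCK):
        blk = data[i:i + BLOCK]
        h = hashlib.sha256(blk).hexdigest()
        store.setdefault(h, blk)      # deduped: stored only once
        hashes.append(h)
    catalog[name] = hashes

def restore(name: str) -> bytes:
    return b"".join(store[h] for h in catalog[name])

ingest("a.dat", b"HEADHEADbody-one")
ingest("b.dat", b"HEADHEADbody-two")
assert restore("b.dat") == b"HEADHEADbody-two"

# Both files share the "HEAD" block, which exists once in the store.
# Lose that one block and every file referencing it is gone:
del store[hashlib.sha256(b"HEAD").hexdigest()]
try:
    restore("a.dat")
except KeyError:
    print("a.dat unrestorable: a shared block was lost")
```

One lost block, two lost files -- scale that up to an archive-wide common block and you lose everything.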
8. Here's a news flash: disk drives are not cheap. Enterprise drives are still over 10 times more expensive than tape per byte, and for enterprise tape (like Oracle's T10000D), the cost of storage per byte is lower than that of the cheapest consumer hard drives today. An 8.5 TB cartridge costs about the same as a 1 TB disk drive.
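With cartridge and disk at roughly the same price, the per-byte arithmetic is straightforward. The $120 price below is a hypothetical round number standing in for that common price; only the capacities come from above:

```python
# Illustrative media cost per TB. The "both cost about the same" claim
# is from the text; $120 is a made-up round number, not a quote.
# (Drive and library costs are ignored here -- this is media only.)

price = 120.0                 # same price assumed for both
tape_tb, disk_tb = 8.5, 1.0   # 8.5 TB cartridge vs 1 TB consumer disk

tape_per_tb = price / tape_tb
disk_per_tb = price / disk_tb
print(f"tape: ${tape_per_tb:.2f}/TB vs disk: ${disk_per_tb:.2f}/TB")
print(f"disk media costs {disk_per_tb / tape_per_tb:.1f}x more per byte")
```

Whatever price you plug in, the ratio is fixed by the capacities: disk media at the same unit price costs 8.5 times more per byte.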
Let me rephrase what you said: In a world of very cheap tape, putting EVERYTHING on disk is just plain STUPID.
And to rephrase your last paragraph: any IT professional who doesn't examine the virtues of every available solution should be tarred and feathered. Horses are definitely nice animals, and they shouldn't be used to execute anyone.