Don't bother with the 'compressed'
A lot of stuff for long-term (i.e. tape) storage would be compressed before it goes to the tape drive
Library vendor SpectraLogic is preparing for upcoming 12TB LTO-8 format tape drives with a pre-purchase programme and a 190-plus TB cartridge on the way. LTO-8 doubles capacity and increases throughput by 42 per cent over the current LTO-7 format generation. It will support LTFS, WORM and AES 256-bit encryption and has a 1019 …
A lot of stuff for long-term (i.e. tape) storage would be compressed before it goes to the tape drive
I agree you won't get the claimed compression rate very often, but whether you would want to compress before taping depends on your requirements. If you are really after long-term storage, LTO compression is a good choice, because the standard requires that all manufacturers' drives can decompress the data. If you compress with some external tool, are you sure the one in common use will decompress your archive years after the people responsible for archiving it have retired?
I have written 1/2" tapes and read them back 30 years later. (Admittedly not LTO tapes, obviously). The operating system used to write them is no longer available.
I have managed to recover most data from 20 year old DDS tapes. If I had used one of the compression options available with versions of tar around when I wrote them, I would have to spend days figuring out how to configure it to uncompress the data, assuming I could find a suitable version of zip, winzip, gzip, or bzip or whatever (I might have opted for whatever that compression thing was on CDC mainframes, if I was in the mood - who was to know Seymour Cray was going to leave? - or IBM Squoze - as recommended by my Mum!)
The problem with LTO is ensuring that you will still have a drive capable of reading legacy tapes.
Our LTO-6 drive can't even read LTO-4 tapes, and it is often the case that archives are stored in a vault for years whilst IT refresh has gone through several upgrades/changes etc...
On top of that how does one ensure that the Software is also still capable....
Well, that is a process problem. The fix is to start using TSM, or Spectrum Protect as they like to call it these days, and stop taking tapes out of your library. Just have two libraries with a copy of the data in both of them.
When you upgrade to new tape technology you just get your TSM servers to copy the data to the new tapes. Might take a few months, and you will probably have to once a week take some old tapes out and feed new ones into the library, but you then end up with everything on the new tapes. Rinse and repeat each time you upgrade to a new generation of tape.
My top tip is to keep a couple of the old technology tape drives in the library along with some tapes and dedicate them to TSM's DB backup. That way your DB backups never get blocked, which is a very good thing.
how does one ensure that the Software is also still capable....
Use old Unix (BSD) tar to write to the tapes. Not Star or gtar, or any other newfangled tar. The original one.
Choose a blocking factor and stick to it. The "default" setting is not very portable over time (used to be 20 card images of 80 characters, now mostly 20 blocks of 512 bytes).
Set the Block Size to variable, because fixed block sizes appear not very portable at all with different manufacturers: does 512 mean blocks of 512 bytes or 512 blocks of 512 bytes or 4k bytes, or something else? (depends which day of the week it is, mostly).
Use open source software, and save a copy of the source where you can find it. (probably not on an 8" floppy).
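For what it's worth, the blocking arithmetic is easy to check for yourself. A rough sketch in Python, using the standard tarfile module as a stand-in for old BSD tar (that substitution is my assumption, it is not what the advice above was written for): the record size is the blocking factor times 512 bytes, and a finished archive is padded out to a whole number of records. The file name and contents are made up for illustration.

```python
import io
import tarfile

TAR_BLOCK = 512  # tar's basic block size in bytes


def record_bytes(blocking_factor: int) -> int:
    """Bytes per tape record for a given tar blocking factor."""
    return blocking_factor * TAR_BLOCK


def build_archive(files: dict) -> bytes:
    """Write an in-memory ustar archive (hypothetical file names/contents)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_archive(archive: bytes) -> list:
    """Read the archive back and return the member names."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return tar.getnames()


archive = build_archive({"notes.txt": b"written for the vault\n"})

if __name__ == "__main__":
    print("record size at factor 20:", record_bytes(20), "bytes")
    print("archive size:", len(archive), "bytes")
    print("members:", list_archive(archive))
```

Python's tarfile pads to 20 x 512 bytes, the same "default" mentioned above, which is exactly why it pays to write the factor down rather than trust whatever the default is in 20 years.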
Do any fellow commentards know of any study that attempts to guess at the total amount of storage capacity the world has, broken down by media? I mean, there must be billions of SD cards, tens of billions of hard disks; CDs and DVDs could be looking at trillions - but do sheer numbers outweigh tape in terms of raw bit storage capacity?
I know there are different use cases and data lifetime plays a factor (how much of the data capacity on CDs is no longer readable?), but I imagine this is a question someone with better resources to hand may have asked.
Someone must have historical data for "units manufactured", and recent data will dwarf any inaccuracies from incomplete data from anything more than a few years ago.
Perhaps a suggestion for xkcd to look into...
Probably not, so it would be nice if LTO quotes in future are raw and let the user assume responsibility for optimizing its use through compression, deduplication, etc.
Having said that, I miss the Travan days. At least back then, tape drives were within consumer reach and provided us with at least some means of offloading cold data in the days when 1GB of data was a premium. A tape system accessible and affordable to the consumer in tiers of 2, 4, maybe 8 to 12TB for packrats would at least provide an alternative to external drives which can have reliability issues. At this stage, the only one within reach (and at a stretch) is rust-based RDX. Longevity doesn't even have to be so strong. Five, maybe ten years on the outside would be enough to handle a move between generations if need be.
+1 here. Still miss my old QIC tapes. (We need an old geezer icon!)
However, in my more clear-eyed moments I think I am really only pining for a psychological illusion. You knew your backup was proceeding because you could hear the tape drive whizzing and sometimes see the tape moving along. You don't get that physical feedback from hard drives* and you especially don't get it from SSDs. But there's no reason that fancy rust on a very long rectangle of plastic is better than fancy rust on plastic circles.
*Yes, I miss the little chirps that hard drives made circa 1992 too.
We use old 1.44MB floppies, and run a 1E9:1 compression ratio.
LTOs 4 and 5 used a 2:1 compression ratio. Later generations moved to 2.5:1.
Seriously, this is one of the most stupid things I've heard. Did data suddenly get 25% more compressible? It should be illegal to sell those numbers.
No, not more compressible. But don't forget that even a fairly simple algorithm, like the one used for NTFS compression (introduced in Windows NT 3.51), could easily reduce the size by around 50% for easily compressible data, whereas WinZip or WinRAR would shrink it down to, say, an eighth of the original size.
Newer tape drives will have better algorithms, and even if they don't, they will most likely have larger memory buffers and work on larger chunks of data. Something like LZW compression, even back then, would have gotten better results with a larger workspace, going from a 128 KB buffer to, say, a 1 MB one. The last time I checked, these drives may have something like 1 GB of on-board RAM, so it's not inconceivable that they can get this sort of compression. Obviously, if they get fed something like MPEG-2 data or random numbers, then compression will be zero.
Is the compression algorithm something like this? Processed in binary, remove all zeros as this is just wasted space and dedupe the ones. By my reckoning you have invented infinite compression. Well done, if you will excuse me now I need to go and oil my perpetual motion machine.
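Both claims are easy to poke at. A minimal sketch in Python using zlib's DEFLATE, which is only a stand-in (my assumption; it is not the algorithm LTO drives implement): random bytes do not compress at all, repetitive text compresses massively, and data whose repeats sit 4 KB apart only shrinks once the compression window is big enough to see them. The sample data is synthetic.

```python
import random
import zlib


def deflate_size(data: bytes, window_bits: int) -> int:
    """Compressed size using a DEFLATE window of 2**window_bits bytes (9..15)."""
    compressor = zlib.compressobj(level=9, wbits=window_bits)
    return len(compressor.compress(data) + compressor.flush())


def ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, 9))


rng = random.Random(0)  # fixed seed so runs are repeatable

noise = rng.randbytes(64 * 1024)  # stands in for random or already-compressed data
text = b"backup job finished without errors\n" * 2000  # very repetitive
long_range = rng.randbytes(4096) * 16  # repeats are 4 KB apart

small_window = deflate_size(long_range, 9)   # 512-byte window misses the repeats
large_window = deflate_size(long_range, 15)  # 32 KB window finds them

if __name__ == "__main__":
    print(f"random data ratio:     {ratio(noise):.2f}:1")
    print(f"repetitive text ratio: {ratio(text):.2f}:1")
    print(f"512 B window: {small_window} bytes, 32 KB window: {large_window} bytes")
```

So a bigger buffer can genuinely help on the same data, but only if the data has redundancy to find in the first place.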
I'm quite happy backing my 1.2TB of data in 12 hours to my LTO-5 drive.
The lack of support for Windows NTBackup in Windows versions later than XP and Server 2003 is one of the main reasons I haven't "upgraded" to a later Windows version.
Unless you like to roll your own scripts using vshadow.exe (I did this for use with rsync. Yeah, I'm a glutton for punishment). Microsoft still threatens to pull the VSS service, but it's the only thing that lets you back up files in use, registry hives etc., ending up with a complete backup vs. just Documents (if you're lucky).
The best option for most is to subscribe for a license to Acronis. It uses VSS and can provide cloud options if you want, but the local backup is the go.
(Disclaimer: I hate subscription models. I think Acronis is what ntbackup should have been, but it could be more versatile. I think it's about double the price of AV, but IMHO it's the only option)
"The best option for most is to subscribe for a license to Acronis. It uses VSS and can provide cloud options if you want, but the local backup is the go."
I have a better one:
a) run windows [if you must] in a VM hosted on Linux or BSD
b) use 'tar' on the host to do your backups (when the VMs are shut down)
Also when you shut down the VM you can probably export it [compressed] into a form that backs up the drive image in a way that lets you quickly restore it. I do this with virtualbox all of the time, do a snapshot of the VM via 'export', saved as a single file. Re-importing that back gets you all of your VM settings too.
Alternately, shut down the windows machine, remove the hard drive, put a USB:SATA adaptor on it, plug it into a Linux box, use FuseFS's NTFS to read files [or do an image backup via 'dd']. Or you can use a bootable "disk recovery" DVD that has some version of Linux on it with tools to do "all of that".
and if you had a tape drive, an image backup might be the best option, like what 'norton ghost' used to do for ya (or may still, is it still around?)
anyway, with Linux or BSD, or especially with a VM, your backup options are a LOT better.
it would be nicer if the Cygwin tar program could be used directly
bad side is how windows does its drivers. I doubt any attempt to embed Linux shells within Win-10-nic will fix *THAT*.
/me wonders if using xz with tar + tape will give you better compression than 2.5:1 on most things... [I've never actually tried that, just created 'txz' tarballs which are compressed VERY well!]
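One way to find out for your own data, sketched in Python with the standard tarfile module (mode "w:xz" writes a txz): build the same archive plain and xz-compressed and compare sizes. The sample file here is made-up, very repetitive log text, so it flatters xz enormously. Real mixed data will land wherever it lands, and video or already-zipped files will barely shrink.

```python
import io
import tarfile


def tar_bytes(files: dict, mode: str) -> bytes:
    """Build an in-memory tar archive; mode is 'w' (plain) or 'w:xz' (txz)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def xz_ratio(files: dict) -> float:
    """Plain tar size divided by tar.xz size for the same files."""
    plain = tar_bytes(files, "w")
    packed = tar_bytes(files, "w:xz")
    return len(plain) / len(packed)


# Hypothetical sample: a log file with the same line repeated many times.
sample = {"syslog.txt": b"Jan  1 00:00:00 host daemon: nothing to report\n" * 5000}

if __name__ == "__main__":
    print(f"tar vs tar.xz on the sample: {xz_ratio(sample):.1f}:1")
```

Swap the sample dict for the contents of a directory you actually back up and you get a number that means something for your tapes.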
"Still miss my old QIC tapes. ". Interesting devices. For me it was the nice 10.5 inch reels of 1/2 inch tape. On a good day you could back up a single disk pack (IBM 2314) to one reel. There were times when a tape merge/sort could be faster than one on a disk drive set.
It has come a long way. I started out with 7-track 200 bpi (IBM 727 tape drive). LONG time ago.
all of this actually sounds pretty cool. I used to work for a tape drive company (Cipher Data) in IT, and later assisted with IT-related things for a few of the companies that bought their tech, over and over [getting databases converted and loaded, things like that]. First Cipher, then Archive, then Conner, then Overland Data, mostly with the 9" reel and 1/2" cartridge stuff. It's amazing how much data you can cram onto tape these days, compared to the old tech. QIC tape kinda died but reel and cartridge were still kicking through the 90's.
[As I recall a 9" reel was something close to 500MB depending on the format, and old minicomputers typically had a front-loader drive where you'd put the reel in like a videocassette]
For a very rough estimate, Amazon will sell you a "1.5-3 TB" tape for about 20 quid which means that the price per TB is about half that of spinning rust from the same emporium.
(Obviously that's not the tape product being discussed here, but hopefully the unfairness of my comparison will provoke someone who actually knows about the subject.)
Except you have to account for the cost of the drive as well, since most people don't have one. And even taking into consideration that most need SAS (server-grade drive tech, not available to most people) to keep it fed, you should see the price tags for recent LTO drives. Definitely NOT consumer-level stuff.
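To put a rough shape on that, here is a back-of-envelope break-even sketch in Python. The tape figure is the 20 quid for a 1.5 TB (native) cartridge quoted above, and disk is taken as twice that per TB, as claimed. The drive price is a pure placeholder I made up so the sum has something to chew on. Plug in a real quote before drawing conclusions.

```python
def break_even_tb(drive_cost: float, tape_cost_per_tb: float,
                  disk_cost_per_tb: float) -> float:
    """TB you must store before tape plus drive is cheaper than disk alone."""
    saving_per_tb = disk_cost_per_tb - tape_cost_per_tb
    if saving_per_tb <= 0:
        raise ValueError("tape must be cheaper per TB than disk to break even")
    return drive_cost / saving_per_tb


# From the thread: about 20 pounds for a 1.5 TB native cartridge.
TAPE_PER_TB = 20 / 1.5
# From the thread: tape is about half the per-TB price of disk.
DISK_PER_TB = 2 * TAPE_PER_TB
# ASSUMPTION: placeholder drive price, not a real quote.
PLACEHOLDER_DRIVE_COST = 1000.0

if __name__ == "__main__":
    tb = break_even_tb(PLACEHOLDER_DRIVE_COST, TAPE_PER_TB, DISK_PER_TB)
    print(f"break-even at roughly {tb:.0f} TB stored")
```

This ignores the SAS card, media wear and your time, all of which push the break-even point further out.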