Guess how much stored data is ever used or accessed

NetApp's Chief Technology Evangelist, Matt Watts, is worried about sustainability and data wastage, even as his employer withdraws third-party support from BlueXP classification. In 2023, just before the IT world became obsessed with AI, Watts wrote the foreword to a report [PDF] that made clear just how bad the data …

  1. Pascal Monett Silver badge

    So, up to 80% of stored data is unused

    Not a problem for the NSA. It loves trawling through all that unused data to try and find something under the paranoia du jour.

    Unused data is not useless data, and it'll be a cold day in Hell before your average manager authorizes the complete and irrevocable destruction of data he hasn't looked at in at least a decade.

    Just in case, you understand.

    1. Snake Silver badge

      Re: So, up to 80% of stored data is unused

      It's true, probably over 80% of the data we have stored here is 'unused'. But it serves as a historical backup: yes, probably unused for the rest of time, but in business you never really know when some crazy customer from 20 years ago will ask for something, and you have the data to reference it all...

      (and it is actually not that crazy, it has happened twice to me in the past 7 months. So who really knows??)

      1. Doctor Syntax Silver badge

        Re: So, up to 80% of stored data is unused

        There are also such things as legal requirements to hold data for 6 or 7 years or whatever. The Horizon & Covid enquiries in the UK are showing the significance of stored records on one hand (especially for those witnesses who were able to produce a paper trail) and the significance of having deleted WhatsApp messages on the other.

        1. tiggity Silver badge

          Re: So, up to 80% of stored data is unused

          Legal requirements definitely a big thing.

          Some customers just love to hoard data though.

          We provide tools (with lots of flexibility) so customers can purge data that is no longer required for regulatory / legal reasons - those tools generally sit around metaphorically gathering dust.

        2. ChrisElvidge Silver badge

          Windrush

          And then there's the unused data the Home Office deleted referencing the "Windrush generation". Seems it would have been better to keep it after all.

        3. Snake Silver badge

          Re: So, up to 80% of stored data is unused

          7 years is the legal requirement in the U.S. It's the main reason that the entire business world still prints documents: we can't 100% trust the digital version if the government comes knocking at our door with some type of audit but the drives crashed 4 years ago and Joe's Garage didn't make backups.

          As I mentioned, I've got all the data from when I started at this place and converted it to digital, 20+ years ago. Triple backups, of course, including off-prem. So when that legacy customer called me to ask about something they bought over a decade ago and I said "Hold on, let me look that up..." and told them everything about it (when they bought it, how much they paid, the specs, etc.), they were very surprised, because they were guessing what the specs were but I had the data to prove it exactly. Comes in handy, and it's the only time my know-it-all boss acknowledges what I do for him.

          1. Anonymous Coward

            Re: So, up to 80% of stored data is unused

            I'm in pharma manufacturing. Some of our data is legally required to be held much longer than 7 years. My employer has taken the stance of "keep everything forever". While I understand the premise, good luck finding a working drive for those 120MB data cartridges!

            1. Snake Silver badge

              Re: 120MB cartridges

              Agreed, you'd better get them migrated to a more modern medium before you truly can't find a single working drive on planet Earth!

            2. Anonymous Coward

              Re: So, up to 80% of stored data is unused

              > good luck finding a working drive for those 120MB data cartridges!

              *Everyone* should be running a plan to continually migrate their data, archived and live.

              When you started getting SATA-only hard drives in PCs, did you remember to keep some IDE-capable machines to read the drives you put into archive the year before?

              When you started to laugh at the idea of buying new machines with floppies, did you remember to copy all of the discs stored for safe-keeping? What about DVDs, now you laugh at the idea of getting a PC with an optical drive?

              If you are keeping ex-employees' drives for x months (or years) just in case, are you sure the SSD will be readable?

              The USB drives with release-to-factory for the embedded processors in your products, do they still work? You absolutely sure you can rebuild those binaries?
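One way to back up the "are you sure it will be readable?" questions above is a fixity check: record a checksum for every file before migrating, then re-read the new copy and compare. The following is a minimal sketch, assuming ordinary files on mounted filesystems; the function names (`sha256_of`, `build_manifest`, `verify_copy`) are hypothetical and not taken from any tool mentioned in this thread.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def verify_copy(manifest, new_root):
    """Return relative paths that are missing, unreadable or altered under new_root."""
    problems = []
    for rel_path, expected in manifest.items():
        try:
            if sha256_of(os.path.join(new_root, rel_path)) != expected:
                problems.append(rel_path)
        except OSError:
            # Missing or unreadable on the new medium counts as a failure.
            problems.append(rel_path)
    return sorted(problems)
```

Building the manifest while the old medium still reads cleanly, and re-running `verify_copy` after each migration, at least tells you which files did not survive the move.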

              1. Anonymous Coward

                Re: PC with floppies

                I had kept a floppy drive "just in case".

                Several years ago I had a customer who wanted to restart a book they had been writing. Hard drive was overwritten. But they had 3 Mac floppies.

                I had to install the floppy drive in an old PC that didn't have one, then get a driver so Linux could read the floppies.

                Then I had to wash the floppies' disk surfaces to get the mold off. Never store your backups in the garage in a high humidity area of the world.

                Really should get a USB floppy drive. Not sure if more recent PCs can read a floppy drive.

  2. alain williams Silver badge

    Which 80% is going to be unused?

    It is probably cheaper for organisations to keep more data than to work out which 20% they are going to need again. That holds even before considering that requirements may change in the future.

    It is like backups: I would love to only have to do a backup just before I have some sort of failure.

  3. Chris Evans

    Backups! Unused data should be 90% plus

    I follow the rule that there should be three copies of current data, so 66% should be backups, and then when you take into account historical copies I'd expect over 90% to be unused. I'm sure many organisations have a lot of 'Unwanted' old data and many won't have a system in place to cull old data.
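The arithmetic behind the 66% and 90% figures can be sketched in a few lines of Python. This is illustrative only: the function name `backup_fraction` is hypothetical, and it assumes one live copy with every other copy the same size as the live data, which the post does not state.

```python
def backup_fraction(total_copies: int) -> float:
    """Fraction of stored bytes that are backup copies rather than live data.

    Assumes (for illustration) one live copy and that every copy is the
    same size, so the live data is 1/total_copies of everything stored.
    """
    if total_copies < 1:
        raise ValueError("need at least one copy (the live data)")
    return (total_copies - 1) / total_copies


# Three copies in total: one live, two backups.
print(f"{backup_fraction(3):.1%}")   # 66.7%
# Ten equal-sized copies: live data plus backups and historical snapshots.
print(f"{backup_fraction(10):.1%}")  # 90.0%
```

Under these assumptions it takes more than ten equal-sized copies, counting historical ones, before backups alone push the figure past 90%.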

    1. ravenviz Silver badge

      Re: Backups! Unused data should be 90% plus

      I’ve not been convinced about the cloud storage model for personal media so far, but this is what has been sold: get your stuff when you want it. Turns out this could be a problem. Most of the stuff that is stored online is crap anyway, I know most of my own local (and doubly backed up) archive is, but that’s my problem to solve!

  4. jokerscrowbar

    Not just the data. You never know when version 5.0 of a program might actually be the optimum version because v.31.5 is a subscription-only, "All your data belong to us" POS.

  5. Anonymous Coward

    powered data storage

    Having data on servers is definitely IT's fault when they tell you that cold storage won't be permitted after the new system goes into service.

    At which point they found that they'd need an order of magnitude more storage than they'd assumed.

    Anon, because grrrrrr.

  6. trindflo Silver badge

    Ownership

    The owner of a file from a permissions standpoint can and does often change. A created-by or responsible party meta-field would be handy for spring cleaning.

    1. Plest Silver badge

      Re: Ownership

      Invest in some serious tracking software. We use Varonis; it's not perfect and it's pricey, but having used it for about 8 years now, we can generate reports on all our CIFS-stored data in minutes. Very handy for arguing with users when we want to archive/old store their files that haven't been touched for years.
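For anyone without a commercial tool, the "files that haven't been touched for years" report can be roughly approximated with a directory walk. This is a minimal sketch, not how any particular product works: the function name `find_stale_files` and the `/mnt/share` path are hypothetical, and it relies on modification time only.

```python
import os
import time


def find_stale_files(root, older_than_days, now=None):
    """Yield (path, age_in_days) for files not modified in older_than_days.

    Uses the modification time (st_mtime); access times are often disabled
    or unreliable on network shares, so they are not consulted here.
    """
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                yield path, (now - mtime) / 86400


if __name__ == "__main__":
    # Hypothetical mount point for a share; adjust to a real path.
    for path, age in find_stale_files("/mnt/share", older_than_days=5 * 365):
        print(f"{age:8.0f} days  {path}")
```

It says nothing about who owns or is responsible for a file, which is the harder half of the problem raised above.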

  7. This post has been deleted by its author

  8. Turkey_Bender

    I used to work for a company that specialized in storing and indexing documents for law firms relating to discovery. We called it "write once, read maybe".

  9. Anonymous Coward

    Big Players: Invest in training Digital Archivists and Librarians

    They are taught how to deal with this stuff, and you'll profit when you compare their salary with the continued (especially cloud) costs of pointless data storage, and from knowing you already have data instead of buying in another fresh copy...

  10. Anonymous Coward

    That's a lot of redundant data sitting on powered servers

    I've got an absolute minimum of 50% redundant data, going up to at least 66% for stuff that I care about.

    I call it "backup".

    Oops, tell a lie - double everything up, the servers are all on RAID - that isn't the backup, that is for continuation of service.

    Redundant is not the same as unwanted (and neither is the same as unused).

    As one would expect most people in the tech world to know.
