Digital memories are disappearing and not even AI or Google can help

I have too many artefacts detailing my digital history, which is stored on too many TRS-80 cassettes that hold sometimes-able-to-load BASIC programs, 3.5″ floppy disks, ZIP disks and HDDs. All those storage devices are crowded with the rolling backup of my life over the last 20-ish years. Before that, my digital history sits …

  1. Pascal Monett Silver badge

    A strikingly important article

    The subject of our memories is a very important one, and we're nowhere near a proper solution.

    I agree that file storage works, but file retrieval is indeed a thorny subject - made worse when the storage medium differs the older the file date.

    I also have a collection of DVDs, HDDs and now BluRay discs that I store my digital life on. When VHS went out of style, I used my living room DVD player with hard drive to convert from VHS to MP4 and voilà, all those files went to DVD storage (and are now also backed up on BluRay). What happens when we switch to crystal cubes with yottabytes of storage? I'll be copying all of that to the cube, of course.

    And then what ? My data organization isn't terrible, but if I absolutely had to find that school programming project, I'm not sure I could do that in a single day. Might take me more than a day or two to go through all those DVD labels to guess, or check, the right one. And yes, I have an index. Somewhere. Going to have to find that as well, but I think I know where it is.

    So yeah, I'm rather looking forward to the day I can just ask my robot butler for that project folder I saved in 1998 and see the folder appear on my whatever-a-PC-will-be-by-then.

    And then I'll check the contents, find that they are useless and chuck the folder to the bin.

    1. DJO Silver badge

      Re: A strikingly important article

      Good luck - I put loads of stuff onto DVD-R and put the discs in a proper folder and kept the folder somewhere without wild fluctuations of temperature or humidity.

      Maybe 25% of them are readable.

      Oddly the few rewritable DVDs I had were all perfectly readable which is the opposite of what I would have expected.

      1. Mike 137 Silver badge

        Re: A strikingly important article

        "Oddly the few rewritable DVDs I had were all perfectly readable which is the opposite of what I would have expected"

        Quite a few tests over the years have found this to be true. The two problems with write once CD/DVD are [1] that the dye which contains the information fades or [2] that the metallised reflective layer corrodes, reducing the contrast below readability. The rewritable technology uses a heat induced phase change in a tellurium alloy between amorphous and crystalline states. This changes its reflectivity so there is no need for dye or separate reflector layer. Although in theory the phase change can revert spontaneously over time this has been found to be so slow (typically in excess of 100 years) at normal temperatures as to be ignorable. Usually, only getting a recorded R/W disk much too hot will corrupt it.

        1. munnoch Bronze badge

          Re: A strikingly important article

          I still have all the floppies and micro-drive cartridges with the source for the ZX Spectrum games I wrote. Not to mention quite a few VHS tapes from a very early camcorder of the same era.

          Always had good intentions of reading and converting them but I fear it is now too late.

          1. An_Old_Dog Silver badge

            Convert Now! (?)

            1. We can spend our entire lives upconverting to modern media. If you look at https://en.wikipedia.org/wiki/List_of_cassette_tape_data_storage_formats you'll see there are at least nine personal computer cassette tape formats, plus some sub-types where manufacturers changed their own tape formats. Then there are 5.25-inch single-density floppies, double-density floppies, double-sided, double-density floppies, 8-inch floppies, 5.25-inch floppies, high-density floppies, 3.5-inch single-density floppies, 3.5-inch double-density floppies, 3.5-inch high-density floppies, 3.5-inch Microsoft Distribution Media Format ("DMF") floppies, 3.0-inch floppy diskettes, the Quarter-Inch Committee tape series - DC2020-style types (ran off floppy diskette controllers, recorded data in "serpentine" tracks), DC 300, DC 600, DC 6525, and DDS, etc., not to mention various homebrew and proprietary schemes to back up computer data onto video tapes, streaming tapes, and ...

            2. I tend to self-index, both via directory names, and file names themselves (see item 3, below).

            3. *nix find is slow, but flexible. Metadata-indexing file systems may misleadingly- or over-specifically index. Some will record/search for the date the file was added to the file system, vs the date it was created (on another system or device, and then later copied to my PC). Further, even if the OS "understands" photo EXIF data, I may not want that used as the photo index. It may be far-more-useful for a group of photos taken over the course of a month to be indexed by, "Thanksgiving 2023 Train Trip".
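
For anyone who self-indexes through directory and file names, the kind of lookup described above can be sketched in a few lines of Python. This is only a rough illustration, not anyone's actual setup: the function name `find_by_name` is made up for the example, and it assumes that the descriptive words you want to search for really are in the folder or file names.

```python
import fnmatch
import os
from datetime import datetime, timezone


def find_by_name(root, pattern):
    """Walk root and return (path, modified-time) pairs for matching names.

    Matching is case-insensitive and looks at both directory and file
    names, roughly like `find root -iname pattern`.
    """
    pattern = pattern.lower()
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name.lower(), pattern):
                path = os.path.join(dirpath, name)
                # st_mtime is the modification time recorded by this file
                # system; it is not necessarily when the item was created
                # on the original device.
                stamp = os.stat(path).st_mtime
                hits.append((path, datetime.fromtimestamp(stamp, tz=timezone.utc)))
    return sorted(hits)
```

Calling `find_by_name("/archive", "*thanksgiving*")` (the path is a placeholder) would list every folder or file whose name contains that word, whatever date the file system has attached to it. Like find, it walks the whole tree on every call, so it is slow on large archives but never out of step with an index.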

          2. Anonymous Coward

            Re: A strikingly important article

            I have one of these (http://www.deviceside.com/fc5025.html) in my desktop, connected to a 5.25" floppy drive. It lets me copy files off, or do disk images, so it works for booters as well as normal software and data. Shockingly, the only disks that fail are the ones I remember being in bad condition in my childhood - generally, the disks are as readable today as they were 30 years ago.

            Now, whether it's possible to convert the CONTENTS to something useful is its own problem. Anyone have a Symphony-to-MSOffice converter?

        2. hitmouse

          Re: A strikingly important article

          At one point I had over 3000 music CDs. Now I have maybe 200 collectable/memorabilia-grade CDs and everything is digitised, booklets and all.

          One of the reasons I accelerated that move was finding so many of the CDs (and a good few DVDs) had degraded beyond recall. As often as not, the nice collectable box-sets rather than the cheap compilations.

          Between local backups and the cloud, I don't think I've lost anything in 20 years and the metadata makes everything very findable. In fact with cover scans and PDFs of booklets being searchable I often turn up interesting details that may have been lost on a shelf.

          It would be nice if the algorithms that Google and Microsoft employ for photo management allowed such recall, but they have a lot of problems with dates even when the EXIF metadata is correct, so I invariably rely on subfolders with dates to locate stuff.
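
Relying on dated subfolders rather than on EXIF handling can be sketched roughly as below. The naming convention is an assumption made for the example (folder names that start with YYYY, YYYY-MM or YYYY-MM-DD), and `folder_date` and `folders_between` are hypothetical helper names, not part of any photo tool.

```python
import os
import re
from datetime import date

# Assumed convention: folder names start with YYYY, YYYY-MM or YYYY-MM-DD,
# optionally followed by a description, e.g. "2019-07-04 picnic".
DATE_PREFIX = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?(?!\d)")


def folder_date(name):
    """Return the date encoded at the start of a folder name, or None."""
    match = DATE_PREFIX.match(name)
    if not match:
        return None
    # A missing month or day defaults to 1.
    year, month, day = (int(g) if g else 1 for g in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # impossible dates such as month 13
        return None


def folders_between(root, start, end):
    """List (date, path) for dated folders under root, start <= date <= end."""
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            d = folder_date(name)
            if d is not None and start <= d <= end:
                found.append((d, os.path.join(dirpath, name)))
    return sorted(found)
```

Because the date comes from the folder name, it survives copies between machines and services even when file timestamps or EXIF handling do not.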

          1. Dr. Ellen

            Re: A strikingly important article

            I started using a TRS-80 III and Scripsit in the mid-eighties. Unreliable cassette tapes indeed, but there was a lot of writing done. We bought a TRS-80 IV because it had a floppy drive and could read model III cassettes. Could read them if they were recorded at the proper speed, that is. So we had to run everything into the 3, re-record it, and save it on floppies. We also had a Commodore 64. The two-headed Commodore 128 could read the 64 tapes and turn them into CP/M disks. I don't remember all the details, but we eventually got everything onto an MS-DOS machine in DOC format. DOC went through quite a few changes itself, but we kept up. Today, I have a couple terabytes of data, painfully transferred from one machine to another, one format to another, and one storage device to another. And whenever possible, I save files in RTF, JPG and other reasonable formats that seem like they will last, and back up on two terabytes of spinning rust. The cloud? Too many strange things happen to data there.

            But I still maintain an external floppy drive and an external DVD drive just in case something shows up.

      2. An_Old_Dog Silver badge

        DVD reliability

        I've found some of the purchased-in-a-store DVDs with movies on them have gone bad. Fortunately, I've been able to download the material from the Internet.

        1. Anonymous Coward

          Re: DVD reliability

          The problem is the data is stamped onto metal, which is laminated between transparent plastic (front/back). If the metal is not sealed correctly, the metal layer can oxidize and make the disc unreadable.

          I discovered around 2008 that about 1% (12 of 1200) had this problem and then spent more than 5 years (elapsed, not full time) of my spare time copying to mp4 for archive and backup. Backup is now an annual chore (40TB).

  2. Andy The Hat Silver badge

    Information evaporation

    If I want to look at the Domesday Book, I basically can. It's historic and tangible and readable by anyone with the correct public hardware (eyes), public software (language) and access to the datastore (library shelf or whatever). A reference to Guam the Testiculate of Goth and the enormous size of his vegetable plot may be inscribed there forever.

    Nowadays, a little titbit of information about the size of Horatio Pugh's allotment in 2003 and registered by Boring County Council, is found under a glacier of "more popular, irrelevant but sponsored hits" on page 83 of a Google search - if you are lucky. More likely, that reference is encoded in some kind of inaccessible archived database format which is never going to be interrogated by a search engine so it will never see the light of day again, even if you could find something to read a .p34R format file, and under GDPR shouldn't data that old be deleted anyway?

    In essence, as more and more contemporaneous information piles in, the old information evaporates away, unnoticed, into the information aether. The similarity to an information black hole is significant.

    1. Peter Gathercole Silver badge

      Re: Information evaporation

      It's obvious what we need.

      The results turned up by Google need a search engine, maybe Google2.

      On second thoughts, can't happen, as Google would still want to put their sponsored or paid-for crap at the top of the list.

      Rather than making them recursive, maybe use Bing to search through Google's hits?

      No, that won't work either, it will just give two ad-slingers the chance to pollute what you want to find.

      1. Doctor Syntax Silver badge

        Re: Information evaporation

        DuckDuckGo

        1. jmch Silver badge

          Re: Information evaporation

          Or rather, in this case, DuckDuckDuckDuckGoGo

    2. Doctor Syntax Silver badge

      Re: Information evaporation

      It's not necessarily as easy as you might think. I have the Yorkshire part of Domesday as image, as an old translation from the Journal of the Yorkshire Archaeological Society and a massive Penguin paperback translation of the whole thing. The translations don't necessarily agree with each other while reading the original as image isn't easy and even then you're stuck with medieval administrative Latin. Then there's a paper of Stenton's from the 1920s where he argued that what's written down might have been a misreading of the original notes...

    3. doublelayer Silver badge

      Re: Information evaporation

      Your comparison doesn't really work. The paper records you're extolling only exist because they were significant enough to be kept or because someone left a copy and someone was lucky enough to find it. If you want any records that weren't in that book, they're not hidden in an archive for you to eventually find them with enough effort, nor are they in an old file format you could eventually access with some effort, they are gone. Gone in the sense that the paper they were written on was used as fuel, or it was never written down in the first place. It's not that bad if insignificant records are hard to find because Google doesn't think anyone wants to find them. In all previous times, they were also insignificant and nobody wanted to find them, and that usually meant that if someone did want to find them, they had a much poorer chance than they do today.

    4. C R Mudgeon

      Re: Information evaporation

      "Guam the Testiculate of Goth"

      I think he's the first cousin once removed of my back-in-the-day friend's back-in-the-day D&D character.

  3. abend0c4 Silver badge

    Preserve the meaning of our personal past

    Do we really need to?

    The archive of my earliest digital forays was largely on green bar paper and punched cards. It gradually got reused for sketching flow charts and code fragments and leaving notes for the milkman. Did it have any value other than as an occasional source of nostalgia? Almost certainly not. I have electronic records of a lot more since, but most of it has no intrinsic interest even to me.

    There was a period when slide projectors were fashionable when almost every sitcom had an episode in which the protagonists were forced to sit through an interminable exhibition of their neighbours' holiday snaps - famous landmarks obscured by one or other determinedly grinning face. I think we're frequently guilty of the digital equivalent.

    Just because we can store stuff doesn't mean we should. People who hoard physical items, filling their homes to their own exclusion, are rightly considered to need help. The expanding capacity of digital storage may simply be concealing our propensity to overestimate the significance of our existence.

    I have some stuff that I have specifically set aside for future generations - photos of relatives going back to the late Victorian period, for example - but most of the rest will evaporate with my own demise and if it were to vanish sooner it would not be a tragedy.

    1. imanidiot Silver badge

      Re: Preserve the meaning of our personal past

      The tricky thing is that it's hard to determine which (if any) of those "worthless" personal snaps and digital records turn out to have some value to a researcher in the far future.

      1. Lurko

        Re: Preserve the meaning of our personal past

        "The tricky thing is that it's hard to determine which (if any) of those "worthless" personal snaps and digital records turn out to have some value to a researcher in the far future."

        That's true, but we can still make some reasonable guesses at what they won't require. I was recently going through some 1880-1950 B&W family photos, and it's obvious that duplicates are of no value, near enough dupes are of no value, photos of people where the subjects are unknown are of next to no value, and photos of exceptionally poor quality are of little value. In some cases the location and date are unknown, meaning that a picture of an event has little or no value. And anything that is reliant on Aunt Edna being able to say "that's your great great uncle Rupert at a camp when he was out in Sudan in the 1920s" is at great risk of losing its value when Edna is no longer with us, unless that's written on the back of the photo, or a digitised version has it attached as some form of metadata.

        The proportion of actual information retained from centuries prior to the 19th is trivial - official and church records, newspapers, a tiny number of artists' pictures, a tinier proportion of books and diaries. Are we really any the worse off for not knowing in much personalised detail how grim most people's lives were in the 17th and 18th century? What will people of the 23rd century make of the bottomless petabytes of driveliferous data the 21st is in danger of leaving behind?

        If anybody wants to preserve an unmatched social archive of the early 21st C, then rather than seek to digitise each individual's personal records, all that's needed is to preserve some slices of the big social media sites. That way would keep news, attitudes, discourse, relationships, photos, personal life stories, all with their unique contexts.

        1. Paul Kinsler

          Re: photos of people where the subjects are unknown are of next to no value,

          Are you sure? It might be that some future digital archaeologist will be more interested in the clothes worn, the things carried, or even what might be shown - perhaps by accident - in the background of our photos than the notional "subject".

          1. Andy The Hat Silver badge

            Re: photos of people where the subjects are unknown are of next to no value,

            That's only nearly completely wrong by only referencing "people".

            I, for instance, have a lot of barely in focus pictures that in themselves and to most sane people are completely useless. However, referenced together with other pictures taken at the same time, it turns out that they actually contain key insect id information (a leg or a wing for instance). Without them the rest of the set has much less scientific value.

            I'm also currently searching for a grave - the *only* reference I have to it is a faded black and white photo of a flower-covered interment from the early 60's. No name or clues to who is buried there but important enough to be worth taking a photo of at the time. Completely irrelevant to anyone apart from my family (possibly) but may form a part of my family history or could be where the family silver is buried ...

            Basically you cannot predict what *may* be important to a future historian without actually being one ...

      2. doublelayer Silver badge

        Re: Preserve the meaning of our personal past

        With the size of the trove we're likely to give them already, it's a bit egotistical to assume that any person's data is likely to be of much interest. If we destroy 99.9% of the photos taken since 2000, we're still going to have orders of magnitude more of them than we had photos from any time before that. It really isn't the same as researchers looking at things from millennia ago, when creation of records was expensive even to generate a fragile, irreplaceable version. Nowadays, it's easy for one person to generate a hundred photos of daily life and to have copies on three continents in an hour. Researchers will have more of a problem digging through photographs to find useful photographs than they did digging through dirt to find objects.

        That is if they're still humans looking for something they didn't already know. The depiction of researchers in this forum makes them sound a lot more like aliens digging through the ruins of a planet we destroyed looking to understand how our civilization worked. That picture might approximate what modern archaeologists are doing to understand neolithic society, but I really think it's incorrect to imagine that future archaeologists' tactics with our society would look identical.

    2. that one in the corner Silver badge

      Re: Preserve the meaning of our personal past

      > People who hoard physical items, filling their homes to their own exclusion, are rightly considered to need help

      Between ascetic minimalism and the point where hoarders literally exclude themselves from their own homes there lies a very large range of options. The same behaviour can lead to dramatically different outcomes purely due to circumstance: the avid collector of foibles is either an interesting old duffer, whose glass cases of oddments are preserved for future visitors to his country house or a leathery corpse buried under piles in the bedroom of his two up, two down mid-terrace.

      With modern storage, we can all choose which route to take: and we can apply as much organisation as is useful & comfortable to ourselves, right now: if you would *like* to be able to find that photo of the funny kitten, you can decide how much effort it is worth it to yourself, today[1].

      But for the interests of the future:

      > but most of the rest will evaporate with my own demise and if it were to vanish sooner it would not be a tragedy.

      Find yourself an archaeologist and ask them to talk to you about "middens". Take a packed lunch, you will be there a while.

      The ill-regarded and the discarded are what paints a picture of everyday life: our future researchers won't be getting that from carefully preserved presentations from the movies or TV (if every pub exploded with the regularity of the Old Vic we'd need flak jackets to safely visit the village corner shop!). In terms of bulk personal digital storage, "discarded" is not so much the contents of the Windows Recycle Bin that you never bother to empty, it is more the files you put down, can not for the life of you remember where they were, but there is plenty of space left, just forget about them.

      Even if we haven't got the software tools to automagically organise and make sense of all this stuff today, I think we can safely act as though it will become available, one day.

      For now, we just need enough to keep ourselves supplied with virtual Victorian glass display cabinets - and have a Serious Think about how & where to keep all our old NAS devices so once we've popped our clogs they aren't just wiped.

      Maybe the future would *want* us to keep putting PCs and NAS drives into (designated) landfill, just so that they know which techno-middens to dig in once they have perfected the datamatic trowel and measuring stick.

      [1] FWIW these days I seem to spend up to a day a week - being retired - just adding cross-references and filling in blanks of "things that I won't forget, no need to write down" in my personal Wiki, which has been keeping track of things for me since the late 90s; all just to make it work for me when I do want to recall something. Oh, how I wish I'd done more of that cleaning up as I went along.

      1. abend0c4 Silver badge

        Re: Preserve the meaning of our personal past

        But for the interests of the future

        I think the nexus of this argument is in the intersection between "interest" and "benefit".

        I don't doubt almost anything could be of future interest. The Vindolanda Tablets are probably a good example of this kind of archival material - considered at the time too trivial either to keep or (effectively) destroy - but which have become more interesting with hindsight; whether we've significantly benefited from their accidental preservation is I think a more complicated question.

        To take a deliberately more provocative example: are we better off for the preservation of Shakespeare's plays? The initial instinct is to say that had they not been preserved we would have been deprived of some of the world's finest literature. But equally, if you look at the huge cultural bandwidth that is consumed even today by these historical works, you have to ask how much contemporary cultural creativity has been displaced over the intervening centuries by their dominance.

        And this is where my concern lies: there's only so much human attention available at any point in history. What makes history tractable is that much of the record has decayed. Are we better off with huge volumes of perpetual data that is beyond our capacity to even inspect? And in the insistence on preserving the trivia of our own pedestrian existences are we drowning out what might more likely be of future relevance?

      2. Korev Silver badge

        Re: Preserve the meaning of our personal past

        Maybe the future would *want* us to keep putting PCs and NAS drives into (designated) landfill, just so that they know which techno-middens to dig in once they have perfected the datamatic trowel and measuring stick.

        My computers and NAS have encrypted discs and the keys are in a password manager with the password only known to myself. If I've done my job correctly then bringing the data back would be impossible for a thief but also relatives or archeologists...

        1. Calum Morrison

          Re: Preserve the meaning of our personal past

          If computing continues to accelerate at the rates we've seen, a kid's toy of the future will have the power to decrypt all your strongest encryption. That's almost guaranteed.

    3. Oneman2Many

      Re: Preserve the meaning of our personal past

      "Do we really need to ?"

      Having put together a photo & video montage for 2 funerals this year and assisting a family member with somebody who has dementia, I can say with absolute certainty: do not delete anything. What may seem pointless right now can be super useful in the future for your own consumption if not for general public.

    4. jmch Silver badge

      Re: Preserve the meaning of our personal past

      "Just because we can store stuff doesn't mean we should"

      Absolutely this!!! I'm sure that somewhere in my parents' house there are still boxes full of my old school work and other such muck. Maybe one day my grandkids might come across it and wonder at what that flat white stuff with coloured marks on it is, and why pinching or pulling it will tear it or crumple it up rather than zoom in/out. Far more likely it hasn't been looked at since the end of whatever school year it was, and it's never really going to be looked at again - It might as well have been thrown out straight away.

      My hard drives are full of Terabytes of digital photos, many of which I will highly likely never look at again. Even my personal data is being generated at a far faster rate than I can ever hope to keep up with, let alone all of the stuff everyone else is generating that I might potentially be interested in. Search engines 20 years ago used to work on the principle of collecting as much data as they possibly could, because there was so little information on any topic that you had to collate it all to get a reasonable chance at finding what you want.

      Now we are in the opposite reality, there is far too much junk... and yet search engines continue to work in much the same way, collecting all the junk and ranking them according to which is the best SEO-optimised!! When rather, we need search engines that actively weed out sites that are auto-generated mirror images of each others' spouted SEO-optimised bullshit and remove them from the search results completely.

  4. JustAnotherDistro

    What an unexpected satisfaction it was to read this article, so beautifully written. Thank you, Mr Pesce and The Register, both.

    It captured, for me at least, how the creative minds that have given us an "Information Age" are coming to experience the universal generational griefs of aging and obsolescence. The imagery equating a lifetime of memories to an archive of mostly inaccessible or corrupted digital storage media--right on.

    1. Mike 137 Silver badge

      Seconded

      "What an unexpected satisfaction it was to read this article, so beautifully written"

      I agree -- articles that put current issues into clear context are increasingly rare.

      But I should add that not only digital information is volatile. Printed books are increasingly suffering from two effects that militate against durability. The first is poor quality binding. Almost all (even 'hardbacks') are now built from a stack of loose sheets embedded in glue at the spine edge. This cracks quite soon, particularly as it prevents the books being opened flat for reading unless the binding is severely strained. One quite important 1000 page 'hardback' in my possession had started to break up by the end of the first reading. The second effect is a growing tendency to pad informational texts with irrelevant fluff (quite possibly because in the modern commercial publishing model a book must have a minimum number of pages to be economic to print). However this reduces the value of the text as a whole because much of this 'noise' has to be skipped in order to extract the important information. For example, a book I recently acquired purported to be a long lost diary about conditions in a wartime army. Almost half the text, however, was about the feelings of the book's author on reading the said diary and researching for the book. And BTW the binding is both very stiff and weak as well.

      Clearly neither digital nor material media are now intended to last, so there's a great danger of our period of history appearing as a 'dark age' to future generations.

      1. doublelayer Silver badge

        Re: Seconded

        "Clearly neither digital nor material media are now intended to last, so there's a great danger of our period of history appearing as a 'dark age' to future generations."

        Rubbish. Our modern media is going to last much better than did old media. A while ago, there might be a couple archives of issues of a certain newspaper. If a fire broke out in one of them and the other one got flooded, that might be it for some historical issues. Nowadays, when a university wants to make an archive of a paper, they don't have to budget for a large room and then hope that their administrators will continue to budget for keeping it watertight and not on fire, but some space on a disk. And the next librarian will make a budget request for newer disks and copy the lot over, since the new disks are bigger than the old disks and that data might be useful sometimes. Such archives exist on all continents.

        That's just the deliberate archiving. I have documents of no significance from decades ago because they can be stored easily. Had they been on paper, it is certain that they would have been pulped at some point, probably as I moved from place to place. I've watched many others do the same thing. They'll take physical photographs with them, but the rest of the material is discarded as useless, which it probably is. If it turns out that some of it is useful, it is much more likely to exist because it was easy to keep it by inaction. As more people adopt cloud storage, that will only increase as it gets copied onto more resilient infrastructure.

        Lots of data is lost nowadays, but so much more is kept than ever was in the past.

        1. Anonymous Coward

          Re: Seconded

          Move up to a sufficiently large spacetime scale, and the entirety of Humanity is just a proverbial "flash in the pan".

        2. N Tropez

          Re: Seconded

          Obligatory xkcd - https://xkcd.com/1683/

          1. doublelayer Silver badge

            Re: Seconded

            Yes, I've seen that. That shows you that you can destroy digital data if you're careless and let lots of others archive it for you, and they end up doing that lazily, rather than doing it yourself. However, even compared to that, paper ends up the worse for it. Which is more likely to exist fifty years from now: a picture that was uploaded to Facebook or the photograph from one of my grandparents' houses that I collected after they died? I'll tell you: the former, because I have not digitized this bit of paper and am not planning to, and I'm pretty sure that if I die, it's going into the recycling bin. I'm not convinced that it will stay out of it when I'm alive. The picture will probably be in archives even if Facebook no longer exists.

            A researcher from that time would have the following options:

            Digital photograph: Okay, here's the best we have. People from the time would have been able to see a clear picture, but this one has been compressed and shrunk, so we'll have to analyze it more closely.

            Paper photograph: You see that mud? Part of that contains some fibers which came from a cardboard box. If you could retrieve that, some of the box was made from the paper of the photograph. The pigments on the paper were removed from it during the recycling process.

            Of course, who cares about a researcher from fifty years from now? They'll not need any of this. The fact remains, though, that the data must exist at that time in order to exist later, and it has to be preserved at all times. Copying the old social media photos archive to new media will probably be pretty easy in 2073, assuming disk sizes continue to grow. Storing all that paper is not very likely.

      2. Richard 12 Silver badge
        Boffin

        Re: Seconded

        The British Library (and the US Library of Congress) are tasked with preserving all that has been published within their regions.

        They get copies of all print media, and store them in huge, low-oxygen environments, with multiple indices for retrieval. They attempt to do similar with electronic publication, but have found it much harder due to the penchant for irrelevant crap like javascript.

        This is a relatively new idea.

        Searching their archive is a full-time job for researchers. For example, finding a few short stories written by a "T. Pratchett" in newspapers took two intrepid, experienced researchers several months. They knew the time period and likely publications, and had a physical cutting for one of the stories - but that was all.

        Had these been written before the Library started collecting daily newspapers, this task would have been utterly impossible.

        Had they been published on a website, if the Wayback Machine or similar indexed them then a simple text search would have found the initial cutting. Though whether they would have found the rest is a different question.

  5. Mike 137 Silver badge

    "not even AI or Google can help"

    Certainly not any online behemoth corp, nor indeed practically any web site. Cloud providers only retain information while [a] it's being paid for or [b] until the 'free' space is required for profitable purposes. When it comes to independent web sites, it's very evident that URLs often remain valid for no more than a couple of years (if for no other reason than the need to constantly 'update' to retain search engine position), so bookmarking in the browser is increasingly pointless and citing them in papers, reports and books is essentially a complete waste of time.

    1. Djwhite

      Re: "not even AI or Google can help"

      This issue has recently afflicted me. Spurred by a wonderful trawl through old hard drives, I thought I'd have a look through my bookmarks, see what I've brought forward with me from the past. Unfortunately most of them are now dead, the information that I considered important to mark the location of has disappeared into the ether. Try to find an old .mod file that you loved 20 years ago via Google using the vagaries of memory - "does it go la la la, or la de dah dah dah?" and it's impossible. AI is unlikely to be able to help in finding that .mod because how do you describe the feeling of music, or ambience of an old photo?

      I know that archive.org can be useful in some circumstances, but doesn't archive small, independently operated sites. Archive.is is handy, but how long will this last? Print as pdf can do a passable job, but fails to capture the context. As Mr Peace said, sometimes a head crashes, or in my case a girlfriend "helps" by binning the old computer that was cluttering the cupboard.

      1. doublelayer Silver badge

        Re: "not even AI or Google can help"

        If archive.org doesn't archive a site, it either means that it was told not to or it doesn't know that site exists. If you want it archived, put in the URL and it will continually rescan it for changes.
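
        A rough sketch of what "put in the URL" looks like in practice, assuming the Wayback Machine's public `https://web.archive.org/save/` prefix and its availability endpoint at `https://archive.org/wayback/available` behave the way they are commonly documented. The function names are made up for illustration, and the code only builds the URLs; it never touches the network.

```python
from urllib.parse import quote, urlencode


def save_request_url(page_url: str) -> str:
    """URL that asks the Wayback Machine to capture page_url when visited."""
    # Leave the usual URL punctuation alone; escape spaces and similar.
    return "https://web.archive.org/save/" + quote(page_url, safe=":/?&=%#")


def availability_query_url(page_url: str) -> str:
    """URL of the JSON endpoint that reports the closest existing snapshot."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


if __name__ == "__main__":
    # example.com stands in for whichever small site you want captured.
    print(save_request_url("https://example.com/old page.html"))
    print(availability_query_url("https://example.com/old page.html"))
```

        Opening the first URL in a browser requests a capture; the second tells you whether a snapshot already exists. How often the archive revisits a page afterwards is not something this sketch can promise.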

  6. matthewdjb

    I deleted all my emails older than ten years. I just didn't have the courage to put the cut off at 5 years.

  7. Bsquared

    As a stopgap - desktop search engine?

    Excellent article - thanks for that.

    I have a ton of research papers going back 30+ years as PDFs, and years ago, I started using Copernic desktop search engine. It's still going - the IP has changed hands and the company seems a bit scummy these days, and their product doesn't really do anything new or better than it did 15 years ago, but I still use v6 on a daily basis.

    Google did something similar, and possibly better, but then lost interest, as Google is wont to do. Desktop search engines seemed to have a brief vogue, before everything went Cloud-y, but I still use Copernic to have a fast, local, searchable index of all my files on all my local hard drives and NAS.

    What WAS that paper with the neurons in flies where they used tdTomato and laser axotomy? "drosophila tdtomato axotomy" [limit filetype to PDF] will give me the hit in a few seconds.

    (Yes, I know Microsoft will index your filesystem, but it's shit. It doesn't index INSIDE files, as far as I am aware, and Christ, it's slow...)

    Doesn't work so well for images, unless you name your image files well (and I confess I lack the discipline to do this properly), but it indexes ALL my emails, Word docs, PDFs, textfiles and more, and gives me very fast hits. The index on my SSD is 3.5GB in size, and Copernic took about 3 seconds to find the image files from when I saw "Shonen Knife" back in 2017.
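
    For anyone wondering why a local index answers in seconds, the core idea is an inverted index: a map from each word to the files that contain it. The sketch below is a toy illustration of that idea only, not a description of how Copernic works; it assumes the text has already been extracted from the PDFs and emails, and the paths and contents are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """documents maps path -> extracted text; returns word -> set of paths."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index


def search(index, query):
    """Return the sorted paths that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


if __name__ == "__main__":
    docs = {
        "papers/fly.pdf": "Laser axotomy of Drosophila neurons labelled with tdTomato",
        "notes/gig.txt": "Shonen Knife live 2017",
    }
    print(search(build_index(docs), "drosophila tdtomato axotomy"))
```

    The lookup cost depends on the number of query words rather than the number of files, which is why the index can be large on disk and still fast to query.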

  8. Chris Evans

    Where's that tab?

    "All of those tabs are important. I just can't really remember what's open anywhere any more – only that it's open. Somewhere."

    I know the tab I'm looking for is open in Firefox on my laptop, but even when I remember part of the URL or title, Firefox's 'search tabs' frequently doesn't find it. I really should report it as a bug.

  9. Anonymous Coward

    ...And Then There's The Problem Of Old Application Software....

    Good article.

    And reasonably easy to keep LOTS of stuff from way back when. And reasonably easy to search through files by name or date stamp.

    But one increasing problem has to do with file FORMATS. I've got Wordstar files from 1983 which need some post-processing in order to be readable. Or Micrographx Designer graphics from 1990 which are not capable of rendering at all (as far as I can tell). Or DOC files from 1990 which MS Word won't open today. Or Visio files which my Linux machines cannot render......

    .....and so on......

    So.....in spite of keeping lots of stuff from 1982 onwards (you know.....live files, local backups, off-site backups).....some of it simply cannot be rendered and viewed as it was when it was created. I suggest that this problem can only get worse over time.
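
    The WordStar "post-processing" is a fair example of how small the fix can be when the quirk is known. The sketch below assumes the commonly described behaviour of WordStar document files: bit 7 is set on some characters (for instance the last letter of a word, or "soft" carriage returns), control codes toggle things like bold, and Ctrl-Z pads the end of the file. Treat all of that as an assumption to check against your own files; dot commands are not handled, and the function name is invented.

```python
def wordstar_to_text(raw: bytes) -> str:
    """Best-effort plain text from WordStar document bytes (assumed layout)."""
    out = bytearray()
    for byte in raw:
        byte &= 0x7F                      # clear the high bit
        if byte == 0x1A:                  # Ctrl-Z: end-of-file padding
            break
        if 0x20 <= byte < 0x7F or byte in (0x09, 0x0A, 0x0D):
            out.append(byte)              # keep printable ASCII, tab, LF, CR
        # any other control code (formatting toggles) is dropped
    return out.decode("ascii")


if __name__ == "__main__":
    sample = b"Hell\xef worl\xe4\x8d\n\x02bold\x02\x1a\x1a"
    print(repr(wordstar_to_text(sample)))
```

    This recovers the words, not the layout, which is exactly the loss the comment above is describing.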

    1. doublelayer Silver badge

      Re: ...And Then There's The Problem Of Old Application Software....

      Yes, that will happen, but you have the option to retrieve their contents using a number of tools, including many open source tools for opening some file formats. If you cared about the content of those files, you could download Micrographx Designer, Microsoft Word from 1993, and Word Star, along with an old OS to run them on. With those, you can retrieve the content you're interested in. If you cared about the content, you could have done that at the time as well.

      1. that one in the corner Silver badge

        Re: ...And Then There's The Problem Of Old Application Software....

        > Yes, that will happen

        Gawd, I hate that attitude: the creators can't be bothered to keep their own software compatible and your response is to shrug your shoulders and lay the onus squarely on the user: "well, if you *cared* about it, you would have bought every update and made sure you spent countless hours loading and re-saving every file".

        And, of course, that should apply to *every* *single* *user*, no matter how techie they are - or aren't.

        The creators who simply can't be fagged to support their own file formats, not even to the point of, say, providing a way to dump to structured text (or just publish the damn format!) so that everyone has a fighting chance of recovering old files - those are really annoying.

        But that kind of "roll over and play dead for the pretty creators" attitude is simply loathsome. And that is when the attitude is pointed at other commentards! We Readers of The Register are the lucky ones, we are the ones that *know* creators will simply abandon us. To expect - nay, demand, else they clearly don't care - that understanding from all the Users out there is simply beyond the pale.

        1. doublelayer Silver badge

          Re: ...And Then There's The Problem Of Old Application Software....

          In some of those cases, the authors didn't decide to stop caring about the format, they stopped existing. But yes, it is your responsibility to export your data from formats if you decide you want to keep it in a modern one. You don't have to buy lots of new versions, because if you were able to generate the file, you can export it to a format supported by the version you already have. For example, you could open a file in the old version of Word and save the text as plain text, and you could probably have exported all of it as the pretty easily parsed RTF format. I am not asking anyone to buy new versions of anything, at the time or now, but I am asking them to go through the necessary steps if they decide they want to open old files in modern software. People posting here know perfectly well which formats are open or not and they knew that when they created those files.

          On that topic, we have your complaint about lack of backward compatibility in new versions. Yes, this annoys me, but not as much as it seems to annoy you. I do not particularly want to pay writers of modern software to implement compatibility with file formats that never show up. This is often involved in my choice of software or file format. If some software, for example, only supports a proprietary format, then I may decide not to use it. It is one reason why I dislike Apple's iWork set of Office software. While it supports exporting documents to formats that others can use, it will only let you edit files saved in Apple's proprietary formats. So I moved those icons away and used LibreOffice instead. People here know everything I knew when I made that decision, and I trust them to make a similar decision based on their own tolerance for having to convert files later.

          1. that one in the corner Silver badge

            Re: ...And Then There's The Problem Of Old Application Software....

            > I do not particularly want to pay writers of modern software to implement compatibility with file formats that never show up

            Neither do I, which is why I said:

            >> providing a way to dump to structured text (or just publish the damn format!)

            That is, when you are about to drop a file format, complete the job and dump the information you already have onto the public (especially easy since the rise of public repos - SourceForge to GitHub).

            Of course, you can argue that they never had that information to release in the first place - and I wouldn't doubt that many are in that position[1] - but then that gives you another data point to assess whether you really want to ever again rely on someone with that level of competence.

            > Yes, this annoys me, but not as much as it seems to annoy you... People posting here know perfectly well which formats are open or not and they knew that when they created those files.

            As I've said before, "People posting here" are the minority, the Cognoscenti, the Privileged Few - and lording your Privilege at the expense of all the other poor sods, the Ghastly Hoi Polloi who have just been dragged into this situation through no fault, and certainly no desire, of their own is the position that *actually* annoys me.

            [1] I have spent too many working days implementing dumps because other people in the same group haven't bothered to and, guess what, during code changes errors have crept into code used to write the data! Surprise! So we have to hunt down what those errors are (dump the data to text - and grep like your life depends upon it!).

            1. doublelayer Silver badge

              Re: ...And Then There's The Problem Of Old Application Software....

              Well, it annoys me when we treat those who don't have technical knowledge like they're unthinking morons at the whim of software writers. They have the option to learn how their equipment works, the same way that we did, and they don't get to avoid responsibility for doing that with anything else. For example, if one of them was recording analog video onto cassettes, they might want to know which format they were using, what players supported it, which televisions supported the video format their camera was writing to the tape, etc. That information was right there in the instructions, and the file format choices are right there in the save box. It does not take expert knowledge to try to use those things.

              The major benefit of computers is that, even if they chose a format they now regret, they have a pretty good chance of being able to recover it, even decades later.

              I have some schoolwork, on paper, stored somewhere at my parents' house. Well, actually I don't, because they threw it away. It probably wasn't in great condition when they did that. If I complained that I had lost my precious writing, the answer would not be to ask for paper that never degrades or that my parents run a great museum of my work, but to tell me that, if I want that paper to last, then I have to take care of it, store it safely, maybe make copies so it doesn't come down to the existence of one sheet, and don't leave the only copy in the care of people who don't find it worth anything. Had I cared about the paper, I would have done those things. Computer files are not different. Some effort is needed to keep it, and people do not always care enough to go to that effort.

    2. that one in the corner Silver badge

      Re: ...And Then There's The Problem Of Old Application Software....

      > I suggest that this problem can only get worse over time.

      Sadly, it will. And the impact that has is never considered by those responsible.

      Archivists are concerned about this and try to publicise the problems, for example Recommended File Formats for Digital Preservation, but not only is there the fight against inertia and "that is your problem, not mine, why are we paying you archivists?", but look at the list and notice the lack of "whatever is Word's current format" under word processing. Now, we all know of the news stories where one body or another announces that it will no longer use proprietary formats, and then Microsoft et al swoop in and spread the FUD (heck, look back over suggestions in Register comments that we'd be better off using Open/LibreOffice than MS Office and spot all the claims that that just isn't feasible, being polite about the arguments against).

      And that is just (!) when talking about Very Important Documents, ignoring all the data lying around which Society (and certainly Big Important People) considers trivial, but actually has a big impact on people's lives and is therefore actually worth preserving: digital entertainment, such as games. The death of Flash and the creation of the Flashpoint Archive is taken seriously by the Digital Preservation Coalition. Meanwhile, all sorts of bits of information are going missing because the people who *could* make it preservable, the software authors[1], consider it their right to be able to (effectively, and sometimes literally) chuck it away.

      "What", you say, "literally chuck our information away? Nobody does that!". Have you checked everything that is sitting on Someone Else's Computer?

      Records of interactions that you sometimes want a reminder of? Of course *you* religiously download your own copies of everything you might want to refer back to - but then you realise the email invoice just has the part number on it and you have all been relying on the online system's copy to have hyperlinked that part number to the full description page, until their sales system was Improved and all those links died: you still have the data but have just lost the means to interpret it, without which it is just so much noise.

      And in all of this, Register readers are the Privileged Ones: we know that They do these things to us and (hopefully) take the time and trouble to take appropriate measures.

      Meanwhile, the poor unsuspecting in The Sea Of Users out there are gambolling in the shallows, unaware that they are merely on top of a sandbank that can suddenly erode away beneath them.

      1. Richard 12 Silver badge

        Re: ...And Then There's The Problem Of Old Application Software....

        I am surprised to see "XML" in that list.

        XML is useless without the metadata about the meaning of the tags - most of which is never written down at all, let alone formally specified in another Level 1 format.

        And I'm shocked to see USD in that list. There's only one implementation of that format on the planet, and the documentation is incredibly obtuse.

        1. that one in the corner Silver badge

          Re: ...And Then There's The Problem Of Old Application Software....

          > I am surprised to see "XML" in that list...useless without the metadata...

          Indeed.

          Which illustrates that this area is still a work-in-progress.

          If enough of us, politely, pass on these observations to the people drawing up those lists, hopefully things can get better (although my contact in the DPC does a very good eyeroll when I get to ranty about the whole subject of metadata/DTDs/schema).

        2. Anonymous Coward

          Re: ...And Then There's The Problem Of Old Application Software....

          Mmm, some interesting ones from my POV that I've still got the scars from - TIFF (yes even TIFF) - we had to deal with a wave of old TIFF images generated by a (my memory is getting foggy after all these years) Kodak imaging component that had been bundled free with an old version of Windows and had a "non standard" compression method. When you go and read the spec. TIFF is a container for a whole lot of options that you can shove inside and still give it a .tif extension.

          Works perfectly and nobody's the wiser when the software's still around - all goes boom when you try and upgrade your Windows to a newer version which has dropped that support (that nobody knew there was a dependency on in the first place because... well, just because - herding cats).

          And then there were Microsoft Office Document Imaging and XPS - I could still rant at Olympic level about those.

          1. that one in the corner Silver badge
            Facepalm

            Re: ...And Then There's The Problem Of Old Application Software....

            > TIFF (yes even TIFF) ... TIFF is a container for a whole lot of options that you can shove inside and still give it a .tif extension.

            Aaaaaarrggggh TIFF!

            I have long, long been of the opinion that practically nobody who suggests anyone use TIFF can even say what its name expands out to, let alone what data formats it can contain! Tagged Image File Format - and even if Aldus (oops, Adobe) still play gate-keeper to the list of "known" tags that won't stop some clever-dick from reading and acting upon the Wikipedia article:

            > However, if there is little or no chance that TIFF files will escape a private environment, organizations and developers are encouraged to consider using TIFF tags in the "reusable" 65,000–65,535 range. There is no need to contact Adobe when using numbers in this range.

            And we all know what happens when you create data that isn't intended to "escape a private environment", especially if a salesman spots you using it..

            Even in the last year I recall seeing someone urge the use of TIFF "because it is uncompressed" - sob, bang head on desk.
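
            To make the "container" point concrete, here is a small sketch that reads the first directory of a classic TIFF and reports the Compression tag (259) plus any tags in the 65,000-65,535 range quoted above. The header and 12-byte entry layout follow the TIFF 6.0 description; the function names and the short table of compression codes are illustrative, only single inline SHORT/LONG values are decoded, and BigTIFF is ignored.

```python
import struct

COMPRESSION_TAG = 259  # baseline "Compression" tag
COMPRESSION_NAMES = {1: "none", 5: "LZW", 7: "JPEG", 32773: "PackBits"}


def first_ifd_tags(data: bytes) -> dict:
    """Map tag number -> inline value (or None) for the first directory."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")
    (count,) = struct.unpack_from(order + "H", data, offset)
    tags = {}
    for i in range(count):
        tag, ftype, n, raw = struct.unpack_from(
            order + "HHI4s", data, offset + 2 + 12 * i
        )
        if ftype == 3 and n == 1:        # one SHORT, left-justified in the field
            tags[tag] = struct.unpack(order + "H", raw[:2])[0]
        elif ftype == 4 and n == 1:      # one LONG fills the field
            tags[tag] = struct.unpack(order + "I", raw)[0]
        else:
            tags[tag] = None             # stored elsewhere or another type
    return tags


def describe(data: bytes):
    """Return (compression name, sorted tags in the 65000-65535 range)."""
    tags = first_ifd_tags(data)
    code = tags.get(COMPRESSION_TAG, 1)  # the spec's default is 1, uncompressed
    name = COMPRESSION_NAMES.get(code, f"unrecognised ({code})")
    private = sorted(t for t in tags if 65000 <= t <= 65535)
    return name, private
```

            Running `describe` over an old archive before the last machine that can open it is retired would at least tell you which files claim to be "just TIFF" while carrying something a newer reader may not understand.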

            Sorry, rant over[1] - and sorry for that list scraping over your scars.

            [1] Go on, tell us about XPS, we know you want to :-)

  10. heyrick Silver badge

    Not because the search engine fails

    Yes, it does.

    Quite frequently.

    If it isn't a popular site or something either sponsored or stuffed with adverts, it's fifty fifty as to whether or not it exists. I have, in the past, tried an exact search (in quotes) for something I had an actual screenshot of, only for Google to tell me there was nothing found. Sometimes I can coax Google into giving me a link to what I'm looking for by entering more of the text without quotes and wading through the bullshit, but other times I give up and use Bing.

    Bing is lame too, but different to Google, so between them I stand a chance of getting somewhere.

    1. heyrick Silver badge

      Re: Not because the search engine fails

      I should add, one of the main reasons that I started using Bing was because when you're trying to look up part numbers of weird Chinese chips inside things, Google has a horrible spam problem. Follow a link that looks, from the description, that it knows what the chip is and suddenly I'm at a different site, often with a .it domain, that is trying to get me interested in naked people (probably worse but my content blocker killed it). I tried Bing and actually found datasheets. In Chinese, of course, but some autotranslation got me the gist to know it was Yet Another 8051 Clone from the Middle Kingdom. Google? Had no idea, was being gamed, and didn't seem to care. I dunno, the quality of Google's search has nosedived in recent times. So, no, I'm not sure that "dump it in a big heap and let the search engine sort it out" is a viable proposal for anything.

      1. adam 40

        Re: Not because the search engine fails

        Thanks for the tip,

        now to find some Italian pr0n. All I need now is an obscure Chinese part number!

    2. Anonymous Coward

      Re: Not because the search engine fails

      Google's ability to find relevant stuff (as opposed to stuff for sale or as you say pages loaded with adverts) has dropped off remarkably in the past few years to the extent it's getting to be useless. I used to be able to enter a very niche term and it would come back with pages of links to sites of interest that I could trawl through for useful nuggets. Nowadays a few vaguely relevant links (mostly to business sites and wikipedia) and it gives up even trying - the old long tail just isn't even coming back any more.

  11. JoeCool Silver badge

    We can remember it for you wholesale

    (Just had to throw that out there)

    I've thought about this on and off for a while.

    It's why I have (or had) a download of Alta Vista Desktop.

    I've considered creating an AWS site and putting everything there, to see what Google can do.

    But what I think is needed is a "Digital assistant" that is invoked when the material is created, is aware of context, and that can create those associations. A form of Alexa ? an "AI" ?

    1. ecofeco Silver badge

      Re: We can remember it for you wholesale

      Excellent reference. Both title and AVS

  12. IceC0ld

    ok, TBH MY 'collection' of data isn't that large, a few terabytes is all, mainly movies / music, and a fair old number of photos too, add in the proverbial word .docx and for me, the search is too long in the taking, BUT, I was always one to imagine the issue was going to be the format going out of date

    I HAD several files for AMIPRO, and LOTUS 123 hanging around, nothing major, just a collection of 'How To's' from a different age, doing a different job, which today, I am unable to access

    and that is what I was always a little bit worried about

    what DO we do once, for example, MP4 for my movies is hosed, how then are we expected to access our memories ?

    I KNOW that storage is getting cheaper, but that too has to have a limit, but as I said, it was always how do I access a file when the format for it has long gone ?

    TBH am glad I am not at the PETA / EXA scale of storage, because I DO understand just how large they are, and then, I too would be lost, as in not only unable to read the file, but damn me if I could actually find it too :o)

    the future is weird and wonderful, I just hope it still has MY pictures and memories available so I can bore the tits off the grandkids LOL

    1. doublelayer Silver badge

      "what DO we do once, for example, MP4 for my movies is hosed, how then are we expected to access our memories ?"

      Download ffmpeg and use it to convert that to whatever format you like instead. It's a marvelous collection of codecs for AV formats. In reality, though, you tend not to even need that unless the format is particularly obscure, because your media player software likely natively supports thirty formats or so, unless it's already implemented its format system by baking ffmpeg into it which is quite common. Software doesn't expire; it gets old. People stop using it because something better is available, but it's usually there for you to get your data out if you want to go to the effort to do it.

      When data is stuck in an old format, it's usually so unimportant that nobody's done anything to get at it, not locked away from them for lack of a tool. There are exceptions for particularly proprietary formats or ones specifically built to keep the user from accessing it outside the provider's DRM, but you know that when you first start using that format, and there's usually some option you can take when you have the software to keep at least some of it.
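
      A minimal sketch of that conversion, assuming `ffmpeg` is installed and on the PATH. It relies on ffmpeg choosing the container from the output file's extension and picking its default codecs for it; the file names and function names are placeholders, and `-n` tells ffmpeg not to overwrite an existing output file.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_command(source: str, target_suffix: str = ".mkv") -> list[str]:
    """Build the ffmpeg argument list that re-wraps source in a new format."""
    src = Path(source)
    dst = src.with_suffix(target_suffix)   # same name, new extension
    return ["ffmpeg", "-n", "-i", str(src), str(dst)]


def convert(source: str, target_suffix: str = ".mkv") -> None:
    """Run the conversion; raises if ffmpeg is missing or the run fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(ffmpeg_command(source, target_suffix), check=True)


if __name__ == "__main__":
    # Only prints the command; call convert() to actually run it.
    print(" ".join(ffmpeg_command("holiday.mp4")))
```

      Looping that over a directory is the whole migration job for most personal collections, which is rather the point being made above.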

  13. Scott L. Burson

    You have HUNDREDS of browser tabs???

    Piker. I'm into the thousands.

    1. DexterWard

      Re: You have HUNDREDS of browser tabs???

      I don’t understand why you would have more than one or two tabs open at a time. Why not just bookmark stuff?

      I also don’t understand this obsession with hanging onto everything. Who cares if a 30 year old file isn’t readable any more?

      If the information is important, print it out. Paper will last much longer than any digital medium.

      1. imanidiot Silver badge

        Re: You have HUNDREDS of browser tabs???

        Not if you have silverfish in the house...

      2. that one in the corner Silver badge

        Re: You have HUNDREDS of browser tabs???

        > I don’t understand why you would have more than one or two tabs open at a time

        Right now, I'm working my way through talking to some sensors across I2C using Python - nothing World Changing, but I've not done it before. I have tabs open for the MCU specs, the sensor specs, the pin assignments for the dev board, the pin assignments that the software expects, the tutorial that I'm working through, the example code that tutorial is working towards, some more example code that relates more closely to the dev board I *actually* have, multiple pages in my personal Wiki so that I can quickly add a sentence - or a URL, with all the context that a "bookmark" totally fails to provide - about each item as and when I realise that would be useful. Plus a diary/logbook Wiki page into which I note down all the commands and actions I *actually* do whilst following the tutorial, because, hey, I'm fallible, I'll miss out a step by accident and need to backtrack. Oh, right now, I also have open The Register, as I'm taking a break, and sometimes a concert video in *another* tab, as I like a bit of music while I work.

        One or two tabs? Heck, I have these spread over multiple *windows*, both for convenient grouping (boring old specs sheets stay together) as well as to allow the logbook page to stay open on the portrait monitor.

        I guess I *could* keep closing and re-opening each tab, but that doesn't exactly make for a smooth workflow.

        None of this is meant to be boasting - as I said, I'm not doing anything groundbreaking here, just playing with a sensor (and, soon, blinky lights! Yay!) and, finally, finally, getting around to learning a tiny bit of Python, like the Cool Kids do. But hopefully it illustrates that having more than two tabs open *can* be productive. This is all the sort of thing that I do *now* to keep on top of stuff - and I *really* wish that I'd figured it out (and had the physical means to do it) a *lot* earlier, when I was still working, especially in the early days. So much stuff would not have been forgotten - and would not have had to be re-implemented purely because it had been forgotten.

        > Why not just bookmark stuff?

        Because the browser's bookmarks list is utterly dreadful for organising and providing context for the URLs; no, adding folders of bookmarks isn't The Answer: which folder does that really belong in? It is great for keeping track of a dozen or so items, like the *correct* URL to keep up with The Register, the easiest way to read XKCD or the homepage for remote-controlling the DVR[0], but the URL for the tutorial I mentioned above is, for my purposes, important in at least three distinct contexts (two for the hardware it uses - and one just for "this is what I did in 2023"!); expressing that in a bookmark list just won't fly[1].

        And by the time you've bookmarked a hundred items or more you are starting to spend time just searching for the URL in that list. Assuming you even remember that there is anything worth searching for.

        You could start to edit your bookmarks, removing things that, say, you haven't looked at in six months. I won't do that, because, if this sensor and lights thing works then the hardware will last for years - but I'll want all the records about how it works, why it *doesn't* do "something" when asked to "make it do something different from last time".

        > Who cares if a 30 year old file isn’t readable any more?

        How about putting together a memory board (and in case you want to dismiss those, consider funerals and dementia) - and you are handed a file from a distant, and thoroughly non-techie, relative of the memorialised, who happened to remember he had this sitting in an old directory buried on his hard drive: it is all the notes from when we got together 30 years ago and first had the ideas which she surprised us all with when she turned them into that book, the one that made her World Famous in Richdale. That file obviously means bugger all to you, Dexter, but now they know it still exists, it would mean absolutely everything to that group of people.

        [0] grr - El Reg won't allow an href to 192.168.254.40 for that DVR homepage; it is obviously used to commentards trying to be "clever"!

        [1] which is why I use a little personal Wiki: I can add a note about *why* I've saved this URL. I can add it into lots of different places, I can reorganise at my leisure. Other solutions for organising URLs are available, even websites (which are just single-use Wikis, aren't they? With the advantage that you are sharing all that information about your interests with the website operator).
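
        The "same URL, several contexts, each with its own note" idea is easy to model, which makes it all the odder that browser bookmarks don't. The sketch below is a hypothetical illustration of that structure, not how any particular wiki stores things; the class names, the example URL and the notes are all invented.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Link:
    url: str
    notes: dict = field(default_factory=dict)  # context -> why it matters there


class LinkBook:
    """Stores each URL once but files it under any number of contexts."""

    def __init__(self):
        self._links = {}
        self._by_context = defaultdict(list)

    def add(self, url, context, note):
        link = self._links.setdefault(url, Link(url))
        if context not in link.notes:
            self._by_context[context].append(url)
        link.notes[context] = note

    def in_context(self, context):
        """All (url, note) pairs filed under one context, in insertion order."""
        return [
            (url, self._links[url].notes[context])
            for url in self._by_context.get(context, [])
        ]

    def contexts_for(self, url):
        """Every context a URL has been filed under, sorted by name."""
        return sorted(self._links[url].notes)


if __name__ == "__main__":
    book = LinkBook()
    tutorial = "https://example.com/i2c-tutorial"
    book.add(tutorial, "sensor", "register map explained halfway down")
    book.add(tutorial, "dev board", "pin assignments differ from my board")
    book.add(tutorial, "2023 logbook", "what I followed for the blinky lights")
    print(book.contexts_for(tutorial))
```

        A folder tree forces one answer to "where does this belong?"; here the question never has to be answered.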

  14. clyde666

    relevance or interest

    Decades ago I was tasked with clearing out an old bank vault.

    Amongst all the usual accounts stuff, there were boxes of info going back to the start of the second world war. There was a trove of official documents detailing how the local bank branches were to organise to survive intense bombing.

    The gist was that every day copies of absolutely everything were to be made and sent by train to the next town up the road. Copies of everything. By hand. Now that was true 'back up procedures'.

    They in turn would receive the 'back ups' from another bank in the opposite direction.

    There was a large amount of detail written down.

    To my young eyes this was fascinating. Sadly I believe it was all burnt.

    So even if we put in lots of effort to save stuff now, how do we know how it will be treated in the future?

  15. Bebu Silver badge

    Just let it go....

    Over the last few years I have come to accept that trying to hold on to all this stuff is at best a distraction.

    To be honest I never did see the point of the literally 1000s of digital photographs stored in devices which could be write-only for all the difference it would make. When I state "If you cannot remember it, it never happened. When you have Alzheimer's anybody's photo album would serve just as well." people get really huffy - funny that.

    As a gift to future historians - let the blighters work for it. If the current crop's revisionism is any indication the less factual material they possess the better they will like it. I guess people have always reconstructed the past to pander to their present.

    The more technical information will be of interest to industrial archaeologists but this stuff has always been lost over time. The secrets of Roman concrete I believe were just recently rediscovered and the Antikythera mechanism is still an enigma. I imagine if the bits of Babbage's Difference and Analytical Engines in the Science Museum were excavated two millennia hence they would equally be a puzzle. After two thousand years I suspect very little of our technology would survive save perhaps optical fiber. CRT tubes would but LCD screens probably not.

    When I think of the late C20 and the first two decades of the C21 I worry that the deep future, having an accurate picture of this period, might be tempted to load up their TARDISes and send them back to nuke the early C21.

    Although it strikes me that gathering all your files into a location where they constituted the (sole?) training set for a very large LLM might solve the navigation and location problem, as I imagine the trained LLM would roughly mimic the brain's storage of such material.

    Eccl. 3:6 A time to get, and a time to lose; a time to keep, and a time to cast away;

    1. Dinanziame Silver badge

      Re: Just let it go....

      My wife recently asked for pictures from 14 years ago. For a while, I thought we could only get them from a NAS that I disconnected 2-3 years ago for lack of use, but was still keeping in the basement. It turned out that I didn't have a power cable that would fit, and that I would have to buy one somewhere... But then she said that she had somehow posted the pictures to Facebook 8 years back, so we went back to look for that post and found that it was a link to a Picasa album, now surviving in Google Photos on the account of our son. So the NAS went back to the basement, still containing some unfinished research articles, my old collection of MP3s and possibly the code for my PhD thesis. We got rid long ago of the server I bought when I was in university, which boasted a whole TB of disk, and which demonstrated its lack of usefulness by not being turned on for so long it wouldn't boot anymore.

    2. Francis Boyle

      Re: Just let it go....

      There's a quote attributed to all the usual suspects that goes something like 'I apologize for such a long letter - I didn't have time to write a short one.' That pretty much sums up my attitude to archiving – I simply don't have time to go through all my stuff trying to determine what I might want at some undetermined point in the future. It's always easier to just buy a bigger hard drive.

  16. LybsterRoy Silver badge

    As often, there is a short SF story. It's about the collapse of a civilization caused by the loss of the index to the index to the index. Only problem is it's so long ago I read it I've lost the name of the author and the title of the story. Re-reading all my short story collections to find it may take a while. Please stand by for an update.

  17. Rich 2 Silver badge

    Not a new problem

    I remember sometime back in the '80s (maybe) there was a report published by Fujitsu (I think?) regarding data storage.

    One of the conclusions of that report was that something like 70% of all stored data was effectively useless because it was either impossible to retrieve at all, or the cost and effort of retrieving it was not worth it.

  18. davefitz

    re: Short SF story

    LybsterRoy, you're thinking of Hal Draper's "MS Fnd in a Lbry" (December 1961 F&SF), which has since wound up in the occasional cautionary tale of bibliomania. I guess the problem in the story is a failure to reconstruct the FAT for the unimaginably vast but physically tiny superstorage. I've rebuilt terabyte drives overnight with no problem; how long for a yottabyte or two?

    Slightly more seriously, Kurt Gödel once suggested a form of data storage: multiply data values by prime numbers and add them up to a sum, then divide by the prime numbers to reconstruct the data set. Wouldn't it be more plausible to test many (as in, thousands of) short strings against a data set and keep the ones that approximate its total, then subtract and repeat? Eventually, you'd have a fairly short stack of instructions, something like "this 20-digit string repeated 47,565 times, add this number to this other 18-digit string repeated 13,430 times", and the sum would reconstruct the dataset. Gödel's idea requires massive computing on both ends; my suggestion would take awesome power to compress but hardly any to reconstruct. Clearly it is wrongbaddumb or commenters would be doing it by now. Any takers?
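
    For anyone wanting to try it, the scheme usually called Gödel numbering differs a little from the description above: as far as I know it raises successive primes to powers given by the data and multiplies them together, and decoding means dividing the primes back out. Below is a minimal Python sketch under that assumption. The function names are my own (hypothetical), and the +1 offset on each exponent is my choice so that zero values survive the round trip. Running the demo at the bottom on a few bytes should show the encoded number needing far more bits than the input, which suggests this is an encoding rather than a compression scheme.

```python
def first_primes(n):
    """Return the first n primes by trial division (fine for a short demo)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_encode(values):
    """Encode non-negative ints as the product of p_i ** (v_i + 1)."""
    number = 1
    for p, v in zip(first_primes(len(values)), values):
        number *= p ** (v + 1)
    return number


def godel_decode(number):
    """Recover the values by dividing out 2, 3, 5, ... in turn.

    Assumes the input came from godel_encode. In that case every prime used
    has exponent >= 1, and composites never divide what is left, so an
    exponent of 0 simply means "not a position".
    """
    values = []
    candidate = 2
    while number > 1:
        exponent = 0
        while number % candidate == 0:
            number //= candidate
            exponent += 1
        if exponent > 0:
            values.append(exponent - 1)
        candidate += 1
    return values


if __name__ == "__main__":
    data = list(b"hello")
    encoded = godel_encode(data)
    print("input bits:  ", 8 * len(data))
    print("encoded bits:", encoded.bit_length())
    print("round trip ok:", godel_decode(encoded) == data)
```

    Decoding here is cheap only because the primes are tiny and known in advance; the cost sits in the sheer size of the number being stored.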
