Oi!
"Sometimes it’s rhubarb crumble."
I'd far rather have that than the abomination that is bread-and-butter pudding (with or without added chocolate).
The UK’s National Archives in Kew have enough gems and hidden secrets to keep Indiana Jones or Robert Langdon in sequels for the next couple of centuries, with everything from the Domesday book to the UK’s official UFO records socked away safely in its sanctum. But for the swashbuckling archaeological hero of the future, whips …
Unfortunately, given how hardware and software change over the years, it seems to me the only real way to make sure all this stuff is archived for future generations is to print it. We can read the Domesday book from a thousand years ago, because it's on paper*. We're having trouble reading the documents from the Olympics 16 months ago because they're in an electronic format that we weren't prepared for.
* or sheep skin or whatever it was back then.
Yes, but all that parchment was fearfully expensive. So you had acts of vandalism such as the works of Archimedes being erased and replaced with minor religious texts.
Centuries later, similar things happened with BBC videotapes of Hancock's Half Hour and Dr Who. The Sage of East Cheam turned into an un-person!
"Intelligent computers" (if that is not an oxymoron) should be able to sort out Word 2, WordPerfect 5, etc. LaTeX can be more difficult if there are dozens of 'include' files which have been 'lost'. Spreadsheets and databases may take a little longer. Does anyone really want those PowerPoint nonsenses? (Yes, if you are a serious archivist.) Then there are AbiWord and Gnumeric, Amstrad LocoScript, ... And there is an awful lot of image-mode PDF lurking about.
"given how hardware and software change over the years, it seems to me the only real way to make sure all this stuff is archived for future generations is to print it."
That is why the UK government should mandate the use of a properly specified document format such as .odt (Open Document Format). The use of proprietary or poorly documented formats, such as the various ever-changing Microsoft .doc/.docx formats, should be outlawed (they are not properly documented, in spite of Microsoft committing fraud by buying enough ISO voting countries to be awarded its ISO number).
The same should be applied to other storage mechanisms, eg: CAD & GIS - proprietary formats abound and the major vendors frequently change them so that users constantly need to buy new versions to be able to read what people send them.
But I attended a talk by the company responsible for running http://www.legislation.gov.uk/.
It was a fascinating description of the problems of converting centuries-old documents into an XML format and then using that to create a sophisticated searchable database. They also have to turn around new legislation quite quickly.
The result is a great public resource containing all present UK legislation, going back to the earliest one still on record (1267, since you ask).
It also goes to show that not all government IT projects are disasters waiting to happen.
Since it's already been mentioned that, for various reasons, long-term storage on digital media is going to be an issue, not least because of media degradation, maybe we could go a slightly different route.
Since modern papers aren't much better than tapes & the like, why not use metal storage? Something like a stainless steel cartridge version of the old punch tapes or Jacquard cards & a high-speed reader; after all, most of it just sits there for years doing nowt.
The problem here is mirrored all over the world at the moment in all fields. Will there be anything left of 1990 onwards in 800 years? Probably not. Bar buildings, maybe, we are producing so little that's tangible, and what we do produce is generally designed to fail, including the tape drives and other devices Kew needs to use.
We might be, in a later historian's eyes, the mysterious denizens of Dark Ages II: The Search for a .DOCX Converter... kinda sad if so.
"You can read most of the content in a DOCX file with notepad if you wish to."
Maybe so if you have a pre-historic binary computer to hand. Here in the 69th century we don't. We do conveniently have a time machine, but its use is restricted to making frivolous comments on El'Reg, not historical research.
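For anyone stuck in the present: strictly speaking a .docx is a ZIP container, so Notepad on the raw file mostly shows compressed bytes, but the text does sit in a plain XML part (word/document.xml) once unzipped. A minimal sketch in Python, standard library only. The helper names docx_text and make_sample_docx are made up for illustration, and the sample file is a stripped-down stand-in rather than a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_text(data: bytes) -> list[str]:
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


def make_sample_docx() -> bytes:
    """Build a hypothetical, minimal stand-in for a .docx (one XML part only)."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Domesday</w:t></w:r><w:r><w:t> Book</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Kew</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


if __name__ == "__main__":
    for line in docx_text(make_sample_docx()):
        print(line)
```

So the future historian needs an unzip tool and an XML parser, not Word itself. Whether either survives to the 69th century is another matter.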
"Since modern papers aren't much better than tapes & the like why not use metal storage?"
Paper is fine. If it's acid-free archive stock, not the bleached rubbish most people feed through their printers.
Plenty of paper records have survived through history because they were made of substantial paper stock (they couldn't make it any thinner with contemporary technology), and the surviving ones are the examples where paper makers struck on a pH-neutral or slightly alkaline formula, giving the paper a stable chemistry (whether they realised it or not at the time). All covered by ISO 9706. One manufacturer claims a 200-year guarantee (provided it's stored properly, not in the bathroom), which I think shows remarkable faith in their business prospects! We can rest easy knowing our descendants will be able to sue their descendants if our archive crumbles to dust in a mere 195 years...
That said, a ream of that stuff costs £20+, so a substantial markup on normal 80gsm office fodder. You wouldn't want to go printing the Internet on it...
We should be archiving it on the moon, so it survives any Earth based disaster that's small enough to leave the moon's surface unharmed. eg. climate change, many levels of asteroid impact, supervolcano, gamma ray burst, etc. Even if all life on Earth is wiped out, the sum of our knowledge will remain available for any life that comes along later, either from elsewhere or starting from scratch locally.
I propose a system that starts with extremely large scale symbols, visible to the naked eye from the surface of the Earth - enough to inspire curiosity, so as to encourage further investigation. Then smaller symbols, still large enough to be seen with primitive optics, which explain some basic science. As concepts get more advanced, the symbols can be smaller, as they'll have already explained how to develop better telescopes. Smaller symbols can be duplicated over more of the moon's surface to allow for redundancy. At the stage where rocketry, orbital mechanics, etc. are sufficiently explained, everything else can be stored in a bunch of duplicated vaults - readily available for direct physical examination.
Large symbols can be written with nuclear weapons and smaller ones with orbital lasers and rovers.
/bosh
"And what language do you propose we write it in? Hieroglyphs?"
A big fuck-off barcode, which will require the construction of a big fuck-off barcode reader, which will emit a big fuck-off beep when future generations scan the moon.
And then a really big fuck-off "unexpected item in bagging area".
"They probably slap 'copyright Google' over all the digitised data and then charge us a fortune to access their copyright data."
Scarily, that would almost certainly be legal for them to do without a carefully worded contract. The originals would still be available to personal visitors, but Google's derivative work (scanned + maybe processed) would be their own copyright.
I recall obtaining what was left of my grandfather's military record from the National Archive. Much had gone up in flames in the Blitz, but there was still his medal card, giving his service number, and that let me find when his Military Medal was gazetted.
It's all very well to talk about fire protection and floods, and there's the implication that data on computer systems can be backed up, but, while we don't need to worry about the Luftwaffe any more, does it really make sense for the Dark Archive to be the only copy of the data? Maybe we don't need to know where the back-ups are, but I hope there is an off-site back-up.
Thanks for the article and comments. Just a couple of responses to comments. TNA has an off-site backup somewhere in England. I'm not sure printing to paper is helpful: TNA has almost as many pages of digital data as it does physical, so that would be a huge storage problem. Also, many IT formats such as websites are hard to print in a meaningful way. As for metal storage, TNA has some 1940s audio recordings on metal tape. Can we read them? No we can't.
And pubs - try the Express just across Kew Bridge.
David Thomas
A warehouse full of microfilm, plus some instructions on building a reader seems to my little head to be the best 'backup' strategy of still being able to read a document even if 'digital' file formats change. Ultimately if future generations can manage to build a microscope/magnifying glass and a light source, the data should be reconstructible.
Balanced against that is the chance of anyone bothering to read it - so just as effectively "lost"
Having information like the census data online means it can be data mined to research patterns of immigration, social mobility etc etc. Having it on 1000s of microfilms in different places is more secure but means the data will almost certainly never be accessed or used.
Wonderful article about a national institution we can be proud of - an example of what the civil service can do when Ministers don't interfere. I wish it happened more often. Sadly, inevitably, some ministerial berk will set targets for making all parchments digital by default, scanning them at a rate of 500 pages a day, and then shredding all the 'hard copy'.
Suffice to say, it is most likely that all of the server infrastructure is easily replaceable these days and, just as importantly, that the software configurations of said servers are repeatable. No one rolls out serious amounts of infrastructure without configuration management tools these days.
The article mentions a little device called a StorageTek SL3000. A little reading would tell you that this device enables them to write to many diverse types of tape media, and have many copies on each type of media. It also allows you to export tapes, which can then be sent off site for later retrieval. Not to mention that it has facilities to constantly check and evaluate the use and wear of the tapes, drives and robots, with intelligent migration. So I would expect that TNA are making use of such facilities to make sure their archive is available for the foreseeable future under any circumstance.
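To be clear, I have no idea what TNA or the SL3000 actually run internally, so this is not their method. The general idea behind keeping several copies and checking them can be sketched as a simple fixity check, though: record a checksum for every file, then periodically compare each copy against it. A rough Python sketch, where build_manifest and find_damaged are hypothetical names and the "copies" are just in-memory dictionaries standing in for tapes:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Record a checksum per file name at the time of ingest."""
    return {name: sha256_hex(data) for name, data in files.items()}


def find_damaged(manifest: dict[str, str], copy: dict[str, bytes]) -> list[str]:
    """Names that are missing from a copy or whose contents no longer match."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in copy or sha256_hex(copy[name]) != digest
    )


if __name__ == "__main__":
    originals = {"census.xml": b"<census/>", "medal-card.tif": b"\x49\x49\x2a\x00"}
    manifest = build_manifest(originals)

    # Simulate one copy that has suffered a flipped byte and a lost file.
    tape_copy = {"census.xml": b"<cenzus/>"}
    print(find_damaged(manifest, tape_copy))
```

Any file flagged on one copy can then be restored from another copy that still matches the manifest, which is the whole point of having more than one.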
Regarding putting things onto microfilm: that is ludicrous, as it often suffers from the same degradation problems as paper, or worse. Several large archives have announced projects to migrate microfilm to digitised records.
"The catalogue itself began to move online in 1998, the first national archive in the world to do so."
The United States National Archives and Records Administration had begun to move its catalog online well before 1998. See this entry from the Internet Archive from June 1997:
https://web.archive.org/web/19970606073326/http://www.nara.gov/nara/nail.html