Could've spent that money on hiring QA staff
Just sayin'.
Microsoft's GitHub on Thursday said that earlier this month it successfully deposited a snapshot of recently active GitHub public code repositories to an underground vault on the Norwegian archipelago of Svalbard. GitHub captured every repo with at least one star and any commits dating back a year from February 2, 2020, and …
"At the time, David Rosenthal, a veteran of Sun Microsystems and Nvidia and the co-creator of Stanford's LOCKSS [Lots Of Copies Keeps Stuff Safe] digital preservation program, expressed skepticism that anyone beyond the current generation will ever find the code useful."
Future generations will be interested, especially in the more mundane examples. They probably won't use it in production, but it will still be of historical interest.
I am trying to imagine a post-apocalyptic world in which, e.g., power stations survived, networking infrastructures remained useful, and enough computers remained intact. This would suggest to me that a lot of the other infrastructure of developed societies also remained intact. I can't reconcile all that with a need for the deep frozen GitHub. Similarly, if there were a "need" for the deep frozen GitHub, what would it be needed for? I don't think 7nm line width fab would be a priority in a world destroyed that comprehensively.
In this new society, culture will need to be remembered or rebuilt. The only "comedy" media that survived were a scratched copy of the Jim Davidson DVD "Sinderella Comes Again: Live" and an overly-compressed torrented copy of the first four seasons of The Big Bang Theory on a thumbdrive that was inexplicably discovered inside a fridge along with seventeen sealed tins of baked beans, of every brand available for purchase in the United Kingdom of Greater London and Surrey at the time of The Event.
The wellspring of Comedy 2.0 will be some of the particularly piss-poor code found in the archive, and comments containing the words "this should never happen" or "downloads the file at the URL using the HTTP client" above an HttpClient.DownloadFile(url); method call.
Mostly self-aggrandisement, with hints of appearing to look out for user assets.
Look how important and great we are! Look how far we'll go to secure your data!
Copying the data to tape (as long as any needed binaries aren't more than 100KB) and sticking it all in a deep hole is far easier than setting up globally and regionally redundant, resilient, fail-over capable storage with effective disaster recovery.
Why do people always assume there has to be some kind of apocalypse for these kinds of projects to be relevant? Throughout human history, civilisations have risen and fallen, technologies have been forgotten and rediscovered, but at no point has there been any world-ending apocalypse that wiped out our entire technology base and sent us back to the stone age. And yet we still find digging up artifacts from 1000 years ago to be interesting and informative, because while civilisation as a whole has continued uninterrupted, the details of exactly what was going on back then have been lost.
It's silly to criticise things like this because cavemen wouldn't be able to understand it, or people trying to rebuild civilisation wouldn't be able to make use of it. Assuming it actually survives and is looked at in the future, by far the most likely scenario would be archaeologists with a higher overall technology base but lacking the actual details of how all this stuff works. In that case, a deliberately preserved archive is going to be far more useful than occasional scattered references found from digging through landfills, which is essentially what archaeologists have to go on today.
Obviously if you don't find it worthwhile to preserve a bunch of Github code for future generations then it's still going to be a wasted effort. But that's a very different argument than simply dismissing it because it wouldn't be useful to cavemen after a nuclear apocalypse, which is simply not a scenario it's intended for in the first place.
I would recommend you read Isaac Asimov's Nightfall. It's frightening and depressing at the same time.
As for Microsoft, I'll have to dispose of my Flight Simulator CD because when my Windows XP PC dies, there will be no OS that can run the installer. 10,000 years? Surely you can't be serious!
It's basically Gentoo everything. First, you retrieve the source for an operating system, Linux for example. This needs various libraries, so you find those too. These need to be compiled, so you retrieve a C compiler. Then you realize that you don't have anything to run on and the compiler's also written in C. Then, you write your own language and compiler for whatever computer you have found, or you use whatever programming language is on the surviving machine available. So basically it would only be useful in a very weird catastrophe. Maybe we should have someone write a book called "How to build a computer out of rocks that knows how to execute some instruction set we designed for computers built with lasers" and put that in the archive too.
Following an apocalyptic crisis, people will get out from their underground caves, grab their Windows PC and log in to Microsoft acco... Oh, crap! License expired centuries ago. Let's send smoke signals to the support team. In the meantime, let's cut some trees to build a raft and send it to that darn island. We need that damn JS code by tomorrow.
Luckily it's all JS code because virtually nobody these days can read FORTRAN or APL, but in 500 years people will still be coding in JavaScript?
Hopefully someone has added a programming Rosetta Stone to the collection. Wait, let me pull some punched cards out of my coat pocket and add them to the archives.
I met an engineer with an antique plan
Who said, "Two vast and trunkless archives of Git
Sat on the backup drive. Near it, on the wall,
Half torn, a readme printout lies, whose text,
And wrinkled paper, and sneer of cold command-line,
Tell that its author well those manuals read
Which yet survive, stamped on these pointless things,
The hand that typed them, and the drive that sped;
And on the printout, these words appear:
My Github is 0zym4nd145, Coder of Coders;
Look on my l33t Works, ye Mighty, and despair!
Nothing beside remains. Round the decay
Of that colossal disk store, crashed and burnt,
The lone and level tunnel floors stretch far away."
Yes. Because, a thousand years from now, people will know to go to Svalbard and dig up a number of reels that hold code that was written by IT neanderthals.
Honestly, in a thousand years, if anyone wants to actually consult this code, they'll likely need to rebuild the tape readers from scratch.
Then they'll find out that the tape has decomposed beyond its ability to retain the data.
Well done everyone. Great idea to use magnetic tape instead of optical discs. At least optical would likely last longer and wouldn't be subject to any modification of the position of the magnetic North Pole.
But a bit less daft might be to translate things like the Foxfire books
https://en.wikipedia.org/wiki/Foxfire_(magazine)#Books
to something like Ikea directions, print them on similarly robust media, and "plant" them in a number of places (physical LOCKSS archives).
Not that it will help when the answer to Life, the Universe, and Everything requires a .DLL (or .so) that somehow was missed.
_maybe_ this time people will have become less of a laughing stock in the universe, but I wouldn't bet on it.
Of course, it would require that the Disney lawyers haven't managed to extend copyright to "life of the solar system plus 70 years".
What we need to keep for posterity is knowledge, proven facts and plausible theories.
Code... Pha!! I spit!! "If not then this, else this" bullshit. Who the fuck has the inclination to sift through that clusterfuck in 30,000 years? Code is worthless in itself. The logic behind it is better expressed outside of code, and might not be worthless, AND might be worth preserving for posterity.
Haha, the very nature of "modern" web apps dragging in dependencies from NPM, PIP, crates.io, CPAN, etc... ensures that the bits of code in the vault will be incomplete and worthless.
These dependencies rack up so much technical debt that these projects won't be buildable in 5 years, let alone 500+. Chuck in a bit of x86_64 Linux Hypervisor Docker images and it pretty much ensures the solution cannot run on the processors of the future.
to be interesting.
Historians of the future will no doubt pore over this set of source listings, gleaning knowledge and theories about the lives of the programmers, managers and users of each project.
It's no different to how archaeologists of today carefully sift through the baked, buried and shattered remains of clay tablets that were used to record daily transactions - originally intended to be wiped each day.
From that, we know what people ate, drank, how much they paid (or were paid), some idea of the debts people commonly had and much more.
There's a few thousand lines of my code and comments in there. It's unlikely anyone will read it, but if they do, I hope they learn something about the way we lived last year.
If I was going to put 21 TB of data somewhere for the benefit of historians, it wouldn't be code, or at least relatively little of it would be. Code may tell some how a few of us thought, but it doesn't show much about how we lived except for the readme files. Similarly, if there are translation files in there it might help as a sort of rosetta stone, but that's getting to the goal by quite an inefficient path. A lot of code will look like all the rest of code, moving data chunks around. It won't help historians very much to have driver code for fifty open source hardware platforms that no longer exist. Here's what I would include instead:
Translations of various texts into most languages, trying to ensure that most subjects are covered (technical, legal, scientific, narrative story, and the most important basic description of something likely to continue to exist later on such as the water cycle). This helps with the inevitable language problem.
Dictionaries of all the languages we've included, which helps with extra words when they've figured out the basics.
Books on geography and astronomy, which help clarify what the planet was like when we were around.
Textbooks for most subjects at various educational levels which provide a summary of what we knew or at least what we thought we knew.
Descriptions of everyday life written by people who have been instructed to provide every detail and, to best ensure this, who are describing the lives of people who live quite differently from themselves.
And, since I've probably missed several important things, let's just throw the entire contents of Wikipedia in there.
There's my suggestion, and that probably fits just fine in a single terabyte; at least text-only Wikipedia certainly does and that's probably the largest chunk in the set. It's not perfect by any means, but if I had to figure out what life was like a thousand years ago, I'd rather have had their encyclopedias than a library written in an invented language that reads from devices implementing an arbitrary communications protocol to read chips with another arbitrary protocol.
Don't forget art and music. The music should be in the form of musical notation (when appropriate) and uncompressed sound files, and at least part of it should be analogue recordings (like LP discs engraved on some ultra-durable material), so it can be played without having to reinvent advanced digital technology.
Else we get the situation described in one classic science fiction short story: isolated colonists on a distant planet had an extremely rosy view of human culture, because they were supplied with glowing encyclopedia descriptions of art and music, not examples of the art itself.
Back in the days when "digital" meant which finger was used to move the beads on the abacus, there was a short post-apocalyptic story in which visiting aliens come upon a site that they deem to be a holy site. In the investigation of these relics, a flat circular metal container is found that rattles. The science staff, after many checks and tests, find that the source of the rattle is an inner spool of some type. The challenge of working out what is on the spool is met, and it is demonstrated at an all-hands. There is much discussion as to the meaning of what they saw, and of the significance of the end, where one of the characters pops into view and the symbols "That's all folks" appear.
Could not pass it up, since the description of the storage location matches the story and a shaggy dog was needed to break the serious mood.