Backup - ever heard of it?
Why is this a surprise to people? Why would you ever store something important only on someone else's system?
Always have backups of anything that's important to you.
Game designer Jason Rohrer has had a bad week, discovering that his 23 code repositories representing 15 years of development and community contributions were wiped from GitHub. "I can't believe how easy it apparently is to have someone's life work taken down from Github," he said on his forum, fortunately hosted elsewhere. " …
The problem pointed out here is not the Git repository itself, as that ought to exist locally as well (so at least a second copy, if not more, should your team or in-use computers number more than one), but that a lot of the discussions and bug-tracking are held only on the GitHub server and (I presume) lack any way to be mirrored locally. Something to be fixed?
"Why would you ever store something important only on someone else's system?"
Especially if that is provided as a free service. In the absence of an exchange of cold, hard cash and a contract with an SLA attached, I will always assume that the service provider could pull the rug from under my feet with no notice and no comeback.
You (don't) pay your money, you take your choice.
> because git is designed to be distributed, there isn't really a 'master' and 'copy'.
Usually teams of developers do not sync directly between each other. In practice some server (or servers) is effectively the master, and developers pull and push code to it. GitHub of course provides such a git server, plus added tools like the bug tracking and a web page that you can use as the starting point for getting into a project. But it is really cool that if GitHub disappears or "goes evil", the developers still have between them both the latest code version and the full history, and can easily reconstruct the "master" on some other server.
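To sketch that reconstruction (the replacement server and paths are made up): any developer's existing clone can repopulate a new bare repository with every branch and tag, and a mirror clone would carry the full set of refs if you wanted belt and braces.

    cd ~/src/project                                    # any developer's existing clone
    git remote add newhome git@git.example.com:project.git
    git push newhome --all                              # every local branch
    git push newhome --tags                             # and every tag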
I'm not particularly a fan of Git, but this resiliency feature is impressive, and compensates for its defects (like the user-friendliness of a rattlesnake).
Well, yes. But as I understand it, the original reason for git was for collaboration. If one is actually using git in order that multiple folks can work on the code with minimal confusion, aren't backups likely to be somewhat less than complete and current at times? (Caveat -- I know nothing about git other than what I read on the Internet. RCS seems to be adequate for my minimal needs).
It is not possible to back up things like GH's issue tracking. And it almost certainly never will be: if it were possible you could use those backups to seed your own or a competitor's issue-tracking system, thus removing most of GH's competitive advantage.
I take back some of the above: it looks as if it is possible to back up GH repos including metadata like issues &c. However, tools to do this seem to be scarce. BackHub does cloud-to-cloud backups, which is (a bit) better than nothing, but it costs money.
If anyone knows of a decent tool to do this to a local archive, I'd be interested.
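In the meantime, GitHub's public REST API does expose issues (and comments, labels, &c.), so a rough-and-ready local archive is doable with nothing fancier than curl. A sketch, with placeholder owner/repo names and no authentication (a personal access token would raise the rate limit and cover private repos):

    #!/bin/bash
    # Pull every page of issues (open and closed) into local JSON files.
    OWNER=example-owner      # placeholder
    REPO=example-repo        # placeholder
    mkdir -p gh-backup/issues
    page=1
    while : ; do
        out="gh-backup/issues/page-$page.json"
        curl -s "https://api.github.com/repos/$OWNER/$REPO/issues?state=all&per_page=100&page=$page" -o "$out"
        [ "$(cat "$out")" = "[]" ] && break      # an empty page means we're done
        page=$((page + 1))
    done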
Plenty of 'aged' accounts - i.e. accounts that have been active for years, for various meanings of 'active' - get hacked or have their credentials dumped, often long after the user who was using them has forgotten about the account too.
To be fair to github - they restored the account, seemingly without any kind of loss to the user other than some time. He's learnt a lesson (don't rely on free services for something so important) and so I'm not really sure why this is even in the news?
"It vanished with no notice and no explanation, and was gone for at least some hours. That could be critical in some circumstances, and could easily be expensive."
Yes, but..... He was relying on a service that isn't relying on him to stay in business or even turn a profit so they have zero worry about dropping his data down the chute. Whoops, so sorry, our bad.
The aerospace company I worked at was abusing an SVN system for all of the data we generated, which meant nearly everybody in the engineering office had a copy of the repository on their computer. If that server went down, it would have been a pain, but we had data backed up all over the place and could have made do until the server was back up again, even if it took a week. The bottom line is that the company could have thrown money at the problem to make it go away. A system such as GitHub doesn't allow for using money to make a problem go away. If the outage was widespread, you wouldn't even be able to find out WTH was going on or whether there was an ETA on a fix.
There should always be a complete backup and a way to use that backup. If you use an online software package and the provider flips on its back and twitches its legs, even if you have your data backed up, you don't have a way to use it. That makes me avoid the "software as a service" crowd if I can't get an "evaluation" version of the software just in case.
"Yes, but..... He was relying on a service that isn't relying on him to stay in business or even turn a profit so they have zero worry about dropping his data down the chute. Whoops, so sorry, our bad."
Um. How long do you think GitHub would last if they started losing the data of their users' projects regularly? In fact it turns out that they have quite a strong reason not to do that: staying in business.
I agree with you about the risks associated with software-as-a-service. You can in fact run in-house GitHub instances on your own tin, and if I was working for a big enough organisation (and wanted their added-value stuff over git) I'd be doing that.
"and could easily be expensive."
Hint: free service. "Downtime could be expensive" == you need to pay for it.
At least he wasn't like the stock market daytraders who actually attempted to sue operators of an IRC network when it went down.
As others have said the critical part is the bugtracking stuff. If you have any sense you mirror that periodically (The "how" part is an exercise for the reader)
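That said, one possible answer to the exercise, assuming cron and a bare mirror clone are acceptable; the paths and the issue-archiving script name are illustrative (something like the curl loop sketched earlier in the thread would do for the latter):

    # one-off setup: a bare mirror of the repository
    git clone --mirror https://github.com/example/project.git /srv/mirrors/project.git

    # crontab entries: refresh the mirror nightly, then archive the issue tracker
    0 3 * * *  git --git-dir=/srv/mirrors/project.git remote update --prune
    30 3 * * * /usr/local/bin/backup-gh-issues.sh >> /var/log/gh-backup.log 2>&1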
"He's learnt a lesson (don't rely on free services for something so important) and so I'm not really sure why this is even in the news?"
I guess so people who aren't him can learn the same lesson? I know that it often takes damage to oneself personally to learn the following two lessons, but hopefully one or two people are able to learn from others' mistakes:
1) Always have a backup;
2) Never rely on companies that offer to host your data online, especially for free.
It's not a cloud, it's just someone else's computer.
At least if it's a paid service you might have some grounds to complain when things go wrong, although that still doesn't absolve you from not having a backup plan. If you're not paying, you should assume that everything could disappear at any time. That's not a possibility you should want associated with terms like "life's work" or "business critical".
Quite. If this is all the games he has ever developed then, quite frankly, he should have the original masters backed up elsewhere. Even Bitbucket or GitLab would be a start.
Having someone complain about lack of backup in this day and age really just shows their (lack of) aptitude IMO.
The issues seem to have been not the absence of a backup for the main data but the absence of a backup mechanism for the bug-tracker, and the fact that GitHub was being used as part of the workflow to manage multiple servers. As others have said, if it's that important, particularly for the workflow issue, a paid service with an SLA sounds reasonable.
"Having someone complain about lack of backup in this day and age"
I'm just dealing with the aftermath of an incident where someone decided that using scratch disk space on a server - an area that's explicitly NOT backed up (and has warning messages to that effect) - was a good place to put critical archival data instead of on the central fileservers.
Server got rebuilt, scratch space got reformatted, 2 weeks later the user screams the roof off the building when he realises all his precious algorithms are goneski - then demands we send the drives out to a recovery company. Um, hello? They've been scribbled over for the last 2 weeks, what do you think that's going to achieve?
Anonymous, because I have to work with the idiots responsible for that kind of clusterfuckage. What kind of fumduck thinks that a directory called "scratchpad" is a good place to keep critical data?
I don't see that it matters that much. As long as you have a backup, you can get some benefits from using the cloud as the primary. For example, I run my website on a cloud service because it's not that important and I don't need that server in my house. Also, when a line fails the electricity people around here tend not to get to my house for several hours. With that said, a server located off site, where the provider handles the power and network, is less likely to go down. If they should delete my account, I have all the files I need right here to restore it.
irony acknowledged. heh.
So, the *FIRST* major change to github is a bot that FAILS to "get it right" with respect to spam filtering, punishing the honest/innocent via brain-damaged AI algorithms, while GROSSLY MISSING the 'bulk' of the problem at the exact same time.
Sounds like hotmail or anything ELSE that MS "took over". I'll still use it, I suppose. yay.
(this is probably comment #43 - oh well, so much for having 42 of them. I ruined it.)
No one should ever trust Microsoft with any of their precious data. (Amazon is much the same).
They can and will use it for their own requirements. Why else did they buy GitHub? Out of the goodness of their heart naturally...
The day after that deal was announced I moved all my code off of GitHub. It will never be going back.
I am determined to remain uncontaminated by MS (I do accept that they contribute to Linux but they have to make their submissions GPL compliant) for as long as possible.
Downvote me all you like but while I have a choice NOT to use Azure or Github or Orifice then I will do that. The same goes for AWS. You are at their mercy as this article clearly shows. IF you step out of line, you are gone gone and gone. All that lovely work gone.
Don't forget the most dodgy of them all: the disappearing privacy switches. You turn them off because you don't like being tracked, then Microsoft decide they will just switch them all back on again down the road, once they think you have forgotten.
How else do they fund a "free" Windows 10...
Mine have reset at least 3 times since Windows 10 launched. Along with stupid Candy Crush re-installing itself by magic... What's weird is the EU don't seem to care about this, since they got on the Microsoft pay-roll (Munich Office/Windows deal anyone???)
"AI" or any other kind of Machine Police should never be allowed to de-activate accounts. It's welcome to flag them up, down and sideways internally for human review all it wants, and the corp behind it is welcome to improve it until its human mods can cope with the volume of flagging it generates, or hire more of them. But "AI" should never have access to the Big Red Button. If a hacker or IP thief causes an outage or disruption of any length whatsoever they're immediately charged with causing eleventy trillion billion million dollars of "damages" - how come corps are allowed to get away with doing exactly the same without having to have anyone actually accountable for it?!? No, "because you've agreed to it" is not a valid answer. And neither is "assume guilt automatically and shoot deactivate by default, ask questions later reactivate only if and when the Twitter shitstorm hits, then apologize for the mistake of having inconvenienced someone of high enough profile".
AI should never be able to *permanently* delete accounts. However, when you number your accounts in the hundreds of thousands or millions, you have to have some automatic disabling. Then the rare false positive can be manually corrected.
Of course, if you have a lot of false positives, you have a different issue and should tune your algorithm better before you give it teeth.
A false positive is fine, but if that account had been there more than a week, how the hell can it be spam? It would have been reported long ago, and not just by them. Hell, if the account was years old, why would they scan it at all unless they had a letter from a TRUSTED source, not just an email from a random idiot no one has ever heard of? I think this is their way of clearing space by disabling stuff and seeing if someone complains about it. Stupid, sure, but it's an MS company now so this seems plausible.
We tried to set up a Twitter account to promote our open source software, but within about 20 minutes of struggling to get our logo to display in the avatar (they'd automatically cropped it) by retrying the upload several times, the account was locked, and the "help" line never responded to our complaint other than with boiler plate emails.
We had to abandon the account, never used and still locked because some silly machine considered we were abusing the service by trying to get it to display our logo, and no human being was even contactable to fix the problem.
Given he says he has several nodes pulling from his account automatically, it's likely some sort of automaton at GitHub decided this was a command and control channel for some sort of bot, and pulled it down, possibly 'pending review' -- and they did review it and restored it. I can understand that easily enough...
He might be better off having a VM somewhere to do that sort of scheme: he'd have the official sources on GitHub, a mirror on the VM, and the automated nodes pulling from there. Extra backup layer thrown in.
Huh, seriously? Anyone beyond a complete amateur would put a third-party company in the loop FOR NO REASON WHATSOEVER for their devops needs? It's not like they need the storage or anything; they could EASILY run a VPS and not have that single point of failure.
It's not 'account pulling from them' here, it's 'machine pulling from a repo, kinda like... a virus would'. I can understand automating stuff, but using what is basically a cloud server for 'half the internet'?
Giggles.
Maybe hosted services are more reliable than a random desktop. But not more reliable than a couple of proper backup systems.
Github is a distribution system, not an archive. Obviously so, since it's Someone Else's Computer.
Losing your github archive should be about as exciting as losing your web page: a minor annoyance requiring a few minutes' work to correct.
If it's not, you're doing it wrong.
That's the power of a bash shell... rm -R is a beautiful bitch.
It's easy enough to git clone back to the repo from your backup though right?
Please don't tell me you trust a third party to be your sole back up solution.
Backup, backup and backup. Especially anything remotely to do with the cloud.
Don't assume you understand backups either. Having one USB HDD connected most of the time that you copy stuff to is not a backup solution.
!
Loads of folk here know how to do it. But I bet loads more are vulnerable to a fire, flood, ransomware, the cat deleting it, or you deleting it.
Really, SourceForge, GitHub, Google Docs, Dropbox, Office 360 etc. should only be used for temporary collaboration, with mirrored distribution servers etc. used for delivering to the public, and your own in-house system (or securely hosted, accessed by VPN only) for in-house distribution. All sources, final runtimes, documentation, discussions and decisions should be backed up according to best practices. Online storage or Flash is not a backup solution.
Once and for all: no, you don't, unless you consciously arrange to do so. If you maintain a local git repo by an initial clone and then pulls, what you have locally is the commits in the ancestry of your remote-tracking branches: you don't have the commits which aren't.
Of course you also don't have all the information which is not in the master repo at all such as all the issues &c. That information might matter if you care at all about what bugs your code has, what your future plans are &c &c.
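Concretely (the repository URL is just an example), a default clone and a mirror clone fetch rather different sets of refs:

    # default clone: remote-tracking branches and tags, nothing else
    git clone https://github.com/example/project.git

    # mirror clone: every ref the server advertises, including refs/pull/*
    git clone --mirror https://github.com/example/project.git project-mirror.git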
Exactly the same thing happened to me on SourceForge last year. Account nuked, no email, no communication, nothing. My repositories all still existed but my username had been changed to '<REDACTED>'. Luckily I'd already migrated everything off to GitHub at this point but I still used some of the mailing lists (which bizarrely continued to work).
I emailed them, got nothing, then complained on Twitter, and someone finally replied to the email claiming it had been 'overlooked'. Apparently an antispam bot nuked it, just like with GitHub. They did restore my account and, with a bit of pushing, changed the join-up date to 2000 so I retained my seniority, but they were unable to update the repositories so that I was listed as the author of the commits.
I never received any kind of apology --- not even a pro forma 'sorry to hear that'.
Needless to say, I don't feel inclined to use SourceForge for anything much, and I now have a backup script which periodically backs up the raw repositories from GitHub to Blu-ray.
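For anyone wanting to roll their own, something along these lines would do (user name, repository list, staging path and burner device are all placeholders, not my actual setup):

    #!/bin/bash
    STAGE=/var/backups/github
    USER=example-user
    mkdir -p "$STAGE"
    for repo in alpha beta gamma; do                  # the repositories to archive
        if [ -d "$STAGE/$repo.git" ]; then
            git --git-dir="$STAGE/$repo.git" remote update --prune
        else
            git clone --mirror "https://github.com/$USER/$repo.git" "$STAGE/$repo.git"
        fi
    done
    # burn the staging area to disc (dvd+rw-tools handles Blu-ray media too)
    growisofs -Z /dev/sr0 -R -J "$STAGE"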
GitLab is another way to go if you're worried about storing your projects (and bug tracking/discussions) on someone else's servers. You can install a free self-hosted version that has most (but not all) of the features of the paid hosted version and keep all your data local.
Nothing stopping you opening up that "local" GitLab to the wider public, though you obviously have to take some decent security measures (keep up to date with the monthly releases, enable 2FA, use a secure cert, manually vet new user creation).
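Getting started can be as simple as the official Docker image; the hostname and host paths below are examples, and you'd still want the hardening measures mentioned above:

    docker run --detach \
      --hostname gitlab.example.com \
      --publish 443:443 --publish 80:80 --publish 2222:22 \
      --volume /srv/gitlab/config:/etc/gitlab \
      --volume /srv/gitlab/logs:/var/log/gitlab \
      --volume /srv/gitlab/data:/var/opt/gitlab \
      gitlab/gitlab-ce:latest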
A GitHub bug could have been exploited earlier this year by connected third-party apps to hijack victims' source-code repositories.
For almost a week in late February and early March, rogue applications could have generated scoped installation tokens with elevated permissions, allowing them to gain otherwise unauthorized write or administrative access to developers' repos. For example, if an app was granted read-only access to an organization or individual's code repo, the app could effortlessly escalate that to read-write access.
This security blunder has since been addressed and before any miscreants abused the flaw to, for instance, alter code and steal secrets and credentials, according to Microsoft's GitHub, which assured The Register it's "committed to investigating reported security issues."
On December 15, Microsoft's GitHub plans to turn out the lights on Atom, its open-source text editor that has inspired and influenced widely used commercial apps, such as Microsoft Visual Studio Code, Slack, and GitHub Desktop.
The social code biz said it's doing so to focus on cloud-based software.
"While that goal of growing the software creator community remains, we’ve decided to retire Atom in order to further our commitment to bringing fast and reliable software development to the cloud via Microsoft Visual Studio Code and GitHub Codespaces," GitHub explained on Wednesday.
Microsoft's GitHub on Tuesday released its Copilot AI programming assistance tool into the wild after a year-long free technical trial.
And now that GitHub Copilot is generally available, developers will have to start paying for it.
Or most of them will. Verified students and maintainers of popular open-source projects may continue using Copilot at no charge.
Slowly but surely, software package registries are adopting multi-factor authentication (MFA) to reduce the risk of hijacked accounts, a source of potential software supply chain attacks.
This week, RubyGems, the package registry serving the Ruby development community, said it has begun showing warnings through its command line tool to those maintainers of the hundred most popular RubyGems packages who have failed to adopt MFA.
"Account takeovers are the second most common attack on software supply chains," explained Betty Li, a member of the Ruby community and senior front end developer at Shopify, in a blog post. "The countermeasure against this type of attack is simple: enabling MFA. Doing so can prevent 99.9 percent of account takeover attacks."
GitHub has revealed it stored a "number of plaintext user credentials for the npm registry" in internal logs following the integration of the JavaScript package registry into GitHub's logging systems.
The information came to light when the company today published the results of its investigation into April's unrelated OAuth token theft attack, where it described how an attacker grabbed data including the details of approximately 100,000 npm users.
The code shack went on to assure users that the relevant log files had not been leaked in any data breach; that it had improved the log cleanup; and that it removed the logs in question "prior to the attack on npm."
Following the recent disclosure of a technique for hijacking certain NPM packages, security engineer Danish Tariq has proposed a defensive strategy for those looking to assess whether their web apps include dependencies tied to subvertable email domains.
NPM, acquired by Microsoft's GitHub in March 2020, operates the NPM Registry, an online repository of code libraries that web developers include in their applications. It currently hosts almost two million packages and serves more than 174 billion downloads per month.
The attack described earlier this month by security consultant Lance Vick involves identifying NPM packages managed by email accounts tied to expired domains. By registering the expired domain, the attacker then gains control of any email addresses associated with that domain.
Special report Security consultant Lance Vick recently acquired the expired domain used by the maintainer of a widely used NPM package to remind the JavaScript community that the NPM Registry still hasn't implemented adequate security.
"I just noticed 'foreach' on NPM is controlled by a single maintainer," wrote Vick in a Twitter post on Monday. "I also noticed they let their domain expire, so I bought it before someone else did. I now control 'foreach' on npm, and the 36,826 projects that depend on it."
That's not quite the full story – he probably could have taken control but didn't. Vick acquired the lapsed domain that had been used by the maintainer to create an NPM account and is associated with the "foreach" package on NPM. But he said he didn't follow through with resetting the password on the email account tied to the "foreach" package, which is fetched nearly six million times a week.
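A crude sketch of that kind of check for your own dependencies (not Vick's or Tariq's actual method; the package name is a placeholder, and a missing NS record is only a hint that a domain may have lapsed - whois gives a firmer answer):

    npm view some-package maintainers --json \
      | grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+' \
      | sed 's/.*@//' \
      | sort -u \
      | while read -r domain; do
            # no nameservers at all is worth a closer look
            dig +short NS "$domain" | grep -q . || echo "no NS records (possibly lapsed): $domain"
        done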
GitHub has announced that it will require two factor authentication for users who contribute code on its service.
"The software supply chain starts with the developer," wrote GitHub chief security officer Mike Hanley on the company blog. "Developer accounts are frequent targets for social engineering and account takeover, and protecting developers from these types of attacks is the first and most critical step toward securing the supply chain."
Readers will doubtless recall that attacks on development supply chains have recently proven extremely nasty. Exhibit A: the Russian operatives that slipped malware into SolarWinds' Orion monitoring tool. That malware made it into over 18,000 companies, around 100 of which were infected and attacked. GitHub has also had its own problems, such as when access to npm was compromised.
Analysis GitHub says it has identified and alerted developers who have had their private repositories accessed and downloaded via stolen authentication tokens.
In this multifaceted fiasco, Microsoft-owned GitHub insisted its security was not breached. Instead, we're told, "compromised OAuth user tokens from Heroku and Travis-CI-maintained OAuth applications were stolen and abused to download private repositories belonging to dozens of victim organizations that were using these apps."
Salesforce-owned Heroku confirmed someone compromised an OAuth token – presumably an internal staffer's token – to get into Heroku's GitHub account and rifle through, and potentially update, users' GitHub repositories "using OAuth tokens issued to Heroku’s OAuth integration dashboard hosted on GitHub."
Efforts by Salesforce-owned cloud platform Heroku to manage a recent security incident are turning into a bit of a disaster, according to some users.
Heroku has run security incident notifications for 18 days and appears to have upset several of its customers due to a perceived lack of openness and communication.