
Cloud Burst?...
... Storms expected later.
I'll get me ...
... umbrella
A major server outage at Danger, the Microsoft subsidiary that provides Sidekick data services to T-Mobile customers, has forced the company to admit that many of its users have lost personal information that was stored on the system. T-Mobile published a miserable apology on Saturday, in which it said that Microsoft/ …
The data was held in Microsoft's data centers, yet there was no backup. That's not a "technical snafu." That's just unbelievable incompetence.
Judging by the press reports, this Microsoft incident has caused a loss of public confidence in 'cloud' computing. That's rather ironic, considering Microsoft had been the one company reluctant and slow to embrace the cloud.
And in the same week, Microsoft launched its 'MyPhone' service, a back-up system for users of its hapless Windows Mobile phones. Did it remember to make a back-up of that too?
I am reminded of the investment fund that a few years ago had to write to all its clients admitting that it had lost all their money. That fund had a Japanese-looking name that, if you tried to read it out, slightly but recognisably resembled "I f*** you".
(Can anyone remember the actual name of that fund?)
"Dont put all your eggs in 1 basket."
Unfortunately it doesn't take much to fool idiots. Rename internet data storage to something buzzwordy like "The Cloud", vomit up some marketing hype and FUD, usually along the lines that anyone who doesn't believe said hype is a dinosaur still living in the past, and you'll have the morons eating out of your hand. Then it's $$$ all the way.
"If you were planning a major SAN upgrade wouldn't your first step be to make sure you had a full backup of the data in place in case something went wrong."
You'd think! But you'd be amazed at the number of times I've been to project meetings only to find that essential steps like taking a backup had not been included in the plan. The results can literally be fatal for a company. One company we worked with failed because they lost their accounts database and simply didn't know which of their customers to charge for what! The original problem was a faulty backup implementation they knew about, but had pushed to the bottom of the priority list.
"If you were planning a major SAN upgrade wouldn't your first step be to make sure you had a full backup of the data in place in case something went wrong."
Backup systems are almost always an expensive afterthought for management. Those of us in the know know they should always be a top priority. 'What's the cost of backup software plus tapes/VTL compared to your data's worth?', we cry time and again. Orders of magnitude that even the late great Carl Sagan would balk at, one would assume!
Still we rumble on, having to justify the cost of another box of tapes compared to the latest trendy Web 2.0 project over in development right now that's chewing up cash like there's no tomorrow!
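To put some purely invented numbers on that cry (every figure below is hypothetical, only meant to show the scale of the mismatch):

    # Back-of-envelope comparison; every figure here is invented for illustration.
    annual_backup_spend = 15_000      # backup software licences + tape/VTL hardware
    box_of_tapes = 500                # the "another box of tapes" we have to justify
    data_value = 5_000_000            # rough worth of the data if it proves unrecoverable

    ratio = data_value / (annual_backup_spend + box_of_tapes)
    print(f"The data is worth roughly {ratio:.0f}x the yearly backup spend")

Two to three orders of magnitude even on these modest made-up figures; not quite billions and billions, but the point stands.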
It would be nice to know which scenario caused the failure:
a) Lack of due diligence: should have known when MS bought the company.
b) Migration of platform from *ix to Windows going horribly wrong.
c) Penny-pinching before migration.
d) One of those horrible multiple-failure cluster-fu**s.
And Air New Zealand call IBM 'Amateur'
As one of my old system programmers used to say, "f**k going forward, as long as we can go backwards we'll be OK".
Early BOFH, I think: redirect the backups to /dev/null; it makes them so quick.
What's the IT angle? They know nothing about it...
Nelson Muntz icon please, Ms Bee.
Wasn't MS's cloud... Most likely it was Danger's pre-acquisition data center. I know the Reg likes to throw MS under the bus any chance it gets, but I'm pretty sure that integrating Danger's data into an MS format was probably the last thing on the integration schedule. Anyone who's done large-scale systems integration would know that.
So suck on that, Reg. Comments elsewhere on the problems have all come from Danger's side, not MS reps. I'm sure this sucks for MS, but really, I can't see how it's directly responsible, aside from just owning Danger.
Amazing the number of commentards above who are perfectly willing to hang, draw and quarter MS based on, well, no evidence whatsoever. Am I safe to assume these are the same people complaining about the government/police state's continual erosion of civil liberties, etc.?
Obviously, this is a huge fuckup etc. etc., but there is ABSOLUTELY NO evidence to back up the dozens of claims above that "all the eggs were in one basket" or that there were no backups.
"there is ABSOLUTELY NO evidence to backup the dozens of claims above that "all the eggs were in one basket" or that there were no backups."
Did their tapes catch fire or something? Why would they say they can't recover the data if they have backups?
If the tapes DID catch fire, well, a stupidly implemented backup procedure isn't much better than none at all.
Look, the fact is that the service went tits-up without the help of any hypnotic enlargement services (umm, sorry, other story). It doesn't really matter who cooked this one, but MS is responsible as the owner. If this wasn't sorted as part of takeover due diligence, I'd recommend tarring and feathering whoever did that diligence, or whoever approved the takeover without putting mitigation strategies in place.
If a service goes down, recovery must be in place unless shareholders have stated that they're perfectly happy to lose revenue in exchange for a lack of decent IT management. I haven't come across shareholders that generous, so someone's cojones will be on the block for this. And rightly so.
Whoever the owner is, this is no way to run a service - no excuses.
There are also a lot of people pointing out that a project like this going pear-shaped, followed by an announcement that they can't recover their data, is fairly indicative of not having a working backup and recovery plan. I don't know if you work in IT, AC (and I suspect you don't), but in the real IT world we actually practice our backup and disaster recovery plans by restoring our database at the hot site.
In my last DBA job the most we would have lost was the last five minutes of data: databases were backed up every hour, and the database logs every five minutes. To quote Gene Kranz: "Failure is not an option" (well, not if you want people to take you seriously).
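For anyone wondering how the five-minute figure falls out of that schedule, here is a rough sketch of the arithmetic (the variable names and assumptions are mine, not any particular DBMS's tooling):

    # Recovery-point sketch using the intervals quoted above:
    # full database backups every hour, log backups every 5 minutes.
    full_backup_interval_min = 60
    log_backup_interval_min = 5

    # With usable log backups you restore the last full backup and roll the
    # logs forward, so the worst case is whatever arrived since the last log backup.
    worst_case_loss_with_logs = log_backup_interval_min      # ~5 minutes
    # Without log backups you can only go back to the last full backup.
    worst_case_loss_without_logs = full_backup_interval_min  # ~60 minutes

    print(f"Worst-case loss with log backups:    {worst_case_loss_with_logs} min")
    print(f"Worst-case loss without log backups: {worst_case_loss_without_logs} min")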
I'd also like to point out that I (and a lot of others) did not mention MicroShit or Danger in our posts, or whether *nix was better.
Minor storm in the cloud developed severe updrafts, power surges and data loss. Attempts at data recovery show wreckage similar to a tornado debris path, our techs are madly running around picking up any pictures and papers, but water damage has caused most of the ink to run, rendering documents unreadable and photos stained to the point that they're unrecognizable.
Sorry for the inconvenience, but try to keep your phone charged for the duration, and maybe you want to shut it off till we know resyncs won't overwrite your phone's data with gibberish or flatly wipe it clean...
I used to do support for T-Mobile, and the Sidekick data service went down on a regular basis. The fact that it appears to have gone permanently tits-up is not a real surprise; it was always unreliable, insecure and badly managed, and that was before Microsoft had a chance to really screw it up. Given the poor quality of the Sidekick devices (I once handled an exchange for a woman who was on her 22nd Sidekick due to device failures), I'm completely unsurprised the backend infrastructure is just as badly put together.
First off, MS does not do its own maintenance. It used to be Dell; they have switched over to HP/EDS.
Secondly, I worked at the MCI data center where Danger had their servers. First major screw-up: someone hit the EPO (emergency power off). This happened twice; the EPO was not covered (time for a new colo). Second is their software: we were constantly getting tickets to reboot the Danger servers. MS needs to replace all of Danger's software and servers. Not sure MS could screw the Sidekick up, only because it's already screwed up.
I can imagine the Danger defense now: "I sent you an e-mail saying we MUST make a backup. Just check my e-mail? I sent it from my phone weeks ago."
From the Danger web site:
* 4,680 messages / month / user. (reportedly 1M users)
* Danger is now a part of Microsoft's new Premium Mobile Experiences (PMX) team
(the Premium team, really?)
* mobile handsets connected to powerful hosted back-end services
(powerful hosted? Or powerful services, unpowerfully-hosted?)
I had a Sidekick II for four years and loved it. The service would go down sometimes; T-Mob gave a month/week free for one very bad internet outage. The keyboard was better than my Android G1's. Very good build quality, made by Sharp I think. Again, far superior to my HTC Android G1.
No backups? Ever? That's as sensible as lending money for a very expensive house to someone who doesn't earn enough to pay you back. Oh.
Are you replying to me or the person I quoted in my post? I'm well confused.
"There are also a lot of people pointing out a project like this that goes pear shaped and they announce that they cant recover their data would be fairly indicative of not having a working backup and recovery plan."
Yes I was one of those people. Am I wrong?
/Paris because she's always backed up.
I happen to like the Sidekick, and until a few days ago I was considering buying an LX. I miss a lot of nice features on my old SKII, starting with the keyboard.
And for the record, I haven't been 13 for 38 years and some months.
Just because you don't find value in something doesn't mean nobody else should either. Thinking otherwise is just arrogant presumption.
MS can't blame this on the people they might have contracted out to. Their customers have a right to expect that an outfit like MS is capable of ensuring that their subcontractors can do the job properly.
Nor can they blame this on legacy systems. MS have owned Danger for well over a year and that is enough time to improve any broken systems, at least to the extent of firing up a backup storage system.
MS's main issue here is that they want to distance this failure from their Azure offering. MS + "cloud failure" could severely dent confidence in Azure.
Oh well, Mr Ballmer, while it might not be your fault, it's still another screw-up on your ship!
... a DBA is only as good as his last backup.
Seems to work for more than DBAs.
I've been in upper IT management for a couple decades, in infrastructure/operations/database areas. My people and our outsourcers have had to perform datacenter moves, outsourcer swaps, major hardware and software upgrades. Never ever ever ever had anything close to this.
In all of these cases, while the event itself took only hours or a day or two, the planning/testing/risk management took months. We had to hold up the vendor swapout for three months until we were convinced everybody (us and the two vendors involved) had it right.
How much planning took place before THIS upgrade?
"Danger" seems a funny name to give a company that's supposed to keep your data safe. At least this incident is much, much better than the MobileMe incident Apple had. Losing my backups might be ugly, but getting my data wiped during a sync would truly piss me off!!
This incident is going to hit MS, even if it wasn't MS stuff that failed, though I wouldn't be surprised if it was. I've recently discovered that there is no real way to back up Active Directory or ADAM other than Microsoft Backup. No shit...
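For the curious, the Microsoft-sanctioned route is a system state backup, since that's where Active Directory lives. A minimal sketch of driving it from a script is below, assuming Windows Server Backup's wbadmin is available; the E: target volume is purely an illustrative choice.

    import subprocess

    # Minimal sketch, not a production script: back up the system state
    # (which includes Active Directory) via Windows Server Backup's wbadmin.
    # "E:" is a hypothetical target volume; adjust for your environment.
    result = subprocess.run(
        ["wbadmin", "start", "systemstatebackup", "-backupTarget:E:", "-quiet"],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    if result.returncode != 0:
        print("System state backup failed:", result.stderr)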