IMDB is an Amazon property, not a customer.
If this knocked Netflix offline for long it must have been big given how aggressively they build around failure.
Amazon's Web Services (AWS) have suffered a monster outage affecting the company's cloudy systems, bringing some sites down with it in the process. The service disruption hit AWS customers including Netflix, Tinder and IMDb, as well as Amazon's Instant Video and Books websites. The outage may also explain Airbnb's current …
"Pretty soon we'll get to the place that real computers and real humans are no longer necessary for the proper functioning of this planet."
They never have been; the planet has functioned perfectly for around 4.5 billion years. The planet did shoot itself in the foot though... It allowed humans to evolve.
It fits the first criterion for disaster recovery and issue management:
- Can you blame someone else (and wait for them to fix the issue)?
- Yes -> Problem solved.
That's why cloud is and will be highly successful.
This seems like a GREAT IDEA! Let me just leverage this big company that seems to be a solid bunch of chaps to hold all my important data, because they said they could for next to nothing and they should be online forever! Super!
*puts all data into cloud
Good stuff!
*outage
I'm a stupid dickhead who reads CIO Magazine, and now all my data is missing. :(
Well. That explains why I was having weird problems with IMDB early this morning. (I was up until about 5am California time.) Netflix was mostly running ok for me though. Mostly.
I thought it was my phone having a go. Rebooting seemed to clear my phone networking issues and meantime I switched Netflix to my laptop but IMDB never cleared up. Very puzzling.
I feel much enlightened now.
Usually requires a tensor, but capabilities by provider by cost by (actual!) reliability. Maximize by budget. {Shrug} I mostly see the least accurate assessment on all axes for self-provisioned resources and costs. That hasn't changed in human history yet, so I won't go on a hunger-strike over it.
Yes, but n worse infrastructures have less chance of breaking at the same time, while a single big one, even if slightly less bad overall, brings down a lot of separate, unrelated "subinfrastructures" with it when it breaks, leaving them no chance.
From a user perspective, I'm more worried about several services becoming unavailable at the same time than about just a single one. Clouds need to be far, far better than your own infrastructure, not just "somewhat".
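To put rough numbers on that argument, here is a minimal sketch. It assumes the separate infrastructures fail independently of each other, and the unavailability figures (5% for each self-hosted service, 1% for the shared provider) are made up for illustration, not taken from any real measurement.

```python
def p_all_down_independent(p_down: float, n: int) -> float:
    """Probability that n independently hosted services are all down at once."""
    return p_down ** n


def p_all_down_shared(p_provider_down: float) -> float:
    """If every service sits on one provider, all of them go down whenever it does."""
    return p_provider_down


# Hypothetical figures: each in-house setup is down 5% of the time,
# the shared provider only 1% of the time.
own = p_all_down_independent(0.05, 3)
shared = p_all_down_shared(0.01)

print(f"3 independent services all down at once: {own:.6f}")
print(f"3 services on one provider all down at once: {shared:.6f}")
```

Under these assumed numbers each individual service is better off on the shared provider, yet the chance of losing all three together is far higher there, which is the point being made above.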
Oddly there are a lot of people complaining that it is the fault of an IaaS provider for their customers' failure to design redundancy into their solutions. AWS is Infrastructure as a Service, not a managed service. So, though I have my own reasons not to be fond of a few of their services, blaming them for someone else's failure to understand the platform on which they were deploying seems a little unfair.
AWS services went down due to an AWS balls-up. Their customers' services went down due to customers not deploying redundant solutions in the AWS cloud - or indeed a multi-vendor deployment, if you have to have your backup options in the same region.
This would all be true if the people knocked offline weren't so publicly committed to planning redundancy.
Netflix are the poster child for correct use of infrastructure as a service and if this knocked them offline then something went wrong with the redundancy planning features that AWS makes available.
AWS is a lot more than dumb IaaS these days and this is something bigger than a single availability zone taking a nap.
"AWS services went down due to an AWS balls-up. Their customers' services went down due to customers not deploying redundant solutions in the AWS cloud - or indeed a multi-vendor deployment, if you have to have your backup options in the same region."
That. Exactly that.
Cannot be repeated too often.
The main, vast difference between the Cloud falling over and the server room of an individual company is the fact that the company's servers falling over only impacts that company and its clients.
When the Cloud falls down, it takes ALL ITS CUSTOMERS with it, and all of THEIR customers are impacted. That is exponentially different.
And I thought that IT was all about removing Single Point of Failure faults, silly me.
It isn't saying anything significant to point out the scale of disruption of a cloud outage. Society is used to sharing infrastructure. Power, water, transport etc... And now IT infrastructure. The downsides don't outweigh the positives the vast majority of the time, which is why cloud is on the rise and will continue to be.
The sensible IT bod will be making sure they're on the Cloud train rather than just laughing at it when it gets stuck at the station. Society will put up with the outages like it puts up with the thousands of smaller issues each day that are caused by private IT screw-ups.
Companies that really need 100% uptime will still need their own IT (much as hospitals and data centers have UPS and generation capability). Or, in time, some of these will get decommissioned because of the cloud's reliability (like the decommissioning of the London Underground power station), and the companies will then complain bitterly about the very rare outage that takes service out, despite it still exceeding SLAs.
The number of hospitals and datacenters with 100% uptime approaches zero. If keeping things on-premises granted 100% uptime, everything would be on-prem. Instead it's the opposite since engineering, managing, operating and supporting hyper-uptime systems is cost-prohibitive for a majority of companies.
Instead, we've got backups, continuity solutions, etc.
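As a rough illustration of why chasing extra nines gets expensive, here is a small sketch converting an uptime percentage into hours of downtime per year. The percentages are generic examples, not figures from any provider's SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours per year a service may be down at the given uptime percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


# Example uptime levels, chosen only for illustration.
for pct in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime allows about {hours:.2f} hours of downtime per year")
```

Each added nine cuts the allowed downtime by a factor of ten, and true 100% leaves no allowance at all.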
What in house servers? Oh those old things, they were removed right before the IT staff was redundanted. Saved a ton of money that did, they only had to keep one freshly graduated person on staff to go around showing the rest how to use the new cloud based applications. Still looking for a way to get rid of him too.
That's what "moving to the cloud" is really about, saving money by firing as many IT staff people as possible.
The gf and I binge watched episodes of "The Kitchen" and "Futurama" on Amazon and Netflix respectively off and on for most of Sunday. Both services worked flawlessly. However, I was in the market for a new pair of pants (unrelated in any way to the streaming or any other Sunday events) and Amazon seemed to think both of the saved shipping addresses I use were incorrect and could not be persuaded otherwise.
I have always been skeptical of cloud stuff and never really drank that Kool-Aid. The ability to share massive infrastructure can have obvious returns to scale, but clearly, if we are talking of something "important", well, nothing beats the good old "on metal".
The services that were down are anything but important. I'm even ok if they are down from time to time due to a glitch or (better) some maintenance. What if it's down: Netflix? Read a book. Amazon? So you done reading all the books you ordered? Tinder? Learn how to read!
"if we are talking of something 'important', well, nothing beats the good old 'on metal'"
You're talking about the unicorn 'on metal' aren't you? The sort that never goes wrong, that always has the right patches, that never has any problem with power or cooling or connectivity, that is only ever managed by people who never make mistakes...
Yes, that cloud stuff is truly flaky compared with unicorn metal!
Organizations should consider these AWS failures in the scope of all cloud hosting. History has shown that putting all production with any one company is not worth the risk, no matter how large the provider. Unfortunately AWS has had far too many of these outages and, to add insult to injury, the remedies they offer are terrible.