It is surely inevitable now that regulators will intervene to mitigate the concentration risk of AWS and Azure. It's insane that essentially our entire financial services infrastructure runs from half a dozen cloud regions around the globe.
AWS Tokyo outage takes down banks, share traders, and telcos
The AP-NORTHEAST-1 region of Amazon Web Services, located in Tokyo, has endured six hours of sub-optimal performance. The cloud colossus's status report states that the AWS Direct Connect hybrid cloud networking service had trouble connecting to resources in the region due to "failures in core networking devices". As of 1530 …
COMMENTS
-
Thursday 2nd September 2021 10:58 GMT Velv
Perhaps we've become too reliant on the 24/7 nature of everything these days and need our expectations reset.
Outages should be expected. It sucks when it happens to you, but you should have planned for it. Time for people to take more responsibility for their own lives and look to have contingency in place.
I'm not really proposing this is either right or good, but some entitled people think the world owes them everything for no effort on their part. Sadly, the world is flawed; it does break, and we need to at least understand that fact.
-
Thursday 2nd September 2021 12:30 GMT Anonymous Coward
AWS's CTO is famous for the quip "everything fails, all the time", and neither AWS nor, I doubt, any other cloud provider offers best-practice guidance that suggests implementing a single point of failure.
If a firm is sophisticated enough to be using Direct Connect, surely it is also sophisticated enough to have a fallback? It's not as if AWS doesn't provide mechanisms for backup solutions.
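AWS does document running a Site-to-Site VPN as a lower-bandwidth backup path for Direct Connect. As a toy illustration of that pattern only (the `Path`/`select_path` names are invented here, not an AWS API): pick the highest-priority link that still passes health checks.

```python
# Illustrative sketch of priority-based path failover, the pattern AWS
# documents for backing a Direct Connect link with a Site-to-Site VPN.
# All names and numbers are made up for illustration.
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    healthy: bool
    priority: int  # lower value wins


def select_path(paths):
    """Return the highest-priority healthy path, or None if all are down."""
    candidates = [p for p in paths if p.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.priority)


direct_connect = Path("direct-connect", healthy=False, priority=10)
vpn_backup = Path("site-to-site-vpn", healthy=True, priority=20)

active = select_path([direct_connect, vpn_backup])
print(active.name)  # → site-to-site-vpn, since Direct Connect is down
```

In practice the same preference ordering is expressed with BGP path attributes rather than application code, but the failover logic is the same shape.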
-
Thursday 2nd September 2021 20:43 GMT Ken Moorhouse
Re: Why not "the cloud" ?
Interesting question. I presume that because the internet is about routing data, it doesn't need to remember which data went where historically, whereas the cloud must track exactly that in order to retrieve the data later; a bit like RAID, but with network packets dynamically accessing storage on top of RAID techniques such as bit striping.
So if a section of the cloud goes down, I surmise there has to be a mechanism for piecing together replica data from servers unaffected by the outage and promoting those copies to be the primary version. Not an easy choice to make automatically, where latency can create a positive feedback loop and cause instability.
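The flapping risk described above is commonly damped with hysteresis: a replica is only promoted after several consecutive failed health checks, so a single latency spike doesn't trigger a promotion storm. A minimal sketch, with invented class names and thresholds:

```python
# Sketch of replica promotion with hysteresis: promote only after N
# consecutive primary failures, to damp feedback-driven flapping.
# Names and the threshold are illustrative, not any real system's API.
class FailoverController:
    def __init__(self, primary, replica, threshold=3):
        self.primary = primary
        self.replica = replica
        self.threshold = threshold  # consecutive failures before promotion
        self.failures = 0

    def observe(self, primary_healthy: bool) -> str:
        """Feed in one health-check result; return the current primary."""
        if primary_healthy:
            self.failures = 0  # any success resets the counter
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                # Promote the replica. A real system would also have to
                # fence the old primary here to avoid split-brain writes.
                self.primary, self.replica = self.replica, self.primary
                self.failures = 0
        return self.primary
```

With `threshold=3`, a transient blip leaves the primary in place; only a sustained outage flips the roles.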
-
This post has been deleted by its author
-
Thursday 2nd September 2021 18:46 GMT bombastic bob
private cloud failover?
Here's a business opportunity for AWS: sell/rent end-users an inexpensive "private cloud failover system" in the form of a private cloud server that is capable of automatically handling the load (albeit slower) while the rest of the network is titsup...
(It could also handle low bandwidth loads and include synchronization when normal services are restored)
Just a thought, anyway. No more ALL eggs in ONE basket. The web publisher (or whatever) would basically rent or buy the server (maybe co-located on a different part of the internet, or physically local to the customer) and use regular AWS tooling to manage it, along with everything else. It would be for uber-high-reliability services: banks, hospitals, governments, military, etc. It would handle a portion of the normal load, would have to be specifically set up to work when completely disconnected from the rest of the cloud services, and would reliably sync up when services are restored.
Something to think about.
(assuming they do not already do this)
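The store-and-forward behaviour described in that post (queue writes locally while the cloud is unreachable, replay them in order when it returns) could be sketched roughly like this; every name here is hypothetical:

```python
# Sketch of a local failover box that queues operations while the cloud
# is unreachable and replays them in order on reconnect. The cloud is
# any object with an `available` flag and an `apply(op)` method.
from collections import deque


class LocalFailover:
    def __init__(self, cloud):
        self.cloud = cloud
        self.pending = deque()  # ops queued while disconnected

    def write(self, op):
        if self.cloud.available:
            self.flush()             # drain the backlog first to keep ordering
            self.cloud.apply(op)
        else:
            self.pending.append(op)  # degraded mode: queue locally

    def flush(self):
        """Replay queued operations once the cloud is back."""
        while self.pending and self.cloud.available:
            self.cloud.apply(self.pending.popleft())
```

The hard part a sketch like this glosses over is conflict resolution: if the cloud side also accepted writes during the partition, "reliably sync up" needs merge rules, not just a replay queue.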
-
Friday 3rd September 2021 15:01 GMT Kevin McMurtrie
Re: private cloud failover?
Managing data in multiple places requires completely different application design. Each location has different performance and availability, and it's all dynamic.
AWS does recommend that everything be multi-region. Of course they would, because it's roughly 3x the money for them: 2x hosting, plus failover overhead, plus the cost of moving data between regions.
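With entirely made-up numbers, the back-of-the-envelope arithmetic behind that "3x" claim looks roughly like this:

```python
# Illustrative cost model only: a second region roughly doubles hosting,
# and cross-region replication plus failover plumbing adds the rest.
# All figures are arbitrary units chosen to make the arithmetic clear.
hosting = 100.0              # monthly cost of running one region
regions = 2
replication_traffic = 60.0   # cross-region data transfer
failover_overhead = 40.0     # health checks, DNS failover, failover drills

multi_region = hosting * regions + replication_traffic + failover_overhead
print(multi_region / hosting)  # → 3.0 with these illustrative numbers
```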
-