Reply to post: Re: RE: asynchronous nature of geo-replication could have led to data loss

Microsoft reveals train of mistakes that killed Azure in the South Central US 'incident'

Anonymous Coward
Anonymous Coward

Re: RE: asynchronous nature of geo-replication could have led to data loss

"Is there something different about Azure" - Yes there is a difference in Azure.

Azure used an architecture principal of Region Pairs. They built their regions as an equivalent of a single AWS Availability Zone (AZ). Therefore you need to run multi-region in Azure to get the same resilience as multi AZ in AWS. Now Azure announced last year that they are going to start to build AZ's but this will mean a rewrite of all their higher level services and could take years in my opinion.

The issue is that Azure Region Pairs have a lot of latency. If you try to run a synchronous commit database over this latency you will end up with terrible database performance. Therefore (just as the VSTS guys did) you need to use asynchronous replication which results in data loss if you don't failover nicely. This is a major problem in my opinion.

AWS AZs are low latency, allowing synchronous commit databases over completely isolated datacenters. AWS should be able to handle an incident that took down Azure single region

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon