Former Google SRE here--not on GCP.
Cloud operations has never been about more servers = more stability. Or even ==. Cloud operations gives you the ability to improve stability, but only if the entire stack is engineered to operate in this fashion.
1) Datacenters can be taken down for routine or emergency service. This can be at the power or water distribution level (although I only observed it at the power level). If you are not in multiple regions, you are NOT HA. If you are in multiple datacenters, but they are on the same maintenance schedule, you are NOT HA.
2) OS & firmware upgrades on the underlying hardware, both routine and emergency, happen. If you cannot handle 5% of your servers being down (in addition to a couple of datacenters being down), you are NOT HA.
3) Changes happen. Tracing problems back through a stack as tall as a cloud provider's is not easy, because the entire point of separating the layers is that coordination is not required.
4) I'm not sure that AWS qualifies as having a mature offering. Google never claimed that they would be mature out of the gate. There are major differences between providing services externally and providing them internally, and Google appears to have been honest that it is going to take time to match AWS's maturity. The monthly failures during 2016 were certainly undesirable, but I don't even know if I would consider them embarrassing _at the time_. Now would be embarrassing. But we're not seeing that failure rate.
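To put rough numbers on points 1 and 2, here is a back-of-the-envelope sketch in Python. Everything in it is my own illustration, not anything Google or AWS publishes: the function names, the fleet layout, and the modeling choices are all hypothetical. It assumes datacenters on the same maintenance schedule go down together (so they count as one failure unit), reads "a couple" as 2, and takes the 5% upgrade figure from point 2.

```python
from collections import defaultdict


def worst_case_capacity(datacenters, units_down=2, rolling_percent=5):
    """Servers still serving in the assumed worst case.

    datacenters: list of (name, maintenance_group, servers) tuples.
    Datacenters sharing a maintenance group are treated as one failure
    unit, because they can all be taken down at the same time.
    """
    servers_by_group = defaultdict(int)
    for _name, group, servers in datacenters:
        servers_by_group[group] += servers

    # Worst case: the largest failure units are the ones that are down.
    sizes = sorted(servers_by_group.values(), reverse=True)
    remaining = sum(sizes[units_down:])

    # On top of that, a slice of what is left is out for OS/firmware
    # upgrades. Ceiling division keeps the estimate pessimistic.
    upgrading = -(-remaining * rolling_percent // 100)
    return remaining - upgrading


def is_ha(datacenters, servers_needed_at_peak, **kwargs):
    """True if the worst-case fleet can still carry peak load."""
    return worst_case_capacity(datacenters, **kwargs) >= servers_needed_at_peak


# Hypothetical fleet: five datacenters, two of them on the same schedule.
fleet = [
    ("dc-a", "sched-1", 100),
    ("dc-b", "sched-1", 100),
    ("dc-c", "sched-2", 100),
    ("dc-d", "sched-3", 100),
    ("dc-e", "sched-4", 100),
]

print(worst_case_capacity(fleet))  # 190 of 500 servers left
print(is_ha(fleet, servers_needed_at_peak=250))  # False
```

Note what the shared schedule does here: dc-a and dc-b look like two datacenters on paper, but they fail as one 200-server unit, so 500 servers buys you only 190 you can actually count on. If you need 250 at peak, you are NOT HA under these assumptions.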