Google: We had to shut down a datacenter to save it during London’s heatwave

Google has revealed the root cause of the outage that disrupted services at its europe-west2-a zone, based in London, during a recent heatwave. "One of the datacenters that hosts zone europe-west2-a could not maintain a safe operating temperature due to a simultaneous failure of multiple, redundant cooling systems combined …

  1. Will Godfrey Silver badge
    Facepalm

    Well...

    At least they admit their failures and don't pretend everything is just fine.

    1. Lon24

      Re: Well...

      At least it appeared to be a controlled shutdown which, while causing some disruption, should mean an orderly restoration of services is within the contingency plans with, hopefully, nothing lost. In technical terms - a hiccup.

      Whereas poor St Thomas' & Guys also had an unplanned shutdown same day due to the heat and unspecified issues (cooling or power or both?). It appears that it was less controlled and contingency plans failed completely at the time.

      Worse, restoration of services is still, I hear, not complete. Patient welfare was/is seriously compromised. That's probably lethal or permanently damaging to some.

      I'm guessing the pressure to run the NHS 'on the cheap' compared to other health systems means the ability to retain excellent IT staff to manage their way out of catastrophic situations, or to have sufficient redundancy, has been eroded over time. Similar situation at a University not very far down the road that took months to restore services after a hack.

      Hence I have some sympathy for the remaining IT staff who take the immediate blame and are expected to restore the situation pronto and then get more blame when they don't.

      1. ThatOne Silver badge
        Devil

        Re: Well...

        > retaining excellent IT staff to manage their way out of catastrophic situations

        You don't have an MBA, do you: Excellent staff usually commands prohibitive salaries, and handling catastrophes is what we pay insurance for. Service disruption is not an issue, especially if we have a captive clientele, like NHS or universities.

  2. Pete 2 Silver badge

    Heat island

    > the hottest day on record in London

    Not helped by London being generally warmer than areas outside the city. An effect worsened by all the datacentres' enormous power consumption.

    In addition, given the serious lack of electricity infrastructure in the capital, you have to wonder how sensible it is to locate datacentres there. Or to give them planning permission.

    It isn't as if they create that many jobs, either.

    1. Hans Neeson-Bumpsadese Silver badge

      Re: Heat island

      Not just the electricity infrastructure...land/property in London is really expensive. I would have thought it'd make sense to build these things outside of areas of prime real estate.

      1. Anonymous Coward
        Anonymous Coward

        Re: Heat island

        Since all the reports were referring to Pacific time, my impression was that the reference to "London" meant somewhere in SE England with an 020 dialling code. Still expensive but I'm guessing the extra cost of location was outweighed by the benefits of having it near a large number of users and better communication.

        1. katrinab Silver badge
          Meh

          Re: Heat island

          Sure but stick it in an industrial estate in Slough or Luton or somewhere similar. Surely that would be close enough?

          1. mantavani

            Re: Heat island

            Perhaps somewhere near 'London' Luton airport?

            1. Dale 3

              Re: Heat island

              Or indeed 'London' Oxford Airport, which is closer to Luton.

              1. katrinab Silver badge

                Re: Heat island

                And, depending on how you measure it, slightly closer to Birmingham than to London, and not particularly close to Oxford either.

              2. anothercynic Silver badge

                Re: Heat island

                Apparently that moniker comes from "an hour's travel" which makes Stansted worse than Oxford. Even us Oxfordians shake our head at that ridiculous rebranding exercise (and it's not helped - at all). It continues to just be a distant cousin of Biggin Hill or Farnborough with lots of corporate jet traffic.

              3. NeilPost

                Re: Heat island

                Not that any carriers fly from London Oxford Airport.

          2. Mr.Nobody

            Re: Heat island

            Slough is supposedly green as well since the power all comes from a wood pellet or pulp energy source in the middle of the estate.

            Lots of big data centres out there get to consider themselves environmentally friendly because of this fact.

            1. Korev Silver badge
              Mushroom

              Re: Heat island

              But what happens if the friendly bombs finally fall?

      2. Headley_Grange Silver badge

        Fans

        And if they built it next to a windfarm they'd get green energy and the extra benefit of the windmills keeping it cool!

      3. anothercynic Silver badge

        Re: Heat island

        The only reason the massive data centre in Olympic Park exists is because it was the press centre for the 2012 Olympics and the plan was to put Amazon in it (and they didn't want it). Whether any of the universities from oop T'North use any of it (since it's just on the other side of the park from them) is another guess.

        But yeah... London is getting hotter every year, and yeah, cramming more data centres into Canary Wharf and other parts of East/Central/West London is not helping.

    2. ffeog

      Re: Heat island

      > you have to wonder how sensible it is to locate datacentres [in London].

      True, but don't forget how many financial and fintech companies are in London, and the importance of locality and speed for certain missions, eg. High Frequency Trading (for the more general case beyond the nearby/ colocated FPGA stuff). Or where a lot of data benefits from being near another load of related data for big data operations where transit latency would be multiplicative.

      Unless they could all agree where to keep their operations outside London but retain the locality benefits.

      1. Anonymous Coward Silver badge
        Boffin

        Re: Heat island

        Generally they'll want their DC as close as possible (in latency terms) to a major internet exchange, eg LINX.

        Propagation delays happen, even in fibre (the speed of light is finite and in glass it's even slower) and that adds milliseconds to each packet.
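For a sense of scale, the propagation delay mentioned above can be estimated with a few lines of Python. This is only a rough sketch: the refractive index of 1.47 is an assumed typical figure for silica fibre, the function name `fibre_delay_ms` is made up for illustration, and real routes add switching and queuing delay on top.

```python
# Rough one-way propagation delay through optical fibre.
# Assumption: refractive index of about 1.47, a typical figure for silica fibre.
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s
REFRACTIVE_INDEX = 1.47          # assumed; varies with fibre type and wavelength


def fibre_delay_ms(distance_km: float, refractive_index: float = REFRACTIVE_INDEX) -> float:
    """Return the one-way propagation delay in milliseconds for a fibre run."""
    speed_in_fibre = C_VACUUM_KM_PER_S / refractive_index  # light is slower in glass
    return distance_km / speed_in_fibre * 1000.0


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km: {fibre_delay_ms(km):.3f} ms one way")
```

Under that assumption the delay works out to roughly 5 microseconds per kilometre, so a run of tens of kilometres costs a fraction of a millisecond each way, and it takes a few hundred kilometres before a round trip reaches whole milliseconds.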

        1. anothercynic Silver badge

          Re: Heat island

          Hence TeleHouse being all over the Docklands and East London.

      2. Will Godfrey Silver badge
        Mushroom

        Re: Heat island

        I take your point, but personally I consider HFT an obscenity!

      3. NeilPost

        Re: Heat island

        Well the data centre’s proximity to London Financial districts and high frequency trading didn’t help them exit stocks for cash before my pension took a 20% @#% fucking in the literally *weeks* running up to COVID lockdown.

    3. Alan Brown Silver badge

      Re: Heat island

      The datacentres in question are almost all based around Slough, not closer in

      They were encouraged to set up there BECAUSE of readily available infrastructure and because siting them under the Heathrow approach/departure path prevents people building houses there and complaining about the noise

      I've questioned the wisdom of such siting ever since I found out about it. Aircraft occasionally fall out of the sky and taking out data centres might not be "loss of life" but it can be economically devastating

      At least one airport I know of spent billions quietly purchasing all the land under the approach paths to ensure it stayed undeveloped farmland (or warehousing, closer in). Kinda hard with places like Heathrow but something worth considering for ones further out

      1. Mr.Nobody

        Re: Heat island

        Also it's considered green since the electricity already comes from a renewable resource. I have heard you can also still get steam piped into a location in the estate if you would rather power equipment off of it instead of electricity.

      2. ThatOne Silver badge
        Holmes

        Re: Heat island

        > all the land under the approach paths

        From what little I have observed, you hear the planes just as well on both sides of their flight path. Probably less than if they fly right over your head, but still loud enough to be annoying. It's a difficult problem, since airports need to be as close as possible to big cities, but cities don't want to be anywhere near airports...

        Farmland around an airport would seem a good idea, except airports don't want too many birds living nearby, and there is nothing like the free buffet of a field to attract large amounts of birds. (Also the pollution might be a little higher than elsewhere.)

        Warehouses and datacenters might be the ideal solution, as long as the constant noise (vibrations) doesn't break things inside.

        1. rcxb Silver badge

          Re: Heat island

          It's a difficult problem, since airports need to be as close as possible to big cities, but cities don't want to be anywhere near airports...

          Not difficult, really. Put the entry point in the city and have a tram every minute that moves people a few miles to the actual terminal near the runway.

          Alternately, I'd certainly enjoy seeing a kilometers-long autowalk moving at 100km/h.

        2. Anonymous Coward Silver badge
          Paris Hilton

          Re: Heat island

          Farmland doesn't have to be arable. Fields of sheep, cows, pigs etc are less likely to attract flocks of birds.

          1. Yet Another Anonymous coward Silver badge

            Re: Heat island

            But with the current government the risk of flocks of flying pigs

      3. Anonymous Coward
        Anonymous Coward

        Re: Heat island

        "Aircraft occasionally fall out of the sky and taking out data centres might not be "loss of life" but it can be economically devastating".

        I worked at a company (in the travel business) next door to their main Data Centre which was allegedly hardened to cope with an airplane strike.

      4. Headley_Grange Silver badge

        Re: Heat island

        "At least one airport I know of spent billions quietly purchasing all the land under the approach paths to ensure it stayed undeveloped farmland (or warehousing, closer in). Kinda hard with places like Heathrow but something worth considering for ones further out"

        They don't need to do this in the UK for airport operational safety. Airports have a safeguarded area around them. Any building or development applications which would compromise the safeguarding would be referred by the council planning inspector to the airport who can request a safeguarding report from the developer and veto it if it doesn't meet their requirements.

      5. anothercynic Silver badge

        Re: Heat island

        Heathrow's also been at it with the "buying on the sly". They bought quite a bit of the two villages they want to raze for the third runway; of course Greenpeace, HACAN etc all caught a whiff of it and bought land too (and sold off parcels to their supporters - trying to do a Narita in West London/Middlesex).

        It makes a lot of sense to buy the land, sit on it, rent it out until the time comes that the development starts and you start booting people out.

    4. Anonymous Coward
      Anonymous Coward

      Re: Heat island

      Keep in mind it's a San Francisco company. And by San Francisco, I mean somewhere in the SF Bay area, which is a chunk of California that includes places nearly a hundred miles from the city limits of SF.

      "London" is pretty much anything south of Scotland as far as most Americans are concerned.

  3. andy 103
    FAIL

    Move it up north

    Move it to Manchester. Or somewhere outside London.

    Yeah it's easier said than done.

    But:

    1. Land / property is cheaper than London

    2. Big northern cities have the right infrastructure

    3. It's generally a few degrees cooler (although not an exact science)

    People really need to learn a lesson that London is shit and just because you have expensive property / infrastructure there it can still suffer failure. Of course that doesn't mean everything in the north just works - but at least you'll have a few extra quid to spend on some fans!

    The report doesn't explain why the cooling systems failed. It's because they didn't have redundant cooling systems. That's really all it can be. In a fucking datacentre, in London, owned by one of the richest companies in the world.

    1. Anonymous Coward Silver badge
      Holmes

      Re: Move it up north

      > "It's because they didn't have redundant cooling systems. That's really all it can be."

      But they (in theory) have redundant datacentres instead - they're needed anyway, so might as well use the ability to failover between them.

      In this case it turned out that the redundancy they thought they had wasn't as good as they thought it was.

    2. Anonymous Coward
      Anonymous Coward

      Re: Move it up north

      They mentioned redundant cooling systems

      Odds are there was a SPOF in there someone overlooked

      I'm currently having to refrain from smashing someone's head into a desk for designing exactly this issue into the redundant systems of my datacentre - DESPITE my having warned to avoid doing so. It's going to cost ~50k to fix on a 150k cooling installation but would have added less than 5k to the installed cost

      Said individual got patted on the back for saving money - and the fixup costs won't fall on him

      1. ITMA Silver badge
        Devil

        Re: Move it up north

        "Said individual got patted on the back...."

        With a baseball bat? Or, since this is a UK based story, a cricket bat?...

        "Odds are there was a SPOF in there someone overlooked..."

        Usually the short sighted senior level manager/director who signed off on the cheaper option because it was, well cheaper.

        "I'm currently having to refrain from smashing someone's head into a desk..."

        Doors work just as well and usually easier to get away with as "accidental". FD60/120 doors work really well.

        Oh sorry, was that your head on the door? I'm sure the dent will come out.

    3. anothercynic Silver badge

      Re: Move it up north

      Moving up north didn't stop a data centre in Leeds having a cooling moment though...

  4. pavel.petrman

    Hot cloud

    Wasn't reliability and flexible workload migration and routing the selling point of the cloud infrastructure? If one can't rely on Google being able to run their cloud reliably, what sense does it make to use it?

    1. Craig 2

      Re: Hot cloud

      Datacentres are a bit like plane crashes are to car crashes. It's much safer to fly, but they do make it to the news when one goes down....

      1. pavel.petrman

        Re: Hot cloud

        Of course they are. But we already had datacentres before the much touted cloud. One just had to organize the multihoming himself or delegate it to a provider. Big customers already had the requisite elasticity in performance and resources even on-site (I remember daily core count and memory size adjustment in a leased on-premise server in 2005).

        I'm sure everyone here already understands "the cloud" as "someone else's computer", my rant was meant more towards the manglement, marketing and beancounter side of affairs.

      2. ITMA Silver badge

        Re: Hot cloud

        Or on fire..... OVHCloud...

    2. Timochka

      Re: Hot cloud

      Because in this case everything worked entirely as designed?

      If you are deploying your workloads in the cloud in a single availability zone, you have no business being allowed near a console - or at least, you don't get to complain if you have downtime. If you were deploying them across multiple availability zones, then you didn't have a problem.

      1. Anonymous Coward
        Anonymous Coward

        Re: Hot cloud

        But Google did not have a tested procedure in place to shut down a DC and properly re-route traffic to other DCs in the same region, and instead panicking they did a mistake and cut off all the DCs in the region?

        Did they really never plan for a single DC shutdown?

        1. yetanotheraoc Silver badge

          Re: Hot cloud

          "... panicking they did a mistake and cut off all the DCs in the region?"

          Cooler heads did not prevail.

        2. Yet Another Anonymous coward Silver badge

          Re: Hot cloud

          >Did they really never plan for a single DC shutdown?

          Well they tried searching for the instructions on Google, but it was down

    3. v13

      Re: Hot cloud

      No, not exactly. There are global services and regional/local services. A single VM is mostly local. It can be migrated under certain circumstances but this is a sensitive process and many users care about where their workloads run. You can't just move a VM from England to Germany without the user knowing. The same is true about disk storage.

      Users in regional or zonal services need to make sure that they don't rely on a single zone/region because the building can literally be destroyed by an accident. So either you use higher level services that do this for you, or you use lower level services and you're responsible for ensuring redundancy.

  5. Yet Another Anonymous coward Silver badge

    europe-west2-a down

    Don't see why it would affect anyone in airstrip-one Britain
