Datacenter outages are on the decline, but when they hit, they hit hard

The frequency and severity of datacenter outages are on the decline, yet when incidents do occur they can be very costly to the organization involved, with power issues leading to the most serious blackouts. While the datacenter footprint is expanding to meet the demand sparked by generative AI mania and more, the overall …

  1. Version 1.0 Silver badge
    Pint

    A big change is a change but nothing else.

    Datacenter outages are on the decline, but when they hit, they hit hard

    It's like being told that there are far fewer drunk-driving accidents these days, and then everyone starting to think that means you can drink before you drive. If you rely on a datacenter, the safest thing is to create an environment that lets everyone work around an outage. Relying on a datacenter without taking any precautions is not too different from having a drink before you drive.

  2. sedregj Bronze badge
    Windows

    I have a small one

    Our Data Centre or "Computer Room" as I call it has an air con unit (two actually: the old one is kept as a backup). It also has a diesel genny and a lot of small and not so small UPSs.

    Each system in general has two power supplies and they are fed from different UPSs. The UPSs are fed from a power distribution board that has feeds from the mains and the genny on it. Some switches have only one PSU - we have two of each of those and devices are cross connected to each one.

    The generator cost around £10,000 including fitting and can spit out quite a load. It is basically a lorry motor on a set of rails with a fuel tank, bolted down. It's not rocket science.

    I rather enjoy turning the key on the distribution board that cuts the mains and switches to the UPS/genny. You hear some alarm beeps for around 15s as the genny is fired up and then they go quiet as it takes up the load.

    We do that test regularly - monthly at worst. We also do it whenever a new customer wants to divest "to the cloud" and we are showing them around our bijou cloud. It isn't as cool as AWS or Google thingie or MS whatevs but our customers get to see their stuff if they want to and talk to humans if we go off-line or otherwise fuck up.

    Anyway, running a big cloud is basically the same but with more stuff. However, which corners are being cut to trim costs and enhance margins? Who knows? My little cloud is probably relatively expensive but maybe not. They do seem to have some very opaque pricing for some aspects.

    The hyper-scalers will basically run according to something that looks more like actuarial tables. I can understand why but none of that is published anywhere. I know for a fact that some quite large cloudy customers have no real idea of what quality of service they actually get.

    This comment is starting to ramble, so I will stop here.

    1. Claptrap314 Silver badge

      Re: I have a small one

      First--don't forget to check the fuel level after each test. :D

      Second, having worked as an SRE at Google, I KNOW that 7-8 years ago, the average data center of theirs was only available ~85-89% of the time. (The DCs were divided into groups, and would be taken down for maintenance for four weeks a year.) The magic of SRE is what allowed us to deliver 5+ 9's from that.

      There were some initiatives near the end of my time there to eliminate those planned outages. Early results were...underwhelming.

      But it is most definitely not free. You have to architect your systems for a multi-DC environment, and there is a lot there.
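
As a rough illustration of the arithmetic behind getting five nines out of data centers that are each up only ~85-89% of the time, here is a minimal Python sketch. It assumes each data center fails independently and that the service is up whenever at least one data center is up. Both are simplifications that are not taken from the comment above, and the function names are hypothetical. Real outages are often correlated, and planned maintenance is scheduled rather than random, so treat the numbers as a back-of-envelope bound, not a description of how any real provider does it.

```python
def combined_availability(per_site: float, sites: int) -> float:
    """Chance that at least one of `sites` data centers is up.

    Assumes independent failures (an illustrative simplification).
    """
    if not 0.0 < per_site < 1.0:
        raise ValueError("per_site must be strictly between 0 and 1")
    if sites < 1:
        raise ValueError("sites must be at least 1")
    # The service is down only if every site is down at the same time.
    return 1.0 - (1.0 - per_site) ** sites


def sites_needed(per_site: float, target: float) -> int:
    """Smallest number of independent sites that reaches `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    sites = 1
    while combined_availability(per_site, sites) < target:
        sites += 1
    return sites


# Four weeks of planned maintenance per year is about 7.7% downtime.
planned_downtime = 4 / 52

if __name__ == "__main__":
    five_nines = 0.99999
    for per_site in (0.85, 0.87, 0.89):
        n = sites_needed(per_site, five_nines)
        print(f"{per_site:.0%} per site: {n} independent sites for five nines")
```

Under these assumptions it takes six or seven independent sites to reach five nines, which is one way to see why the multi-DC architecture "is most definitely not free".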
