IT downtime not itself going down, power failures most common cause

Infrastructure operators are struggling to reduce the rate of IT outages despite improving technology and strong investment in this area. The Uptime Institute's 2022 Outage Analysis Report says that progress toward reducing downtime has been mixed. Investment in cloud technologies and distributed resiliency has helped to …

  1. chivo243 Silver badge

    Gotta say...

    In the past 10 years, through many outages, the internal IT infrastructure was sound. The cause was a city-wide power failure (UPS ran out of runtime) or an ISP connectivity issue, not a failure of the internal infrastructure. Although on the day I got married in 2014, while I was off duty, it was a network issue! We still laugh about that day!

  2. A Non e-mouse Silver badge


    Whilst the article talks about the MTTF getting longer, I wonder what the MTBF is? If it's not changing, then clearly the complexity isn't worth it. But if the MTBF is increasing then you need to closely look at the trade-offs.

  3. ecofeco Silver badge

    I NEVER get tired of saying it

    So how's that cloud thing working for ya?

  4. Alexander Caplan

    User Error

    “Power failure” as in…

    1) Incorrectly commissioned uninterruptible power

    2) Ignored failure of redundant PSUs

    3) Stale UPS battery packs

    4) Poorly maintained diesel generators

    5) Leaks from incorrectly commissioned fire suppression

    6) Failure to properly balance power loads

    I’ve observed them all over the years.

    And a host of other totally unrelated and avoidable issues noted down as “power events”.

    In essence, the vast majority of outages result from human error/negligence.

    A problem that’s likely to only get worse as the tenure of experienced staff decreases, along with the conscientiousness of the next gen of “worker”.

