Sysadmin left finger on power button for an hour to avert SAP outage

Anonymous Coward

Big cooling tower cluster burns down, hundreds of servers shut down by hand

At a TOP2000 company in 1999. Three on-site data centers (one main, two backup) were powered by a big on-site oil power plant, and additionally connected to the power grid by two (redundant) power lines. The data centers had a cooling tower cluster building nearby. The cooling towers ran out of water, and the rotating parts inside caused them to catch fire. To aid the on-site firefighters, the internal power plant had to be shut down; they basically shut off all electricity on site, but completely forgot about the data centers. The data centers automatically fell back to battery power, which would last for 15 to 20 minutes. The whole complex was evacuated, but the admins refused to obey the order and stayed in to shut down hundreds of servers one by one, to prevent serious problems with the SAP R/2 and SAP R/3 clusters and the Oracle databases. The cooling tower building burned down to the ground, and with it many cars in the nearby car park.

Unfortunately, they learned little from the incident. They rebuilt a carbon copy of the cooling tower cluster building, and the admins kept the non-automatic, monkey-patched shutdown procedure. 15 years on, almost the same incident happened again. This time the power lines to the grid got overloaded and burned through, the power plant turned off automatically, and the data centers switched to battery. The on-site telephone system was by now Cisco IP phones, and had no backup battery. So no phones. The cell phone tower was on-site, powered by the same power line, so no cell phone coverage either. So the only means of communication were the firefighters' few analog walkie-talkies. Admins had to shut down servers by hand, one by one, again, this time in even more of a hurry, as the backup batteries would only last for 15 minutes and many more servers and virtual servers had been added in the meantime. Obviously, this time only some servers were shut down gracefully, including what the admins thought was important (SAP), but not Oracle, Microsoft, Cisco, etc. Oops.

Have they changed a thing? Probably not.
