
Sod's (Murphy's) Law
Never make a statement such as "we have not had any downtime in the past ..... years/months/weeks." Doing so is guaranteed to make something bad happen.
A major central London data centre owned by Level 3 overheated in Sunday's sunshine, taking down major websites including the popular music service Last.fm. Temperatures at the facility in Braham Street in the City topped 50°C at 7pm, after one of five chillers failed in the afternoon heat. According to data recorded at City …
Clarify Case# 3421048
Classification: Hazardous
Date GMT: 31st May 2009 11:30 GMT (12:30 BST)
Location: Braham Street, London
Description: Braham Street Chiller 1 failure
Start Time of event: 31st May 2009 12:50 GMT
Time of Resolution:
Duration of event:
Chiller No.1 at Braham Street, London is currently out of service due to loss of gas. As a result, the remaining chillers 2, 3, 4 and 5 are reporting high temperature alarms.
All floors within the Braham Street site are affected.
The current temperature on the floors is as follows:
5th Floor – 27.9 degrees C
4th Floor – 27.6 degrees C
3rd Floor – 26.7 degrees C
2nd Floor – 26.2 degrees C
1st Floor – 24.3 degrees C
Action plan:
Level 3 staff and technicians are on site.
The Braham Street air conditioning plant is stable.
Temperatures are gradually coming down.
The integration of the temporary chiller is still ongoing at this time.
The repair of the faulty chiller will commence this morning.
The temporary chiller has been connected to the existing system.
The temporary chiller is being closely monitored before being activated.
Further updates will follow
Regards,
European Technical Service Centre
Level 3 Communications LLC
Mail: eusupport@level3.com
Clarify Case# 3421048
Classification: Hazardous
Date GMT: 31st May 2009 12:30 GMT (13:30 BST)
Location: Braham Street, London
Description: Braham Street Chiller 1 failure
Start Time of event: 31st May 2009 12:50 GMT
Time of Resolution:
Duration of event:
Chiller No.1 at Braham Street, London is currently out of service due to loss of gas. As a result, the remaining chillers 2, 3, 4 and 5 are reporting high temperature alarms.
All floors within the Braham Street site are affected.
The current temperature on the floors is as follows:
5th Floor – 27.9 degrees C
4th Floor – 27.7 degrees C
3rd Floor – 26.5 degrees C
2nd Floor – 26.0 degrees C
1st Floor – 24.2 degrees C
Action plan:
Level 3 staff and technicians are on site.
The Braham Street air conditioning plant is stable.
Temperatures are gradually coming down.
The integration of the temporary chiller is still ongoing at this time.
The repair of the faulty chiller will commence this morning.
The temporary chiller has been connected to the existing system.
The temporary chiller is being closely monitored before being activated.
The monitoring and testing phase of the addition of the temporary chiller is still ongoing.
Engineers are on site and working on the problem with chiller 1.
Further updates will follow
Regards,
European Technical Service Centre
Level 3 Communications LLC
Mail: eusupport@level3.com
Outside air temp of 23°C and they can't cope? Wait till it hits 38°C!
They need the capacity of all 5 chillers just to cope with a warm ambient external air temperature, and they don't have any spare capacity in the event of even a single chiller failing? (Quick sums below.)
Want my advice? Customers should take their business elsewhere, away from Level 3.
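A back-of-envelope sketch of that point, with invented figures (the per-chiller capacity and IT heat load below are assumptions for illustration, not Level 3's actual plant numbers):

    # Rough N+1 capacity check for a chiller plant. Figures are assumptions
    # chosen to illustrate the problem, not Level 3's real numbers.
    CHILLER_KW = 500                                     # assumed capacity of each chiller
    NUM_CHILLERS = 5
    IT_LOAD_KW = 2400                                    # assumed heat load from the kit

    total_kw = NUM_CHILLERS * CHILLER_KW                 # 2500 kW with all five running
    after_failure_kw = (NUM_CHILLERS - 1) * CHILLER_KW   # 2000 kW with one down

    print(f"Headroom with all chillers: {total_kw - IT_LOAD_KW} kW")
    print(f"Headroom after one failure: {after_failure_kw - IT_LOAD_KW} kW")
    # With these numbers the site needs 4.8 of its 5 chillers just to break even,
    # so losing a single unit leaves the floors warming up - which is exactly
    # what the Level 3 alerts describe.

A properly sized N+1 plant keeps that second figure positive.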
Sounds like someone has messed up the design of the heating and ventilation system - that, or over time they've put more and more equipment into the data centre and exceeded its rated cooling capacity.
When is a chilling or heating system most likely to fail? When it's being heavily used - summer for a chiller, winter for a heater. It's the same with a light bulb: it fails when you turn it on!
"Yes, we have double redundant power supplies to all our servers, our hard drives all use mirrors, but we don't have redudant air cooling capacity, so if it gets warm in summer our servers will go down and you will lose business".
When selecting a service provider do companies really want to get involved in assessing the providers environmental systems? !!
There is a lovely data centre in Newbury that is an old nuke bunker; the only real way heat can get in there is the front door (which is about a foot thick), and they have lovely staff there too. The company also has one in Kent. They are a lot nicer than our previous data centre in the Docklands, apart from the radiation suit that you have to put on to visit your servers.
There are also big plans to build server farms at Keflavik in Iceland.
It's rarely more than 23°C outside, so the cooling is minimised; and Landsvirkjun, the Icelandic power company, has offered 20-year guarantees on power pricing. (Not to mention all that power is carbon-free.) Landsvirkjun have also said that all future big power projects will either be for server farms or for solar silicon production, shifting away from aluminium smelting, which has been their biggest customer so far. For the Icelanders that'd be a big benefit, since aluminium is a commodity whose price fluctuates crazily with the state of the world economy.
There's also plenty of opportunity to use Iceland's geographic position between the US and Europe as a way of shifting load from one side of the Pond to the other as the day passes.
Shame about the tits-up economy really.
They probably spent plenty on servers and neglected good HVAC design. Building and climate-control design should be considered with just as much forethought as the server racks contained within.
23°C is trivial. Using fresh-air intake, a sensible temperature could be maintained even WITH compressor failure. This is very poor indeed.
Think about it - such enclosed places are only cool because there aren't huge racks of electrical equipment heating them up. The number of times I've had customers suggest "why don't we put our server in <some cool at the moment but with no ventilation place>?", and I have to explain that when they stick a few hundred watts (or in some cases a kilowatt or two) in a small cupboard with no ventilation, it won't be long before it ceases to be nice and cool.
As an aside, our server room is cooled entirely by ambient air - all it takes is some big fans. I do need to persuade the boss to spend money on some more fans, as we've reached our limit now. This summer could be interesting, but we are just about managing this heat :-(
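For what it's worth, the sums behind a fan-only room are straightforward sensible-heat arithmetic. A minimal sketch, assuming a 10 kW load and an 8°C allowable rise above ambient (both invented figures):

    # Outside-air ("free") cooling: volume flow needed to carry away a heat load,
    # using Q = m_dot * cp * dT. Load and allowable rise are assumptions.
    AIR_DENSITY = 1.2      # kg/m^3, roughly, at room conditions
    CP_AIR = 1.005         # kJ/(kg*K)

    def airflow_m3_per_s(heat_load_kw: float, delta_t_c: float) -> float:
        """Outside-air volume flow needed so the room sits delta_t_c above ambient."""
        mass_flow_kg_s = heat_load_kw / (CP_AIR * delta_t_c)
        return mass_flow_kg_s / AIR_DENSITY

    print(f"{airflow_m3_per_s(10, 8):.2f} m^3/s")   # about 1 m^3/s - a couple of big fans
    # The catch: the room always sits that delta above the outside air, so at
    # 23 degC outside you get ~31 degC inside, and at 38 degC you get ~46 degC.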
I remember, _years_ ago, when I was working for a public utility in the Caribbean, how they designed their data centre: they had _three_ process coolers, any one of which could handle the full load (though only by running at red-line), and two of which were to be running at 60% max under normal conditions. If one failed, the third unit could be brought up and running in under five minutes. Actual testing showed that one unit could handle the load with an external temp of 35°C for 45 minutes before having problems from being overloaded. Under normal circumstances only two coolers would be running; the other would be on standby or undergoing maintenance... and maintenance would be done in December and January, for all three units in rotation, so that come June-July-August-September, when temps of 35°C and more could be expected, all three units would be _known_ to be working. Feckin' hell, if a Caribbean utility could see this 20 years ago, what's this lot's problem? Yes, it cost money to set up, but it was cheaper than what would happen if the system had to shut down...
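That sizing rule is easy to sanity-check. A small sketch, where the 300 kW load is an assumed figure rather than anything from that utility:

    # Per-cooler capacity needed to satisfy the two rules described above:
    # (1) any single cooler can carry the full load on its own, and
    # (2) the two duty coolers normally run at no more than 60%.
    def required_unit_capacity(load_kw: float, duty_units: int = 2,
                               max_normal_loading: float = 0.60) -> float:
        single_unit_rule = load_kw                                   # rule 1
        duty_rule = load_kw / (duty_units * max_normal_loading)      # rule 2
        return max(single_unit_rule, duty_rule)

    print(required_unit_capacity(300.0))   # 300.0 kW per cooler; rule 1 dominates here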
It was pretty, pretty hot in there last night.
Slightly odd, though - we were in, powering down servers (which we could have powered off remotely) and storage (which we couldn't), but the DC seemed almost empty. Most of the customers' kit that I saw stayed on, so I expect there were a few fried boxes today.
It was 45 degrees by the external walls but felt significantly hotter in the centre of the floor.
I just remembered: we don't have any air cooling in our server room - it doesn't even have a window.
What am I going to do? The weather is hotting up and my servers might fall over.
Maybe I should ask the boss for a portable aircon unit.
With the last 2 years not having a summer, the server room hasn't actually needed any additional cooling.
I might go into the office to see how warm it is tomorrow.
Bloody global warming/cooling/climate change/whatever
I feel the need to harass somebody
Flames, because I guess it's gonna be hot in our server room
Hmm. Mountain or hill caves seem to be the answer. Constant temp, little chance of flooding. So for a given set of servers, you need constant A/C, without having to worry about outside temperatures or sudden external loads, such as those from a heat wave or even the wrong kind of sunshine. So if you have overcapacity, it STAYS at overcapacity if one of the chillers goes down.
@Chris: see why underground might be more than just a gimmick?
5 chillers running flat out with no spare capacity... How to increase your chances of a critical failure by 500% (rough sums after this post).
Lucky they don't run a storage centre; they'd be striping all the data over 50 drives in RAID 0, which, let's face it, isn't really RAID (the R stands for Redundant, after all!).
Black helicopter? Well they don't care what colour it is, they just need the huge fan on the top!
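The rough sums behind that, assuming (purely for illustration) a 2% chance of any one chiller dropping out during a hot spell:

    # If the plant needs all five chillers, it is in trouble when ANY one fails.
    # p is an assumed per-chiller failure probability for a heatwave period.
    p = 0.02
    p_any_of_five = 1 - (1 - p) ** 5
    print(f"Single chiller: {p:.3f}")
    print(f"Any of five:    {p_any_of_five:.3f}")   # ~0.096, roughly five times worse
    # Same arithmetic as RAID 0: stripe over more members with no redundancy
    # and the chance of losing the lot scales with the member count.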
Got a boss that's too tight to pay for aircon in the server room?
Fear getting called out and expected to pull a whole weekend rebuilding everything when the drives cook?
Well do what I did...
Warn boss repeatedly. Bring it up at meetings. Make sure everyone knows you are concerned by the temperatures... Warn him that hard drive MTBF figures go completely out of the window when the drives are outside their recommended temperature ranges - which they all are right now. Don't forget to explain what MTBF is! (A rough worked example follows this post.)
Then, after being ignored for months, back up one of the systems critical to the boss, and then murder it... Pull the live drive if you want (careful not to get caught), though it's easier to just disable its networking in the OS or turn its port off on a managed switch remotely.
Suddenly he will listen to you. You can of course get the machine up and running again very quickly - you have it backed up (just don't do it too quickly; make him sweat as much as the machines have).
Investigate... Scratch your head. Poke about (milk it!)... Maybe order some spares (anything you might like at home basically!).
Might even be good for some paid overtime - schedule the death whenever you want.
You should have the aircon ordered within a week :-D
Maybe I should sign this BOFH.
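On the MTBF point above: a rough sketch using the common rule of thumb that electronics failure rates roughly double for every 10°C of extra heat. The rated figures below are invented, not from any drive vendor's datasheet:

    # Arrhenius-style derating: assumed doubling of failure rate per 10 degC
    # above the rated operating temperature.
    def derated_mtbf(rated_mtbf_hours: float, rated_temp_c: float,
                     actual_temp_c: float) -> float:
        acceleration = 2 ** ((actual_temp_c - rated_temp_c) / 10.0)
        return rated_mtbf_hours / acceleration

    # A drive quoted at 1,000,000 hours at 40 degC, cooking in a 55 degC room:
    print(f"{derated_mtbf(1_000_000, 40, 55):,.0f} hours")   # ~354,000 hours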
"Hmm. Mountain or hill caves seems to be answer. Constant temp, little chance of flooding. So for a given set of servers, you need constant A/C, without having to worry about outside temperatures or sudden external loads, such as those from a heat wave or even the wrong kind of sunshine. So if you have overcapacity, it STAYS at overcapacity if one of the chillers goes down.
@Chris: see why underground might be more than just a gimmick?"
Sigh, some people just don't get it, do they!
In these big datacentres, solar insolation is rarely part of the problem - it's a simple equation of heat in from the equipment vs heat out via the cooling system. I've just been having the same discussion with idiots at work: stick your datacentre in a cave and nothing changes in the steady state.
Yes, you'll absorb some heat into the surrounding rocks, but once they've warmed up a bit that plays no part in the "energy in vs energy out" equation. You STILL have a system that is reliant on your cooling, and when the outside temperature goes up, your chiller plant has to work harder - because the performance of chiller plant is heavily dependent on air temperatures.
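To put numbers on that "work harder" point: a chiller's coefficient of performance falls as the condenser-side air warms up. A minimal sketch using an idealised Carnot COP scaled by an assumed 40% real-world efficiency (all figures illustrative):

    # Approximate chiller COP vs ambient temperature (idealised; the 0.4
    # efficiency factor and 10 degC chilled-supply temperature are assumptions).
    def chiller_cop(supply_c: float, ambient_c: float, efficiency: float = 0.4) -> float:
        t_cold = supply_c + 273.15
        t_hot = ambient_c + 273.15
        return efficiency * t_cold / (t_hot - t_cold)

    for ambient in (15, 23, 30, 38):
        print(f"ambient {ambient:2d} degC -> COP ~ {chiller_cop(10, ambient):.1f}")
    # Output falls from roughly 23 at 15 degC to about 4 at 38 degC: every kW of
    # cooling costs more electricity and condenser capacity as the day heats up.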
What a lot of the nay-sayers seem to ignore is the concept of thermal mass.
With the sheer mass of rock surrounding a subterranean DC, there is an awful lot of heat that can be absorbed - often to the extent of weeks or months of normal heat load; rough numbers follow this post. (Google "specific heat capacity" for more information.)
Due to the large contact area of the rock, the heat ebbs away fairly easily into deeper rocks, especially if the rocks are damp.
If this isn't enough on its own to stop it long term, then sure, add some aircon, but you should be able to use the cool night air more than most other DCs, since the rocks soak up the 'coolth' as well as the warmth...
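A rough, assumption-heavy sketch of how much buffering that thermal mass buys (the rock mass, heat load and allowable temperature rise are all invented figures):

    # Ride-through time from rock thermal mass: energy = m * c * dT.
    ROCK_SPECIFIC_HEAT = 800       # J/(kg*K), typical for granite-like rock
    rock_mass_kg = 5_000_000       # assume ~5,000 tonnes in good thermal contact
    heat_load_w = 200_000          # assume a 200 kW facility
    allowed_rise_k = 5             # let the rock warm by 5 degC

    seconds = rock_mass_kg * ROCK_SPECIFIC_HEAT * allowed_rise_k / heat_load_w
    print(f"{seconds / 86_400:.1f} days of buffering")   # ~1.2 days with these figures
    # More rock in contact, and conduction into deeper (or damp) rock, stretch
    # this out a long way - it buys ride-through time when a chiller dies.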
The assumption that cooling underground means you're still dependent on outside air is just that - an assumption. Here are some facts...
http://www.techworld.com/green-it/news/index.cfm?newsID=10667
"The coolant will be ground water and the site's temperature is a constant 15 degrees Celsius (59 degrees Fahrenheit) all year, meaning no air-conditioning will be needed outside the containers. This reduces the energy required for the water chillers, used with surface-level Blackbox containers."
Note in particular not just the water cooling, but the relevant fact that a static temperature of 15 degrees in a very large open space reduces the need for cooling anyway.
Above-ground buildings struggle with cooling not only because of the outside air, but also because the buildings act like greenhouses. By locating underground, even if your cooling plant is above ground, your building is not being heated like a greenhouse. Of course there's a downside in that the temperature will not drop off in the night, but at least it's consistent.
It's a fact that places like disused mines (especially salt mines, which have vast caverns) and disused bunkers are being used by datacentre companies. Maybe they're stupid, but I suspect they've worked out the sense of it.
Though in part they're doing it to sell on security as well. Being underground is supposedly more "secure", but of course that wire sticking out of the ground is an open invitation to Mr Hacker ;)
Opening windows made me laugh - a fair few years back I worked for a large financial services company, and during the one summer I was there the A/C broke down in the server room. No backup cooling (!!!), and it was reasonably warm outside (25°C). The solution was to open all doors from the front desk through to the server room (security swipe doors included...) and open the top windows (onto the road) in the server room.
When this failed to bring the ambient temps below 50°C, someone had the bright idea of putting an oscillating fan into the room to try and cool things down. My comments on the laws of thermodynamics, and the reason why fans work for sweaty humans and not for servers, kinda got ignored - though eventually someone had the rather better idea of shutting everything down to save the hardware from going up in flames...
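For anyone tempted to repeat the fan trick: a minimal sketch of why it can't work, assuming a 100 m³ room and 20 kW of kit (both invented figures):

    # With no heat rejection, room air warms at a rate set by load and air mass;
    # the fan only stirs the air (and its motor adds a little heat of its own).
    AIR_DENSITY = 1.2    # kg/m^3
    CP_AIR = 1005        # J/(kg*K)

    room_volume_m3 = 100
    it_load_w = 20_000
    fan_w = 50           # the oscillating fan's own contribution

    air_mass_kg = room_volume_m3 * AIR_DENSITY
    rise_per_min = (it_load_w + fan_w) * 60 / (air_mass_kg * CP_AIR)
    print(f"{rise_per_min:.1f} degC per minute in a sealed room")   # ~10 degC/min
    # Walls, racks and open doors slow the rise in practice, but no amount of
    # air movement inside the room takes any heat out of it.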