Can gamers teach us anything about datacenter cooling? Lenovo seems to think so

It's no secret that CPUs and GPUs are getting hotter. The thermal design power (TDP) for these components is already approaching 300W for mainstream CPUs, and next-gen datacenter GPUs will suck down anywhere from 600W-700W when they arrive later this year. Liquid cooling technologies developed decades ago for use in mainframe …

  1. Loyal Commenter Silver badge

    Isn't it stating the obvious that liquid cooling is more effective than air cooling?

    Simply from the observation that the heat capacity of water (the liquid usually used for this, plus detergents, anti-mould agents and such) is much, much higher than that of air, and it's a lot easier to get a laminar flow through a pipe than it is to blow air through a case.
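
    A rough back-of-the-envelope sketch of just how lopsided that comparison is, using standard textbook figures (not numbers from the article):

    ```python
    # Volumetric heat capacity of water vs air, approximate values at ~20 C.
    water_cp, water_rho = 4186.0, 998.0   # J/(kg*K), kg/m^3
    air_cp, air_rho = 1005.0, 1.2         # J/(kg*K), kg/m^3

    water_per_m3 = water_cp * water_rho   # J per cubic metre per kelvin
    air_per_m3 = air_cp * air_rho

    print(f"Water: {water_per_m3 / 1e6:.1f} MJ/(m^3*K)")
    print(f"Air:   {air_per_m3 / 1e3:.1f} kJ/(m^3*K)")
    print(f"Water carries ~{water_per_m3 / air_per_m3:.0f}x more heat per unit volume")
    ```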

    How is it only now that data centres are considering moving heat about effectively, rather than blowing air about, and then using secondary heat-pumps, in the form of air-conditioners, to cool the room down?

    Rather than being behind the curve in terms of what gaming rigs manage, why aren't they looking at hooking those refrigerant circuits directly into the racks, and onto the components that need direct cooling? Because one thing that is more effective than pumping water about is pumping a refrigerant that changes phase when it is heated, and releases that heat when it is re-condensed, usually outside the building.

    1. devin3782

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      In case anyone is wondering, I've found the best anti-mould/corrosion agent is Sentinel X100, which is what we put in our central heating systems, especially as the average central heating system contains the same mix of metals, seals and plastics as computer water cooling.

    2. MJB7

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      It is stating the obvious that liquid cooling is more effective than air. The more interesting statement though, is that it is becoming _cheaper_. That is a (big) change - liquid cooling for mainframes went out of fashion because it was more expensive than air cooling. (Switching from ECL to CMOS helped, a lot, by reducing the amount of heat you needed to get rid of.)

      1. Anonymous Coward
        Anonymous Coward

        Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

        Exactly. Which is cheaper, easier to install, easier to maintain, and causes fewer problems if something comes loose: air cooling or water?

        Water is only used if you need it, because air cooling is good enough. Air cooling doesn't destroy a rack if it leaks.

        It doesn't require each rack to be plumbed into the cooling system.

        If you don't need water cooling, you don't go with it; it's more complex and prone to failure.

        If you don't strictly need it, but it will save you money, then you evaluate whether the cost benefits outweigh the risks.

        1. Anonymous Coward
          Anonymous Coward

          Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

          Liquid is not just water cooling, and a good design need not be limited to a single working fluid.

          Water is cheap and efficient at moving heat around, but lots of electronics react poorly to direct contact with it :), and it can obviously be corrosive, especially in long-term use. That is why many liquid systems opt for another working fluid like mineral oil: less likely to cause catastrophic damage in a spill, not especially toxic, and still much more efficient than air.

          You can still use water on your secondary loop; I was a fan of a bottom-of-rack heat exchanger with low-pressure lines. If you already have AC running in the room you are probably dealing with the potential for leaks and condensation/humidity anyway, as with the sprinklers that may be in the fire system.

          (And don't think a halon zone in the room gets you off the hook: if the sprinklers kick in next door, or worse upstairs, you will discover two things. Water ALWAYS wins, and it flows in surprising ways at those pressures and flow rates. Keep it low pressure, low in the room, and check the drains often, though, and you can efficiently move a lot of heat out of a rack.)

    3. dgeb

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      Integrated liquid cooling absolutely is a thing, but it isn't particularly common for a number of reasons:

      - Most data centres have a mix of equipment: specialised systems from vendors, a few generations of server hardware, networking gear, power handling equipment etc. Unless all of it supports liquid cooling, you need to implement an effective air cooling system across most of the datafloor anyway. At that point, adding liquid cooling is just extra expense.

      - Risky failure modes - liquid cooling everything means an awful lot of plumbing and manifolds, and a lot of connections to all the systems. Leaks are bad news. Keeping all the lines air free is also demanding.

      - Flow management - it's hard for an individual system to manage its cooling demand, so you're likely to be over-cooling those systems not running at full capacity, which will reduce efficiency.

      - Air handling efficiency can be improved significantly with simple implementations like enclosing hot/cold aisles.

      - Systems often contain a bunch of components which passively use the airflow, but don't demand enough power to justify a liquid-cooled variant - like RAID cards, or NICs, or SAS expanders, or... - so it's difficult or expensive to make those general-purpose systems work. These aren't considerations on desktop-type systems because there's still substantial passive cooling capacity to tolerate it.

      - Datacentre operators often are in a position of supporting what their users want to run, rather than dictating (even for in-house DCs, but doubly so for colocation providers).

      1. Gotno iShit Wantno iShit

        Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

        - Risky failure modes - liquid cooling everything means an awful lot of plumbing and manifolds, and a lot of connections to all the systems. Leaks are bad news. Keeping all the lines air free is also demanding.

        The article specifically mentioned closed loop liquid coolers and described what they are. There are no manifolds and there is no air problem because they are a sealed unit manufactured off site. In function they move a lot of heat from a small area to a larger area where air cooling can easily deal with it. They are much closer to a heat pipe cooler than to a pipes, manifolds & reservoir type system.

        Traditional = CPU -> Heatsink & fan

        Intermediate 'gamer technology' = CPU -> heat pipe -> heatsink & fan in chassis

        or CPU -> closed loop system -> radiator & fan in chassis

        Full water cooling = CPU -> water system -> bulk chillers outside the rack

        This article is about bringing the intermediate options to the data centre. The only problem IME is that the fluid in a closed system does degrade over time, but the timescale is probably greater than the system replacement time in a bleeding edge data centre.

    4. bombastic bob Silver badge
      Devil

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      "pumping a refrigerant that changes phase when it is heated, and releases that heat when it is re-condensed"

      essentially mounting the chiller coil of a mini-fridge to the CPU to cool it. I was considering mentioning that, but you beat me to it.

      Although the physical size and heat capacity of Peltier devices are a bit limited (for now), this may be the REAL future. Combine a Peltier device with some kind of liquid or "fat heat pipe" cooling system and you may be able to improve the cooling capacity even more, without the need for refrigerant phase changes and the pumps that make them happen.

      Then (maybe) the Peltier device is built into the CPU package? It would eat more juice but could greatly improve cooling ability, at the expense of having to move even MORE heat out of the cabinet (but more effectively).

      1. lnLog

        Peltier elements are horrendously inefficient for shifting heat; the best use cases are for precision temperature control in lasers etc.

      2. Rattus
        Meh

        Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

        Peltier is VERY inefficient.

        That said, most data centres already use it.

        It is common to point cool lasers and photodiodes this way, so expect 1 or 2 peltier devices per fibre-optic network device.

        They are great for that task (achieving sub-zero temperatures for small areas that are low power emitters themselves), but to dissipate several hundred watts, really no. Furthermore, the distance you can transfer heat this way is very small as well: think 1 or 2 mm, not the 500 mm or so needed in a real-world environment.
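
        To put rough numbers on it (the COP here is an assumed ballpark for a TEC working across a useful temperature difference, not a measured figure):

        ```python
        # Why a Peltier (TEC) stage doesn't scale to a whole CPU.
        cpu_heat_w = 300.0    # heat to pump off the package, watts
        assumed_cop = 0.7     # heat moved per watt of electrical input (assumption)

        tec_power_w = cpu_heat_w / assumed_cop    # what the TEC itself draws
        hot_side_w = cpu_heat_w + tec_power_w     # what the heatsink behind it must shed

        print(f"TEC electrical input: ~{tec_power_w:.0f} W")
        print(f"Heat to reject on the hot side: ~{hot_side_w:.0f} W")
        # A vapour-compression chiller doing the same job runs at a COP of around 3-5.
        ```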

        /Rattus

    5. Binraider Silver badge

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      The LVAC requirements of an average datacentre, and the cost of the power behind them, are already pretty extortionate.

      There are certain areas of distribution networks that can't accommodate any more power without building bigger busbars and connections to the transmission grid. That's a VERY expensive way of delivering any more compute over and above what's already running.

      Anything to reduce the LVAC bill is a very good thing, because the LVAC bill might well be more than the cost of running the CPUs, especially in the summer.

    6. druck Silver badge

      Re: Isn't it stating the obvious that liquid cooling is more effective than air cooling?

      Regardless of how effectively you liquid or air cool a gamer's high-end CPU + GPU, that heat has to go somewhere, and unlike a car your computer isn't located outside (usually). The heat from such a PC, which could be well over 1 kW, is going to heat up a small or medium-sized room pretty quickly, and while that might be welcome this winter while we are shivering without gas, come next spring you'll need air conditioning - using even more energy.
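
      As a very rough illustration (the room size is an assumption, and real walls and furniture soak up a good chunk of the heat, so the true rise is slower):

      ```python
      # How quickly ~1 kW warms the air in a closed room, losses ignored.
      power_w = 1000.0    # gaming PC dumping ~1 kW into the room
      room_m3 = 40.0      # e.g. 4 m x 4 m x 2.5 m (assumption)
      air_rho = 1.2       # kg/m^3
      air_cp = 1005.0     # J/(kg*K)

      air_mass_kg = room_m3 * air_rho
      rise_per_min = power_w * 60.0 / (air_mass_kg * air_cp)
      print(f"~{rise_per_min:.1f} C per minute of air temperature rise")
      ```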

  2. Anonymous Coward
    Anonymous Coward

    Think they're tackling the wrong problem

    Designers really need to be looking to lower power consumption in new generations of chips, not increase it.

    1. Anonymous Coward
      Anonymous Coward

      Re: Think they're tackling the wrong problem

      Even if the overall power doesn't increase, having a smaller die so your watts per unit area go up causes issues.

    2. Rattus
      Holmes

      Re: Think they're tackling the wrong problem

      They are. Power consumption has dropped considerably for each operation [1].

      The reason that servers consume more power is that they are doing considerably MORE work in the same time frame, so whilst power consumption per operation is a fraction of what it once was, the number of operations a machine is able to perform in the same time-frame has increased faster [2].

      Whilst the computing efficiency has gone up, we are also asking for more computations to be carried out in the first place - perhaps we should be asking whether the tasks we want all these machines to perform are really needed or wanted?

      /Rattus

      [1] I will not say per instruction, because some instructions, on some architectures, consume more power than others - although that one instruction may be more complex and be equivalent to, and more efficient than, the several instructions that would have been required on a preceding generation machine.

      [2] We have now gone massively parallel, in that a fully loaded system can have 196 or more cores on a single 'chip' able to carry out the equivalent work in a 1U box that 20 years ago would have taken the entire rack or more...

      1. Anonymous Coward
        Anonymous Coward

        Re: Think they're tackling the wrong problem

        20 years ago... O.K. but, what about 10?

        Moore's Law was a guess, and the guess was wrong, but what isn't wrong is the law of diminishing returns. Recently the phone market has felt this hard.

    3. Anonymous Coward
      Anonymous Coward

      Have to solve for both

      As others have pointed out, if you build lower power parts they jam more of them in a rack. At the end of the day you want to be as efficient as is cost effective on both fronts, though DCs tend to lag on implementing new methods that cross the rack/server chassis barrier, due both to conservative thinking and a desire for modularity. Datacenter spaces with tenants are less likely to spend money on changes that impact the inside of a chassis than on the building support systems and rack level stuff.

      So hot/cold aisle optimizations that dictate which way servers face in their racks are a mostly internal decision; a liquid cooling loop means talking to customers and selling them on having to buy from a much smaller list. It's a different story for in-house resources in many cases, as the infrastructure team can make their case to fewer stakeholders. In most cases the power to the DC and the existing racks is pretty much maxed out already, so efficiency gains on the air-con side mean an essentially free gain for compute. Considering the thermal losses in many locations are pretty large, this has been a major lost opportunity for the industry for the last few years.

      Closed loop systems will let some of those sites continue to squeeze gains out of their racks without changing the building infrastructure, and many of the closed loop systems don't use water, minimizing the spill risks. Most are likely to be maintenance-free for the life of the motherboards and CPUs.

  3. Caver_Dave Silver badge
    Boffin

    Cost is key

    I used to work in the rugged board industry.

    Think conduction (board edge) cooled, -45 to +85C ambient temperature, with many of the current generation high-end processor and graphics devices.

    Your water/refrigerant/etc. is contained in the solid rack holding the boards and so is kept well away from the electronics.

    However, this all costs the sort of money which restricts the markets to those that can bear the expense.

    My high-end home PC is all conduction cooled (with off-the-shelf parts) and solid state discs - so silent. :-) (about 20% more cost than a noisy behemoth!)

  4. TRT Silver badge

    I have seen a total immersion cabinet at a trade show. That was pretty impressive! Needed to keep the coolant at exceptionally high resistance though - exceptionally pure. It did have a massive specific heat capacity.

    1. ArrZarr Silver badge

      There are mineral oil PCs around that work the same way. The specific heat capacity, however, is only as good as your means of getting heat back out of the liquid if you're running the hardware at full speed all the time.

      That's the downside to liquid cooling - you can run hotter for longer, but if you oversaturate the ability of the cooling system to get heat out of the coolant, then you're still stuck at exactly the same bottleneck as with the stock cooler that came out of the same box as your CPU.
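
      A simple energy-balance sketch of that point, with illustrative numbers rather than anything vendor-specific:

      ```python
      # The loop only helps while the radiator rejects heat as fast as it arrives.
      heat_in_w = 600.0    # CPU + GPU flat out (assumption)
      flow_lpm = 4.0       # pump flow, litres per minute (assumption)
      coolant_cp = 4186.0  # J/(kg*K), treating the coolant as water
      coolant_rho = 1.0    # kg per litre

      mass_flow = flow_lpm * coolant_rho / 60.0       # kg/s
      delta_t = heat_in_w / (mass_flow * coolant_cp)  # temperature rise across the blocks
      print(f"Coolant leaves the blocks ~{delta_t:.1f} C hotter than it went in")
      # If the radiator can't shed 600 W at that temperature, the whole loop
      # creeps upwards until the silicon throttles anyway.
      ```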

      1. TRT Silver badge

        *IF*

        It is, however, easier to sort that kind of thing out. Extra chiller units, for example. And it's easier to spot problems like that in advance, instead of getting "hot spots" in an air-cooled datacenter. It also makes your DC far less noisy by getting rid of the fans.

    2. Inspector71

      What's old is new again. The Cray supercomputers had liquid cooling systems from the '70s on, and full immersion cooling by the time of the Cray-2. I seem to remember that the Cray-1 had a bench seat as a cooling system accessory.

  5. Pascal Monett Silver badge

    "you'd run out of power before your rack is half full"

    Okay, for me that begs the question: how many 1U units can a rack hold?

    Because if 2 kW units are put in a 42 kW rack, it should hold at least 21 of them.

    But if 21 units is less than half the rack's capacity, that would mean that a rack can hold in excess of 42 units.

    Did I get that right ? How tall are those racks anyway ?

    1. dgeb

      Re: "you'd run out of power before your rack is half full"

      Rack heights from 42U to 48U are normal. Taller than that exist but are rare for server racks.
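
      So, with the (assumed) numbers above, the arithmetic works out to the power running out at about the half-way mark:

      ```python
      # Space vs power budget for a rack of 1U servers (illustrative figures).
      rack_u = 42              # a common rack height
      rack_power_kw = 42.0     # power delivered to the rack (assumption)
      server_kw = 2.0          # a dense 1U box (assumption)

      fit_by_space = rack_u                          # one 1U server per U
      fit_by_power = int(rack_power_kw // server_kw)

      print(f"Space allows {fit_by_space} servers, power allows {fit_by_power}")
      print(f"Rack is ~{100 * fit_by_power / fit_by_space:.0f}% populated when "
            f"the power budget runs out")
      ```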

  6. Anonymous Coward
    Megaphone

    We're there now

    With Nvidia having problems with its high power graphics cards melting the power supply cables/connectors (https://www.theregister.com/2022/10/25/nvidia_rtx_4090_so_hot/), data centers need to address this now.

  7. martinusher Silver badge

    Yet another novel round thingy...

    When I graduated as a newly minted Mechanical Engineer in 1970 I needed a job and found one working at the local computer company as a thermodynamicist. Cooling was a big deal in mainframes of that era, so it had to be designed in. The job itself didn't last long -- like a lot of leading edge ICL projects it was soon canceled, and being a loose engineer I was put to work designing logic (because that's what all engineers did, right?), and the rest was more or less history.

    I've always maintained that the PC was an architectural disaster, setting back the development of computers for decades by disguising the architectural shortcomings of computers with cheap and relatively low power parts. The PC Revolution just took a simple architecture and refined it over and over. Everyone forgot the problems because the parts worked more than adequately and we were making gobs of money. People ignored problems with the incantation of "Moore's Law" and maxed out whatever generation of computer they were using, kicking the can down the road because there was always going to be more road. It's only people like gamers who started to notice the limitations, because unlike the rest of us - people who really don't need supercomputers to write office communications, and who don't really notice when the processor throttles back to avoid becoming a molten glob - they need every cycle they can lay their hands on.

    Liquid cooling is an obvious choice. It's not that convenient, but it's how high power electronic installations work. A broadcasting station will often have a cooling pond somewhere, maybe disguised as an ornamental lake complete with fountain. But ultimately we just have to find a way of doing our computing without using huge amounts of power.

    1. TRT Silver badge

      Re: Yet another novel round thingy...

      Or making better use of low-grade waste heat. District heating for example. Agriculture.
