Brain-inspired neuromorphic computer SpiNNaker overheated when coolers lost their chill

The brain-inspired SpiNNaker machine at Manchester University in England suffered an overheating incident over the Easter weekend that will send a chill down the spines of datacenter administrators. According to Professor Steve …

  1. TRT

    Perhaps they should have modelled Acomys russatus? Able to operate in desert temperatures exceeding 42°C

    1. Anonymous Coward

      Had the same idea ... but I hear it's fully booked serving as hairpiece to the Orange Incubus ... lotsa wasted heat to dissipate from that overagitated empty skullcap in the oblong orifice.

      1. Anonymous Coward

        Or, they could have instead emulated the aforementioned Orange man.

        Far fewer neurons needed than for a mouse!

    2. kotenok2000

      Or model a Portia genus spider? They are remarkable for their intelligence, and do that with 100,000 neurons. For comparison, a mouse brain has about 70 million neurons.

      https://en.wikipedia.org/wiki/Portia_(spider)#Intelligence

      1. TRT

        I had to refresh my memory of these little beasties. Remarkable things aren't they?!

        And very modern.. according to Wikipedia they implement a "cryptic REST posture", which of course is de facto now when it comes to API security.

  2. An_Old_Dog Silver badge

    Auto-Slowdown/Shutdown Systems

    ... were not fully implemented here because HVAC failures, brownouts, and UPS/line-conditioning failures never happen in the real world.

    1. that one in the corner Silver badge

      Re: Auto-Slowdown/Shutdown Systems

      Oh, be fair: it was two British Bank Holidays in a row, perfectly reasonable to expect cold weather, high winds and rain. Snow at Easter isn't unknown. They probably had emergency temperature control plans: an undergraduate poised to run in with a space-heater.

      1. HuBo Silver badge

        Re: Auto-Slowdown/Shutdown Systems

        Yeah, and lucky this didn't happen a week later during the (could've been) end of times UK heatwave apocalypse and resultant Iberian peninsula synch fail blackout armageddon! Especially if this mouse-brain neuromorph had been a GPU-driven computational fusion energy megaspace heater of doom, rather than power-sipping spike machinery ...

        Safe to say we've averted a major "New-Zealand China Syndrome" on this one, with Manchester core meltdown shooting cataclysmic jets of devastation, all over Wellington NZ. Missed it by that much!

        I hope they get the remaining 20% of the Muridae-brain beastmachine back up soon, but things could have been just so much worse imho ...

      2. Red Ted

        Re: Auto-Slowdown/Shutdown Systems

        "Snow at Easter isn't unknown."

        In the UK it is statistically more likely than at Christmas.

  3. Robert Carnegie Silver badge

    "And so

    We

    Had a cup of tea,

    And - "

  4. Kevin McMurtrie Silver badge

    That must have been toasty

    Conventional heat tolerant electronics starts malfunctioning at 90 to 100 C but immediate damage doesn't happen until it's much hotter. Since the cooling fans were running, it's surprising that all that heat didn't trigger fire suppression.

    Or maybe no fire suppression either?

    1. Mast1

      Re: That must have been toasty

      "electronics starts malfunctioning at 90 to 100 C"

      It depends on where you are referencing the temperature.

      It's usually the semiconductor temperature that is more important/relevant to device life.

      In a previous employment we had semiconductor packaging sitting on a 60C heater plate, but with intended semiconductor temp well in excess of 200C with a long MTBF.

      Since we could not wait that long to measure MTBF, we ran them at higher temperatures, well in excess of 300 C where they lasted tens of hours.

      But that was a slightly specialist operation.....

      1. andy the pessimist Bronze badge

        Re: That must have been toasty

        Commercial grade devices should operate between 0 and 70 C for a long time. Multiple years.

        When the junction temperature goes above 100 C, the device moves through the lifetime bathtub. Performance to spec will be poor (fails). High temperature can work for a thousand hours (HTOL); the Arrhenius equation will give an estimate of the lifetime.

        Device functionality is not guaranteed above 70 C.

        I just test the devices not qualify them.
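The Arrhenius estimate mentioned above can be sketched roughly as follows. This is only an illustration, not anyone's qualification procedure: the activation energy of 0.7 eV, the function names, and the example temperatures are all assumptions chosen for the sketch, and real values depend on the failure mechanism and the part.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c, t_stress_c, ea_ev=0.7):
    """Arrhenius acceleration factor between a use and a stress temperature.

    Temperatures are junction temperatures in degrees Celsius.
    ea_ev is an ASSUMED activation energy; it varies by failure mechanism.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def equivalent_use_hours(stress_hours, t_use_c, t_stress_c, ea_ev=0.7):
    """Hours at the use temperature represented by a stress test of stress_hours."""
    return stress_hours * acceleration_factor(t_use_c, t_stress_c, ea_ev)


if __name__ == "__main__":
    # Hypothetical example: 1000 h of HTOL at 125 C, part normally run at 55 C.
    print(equivalent_use_hours(1000.0, t_use_c=55.0, t_stress_c=125.0))
```

The factor is 1 when both temperatures are equal and grows quickly as the stress temperature rises, which is why a thousand hot hours can stand in for years of normal use.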

    2. Alan Brown Silver badge

      Re: That must have been toasty

      Every server room I've worked in has been fitted (either originally or at my insistence) with a high temp room crowbar, usually set around 40C

      It's simply a thermostat linked to the emergency stop button

      Hard shutting down the power is preferable to cooking the hardware and 40C room temp is usually 70-80 at the actual semiconductor junctions. AC is one of the least reliable things in a computer centre and making sure you can prevent destruction of potentially millions of pounds worth of hardware (and more importantly, data) is fairly important. Back in the old days an overheating room would regularly result in disk drive head crashes
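The logic of such a crowbar is small enough to sketch in a few lines. This is a software illustration of the behaviour only, assuming a latching trip like a real emergency stop; the class name and methods are made up for the example, and an actual installation would be a hard-wired thermostat rather than code.

```python
class ThermalCrowbar:
    """Latching over-temperature trip, modelled on a thermostat wired to an e-stop."""

    def __init__(self, trip_c=40.0):
        self.trip_c = trip_c    # room temperature at which power is cut
        self.tripped = False    # stays True until someone resets it by hand

    def update(self, room_temp_c):
        """Feed one room temperature reading; return True if power must be off."""
        if room_temp_c >= self.trip_c:
            self.tripped = True
        return self.tripped

    def reset(self):
        """Manual reset, as with releasing an emergency stop button."""
        self.tripped = False


if __name__ == "__main__":
    crowbar = ThermalCrowbar(trip_c=40.0)
    for reading in (32.0, 38.5, 41.0, 36.0):
        print(reading, crowbar.update(reading))
```

The latch is the point: once the room has hit the trip temperature, the power stays off even if the reading drops again, until a person has checked the room and reset it.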

  5. fg_swe Silver badge

    EE Department + 100 Pound of Material

    An RPi, an A/D converter plus an NTC. A transistor + relay to cut the power in case of overheating. Report temperature via TCP/IP.

    The EE guys can build it for the CS professor.
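The measurement half of that idea might look something like the sketch below. Everything specific here is an assumption for illustration: a 10 kOhm NTC with a beta of 3950, a 10 kOhm series resistor with the NTC on the ground side of the divider, a 10-bit ADC, and a 40 C trip point. Reading the real ADC, driving the relay, and the TCP/IP reporting are left out.

```python
import math


def ntc_resistance_ohms(adc_count, adc_max=1023, r_series=10_000.0):
    """NTC resistance from an ADC reading of a voltage divider.

    ASSUMES the NTC sits between the ADC node and ground, with r_series
    between the ADC node and the reference voltage.
    """
    if not 0 < adc_count < adc_max:
        raise ValueError("ADC reading at a rail: sensor open or shorted?")
    ratio = adc_count / adc_max
    return r_series * ratio / (1.0 - ratio)


def ntc_temperature_c(r_ntc, r0=10_000.0, t0_c=25.0, beta=3950.0):
    """Beta-equation estimate of temperature for an NTC thermistor."""
    t0_k = t0_c + 273.15
    inv_t = 1.0 / t0_k + math.log(r_ntc / r0) / beta
    return 1.0 / inv_t - 273.15


def should_cut_power(adc_count, trip_c=40.0):
    """True if the measured temperature is at or above the trip point."""
    return ntc_temperature_c(ntc_resistance_ohms(adc_count)) >= trip_c


if __name__ == "__main__":
    for count in (512, 300, 200):
        temp = ntc_temperature_c(ntc_resistance_ohms(count))
        print(count, round(temp, 1), should_cut_power(count))
```

Treating a reading stuck at either rail as an error matters for a safety device, since a broken or disconnected sensor should not be mistaken for a cool room.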

    1. the spectacularly refined chap Silver badge

      Re: EE Department + 100 Pound of Material

      I think he could do it himself, he's a hardware guy. Among other things, he designed the ultimate ancestor of the chip powering the Pi, and he (literally) wrote the book on the architecture.

      Perhaps he understands something you don't?

      1. fg_swe Silver badge

        Re: EE Department + 100 Pound of Material

        He understands he is too lazy to build a simple overheating shutdown system?

        1. the spectacularly refined chap Silver badge

          Re: EE Department + 100 Pound of Material

          Come back when you've built one for a million core supercomputer.

  6. Kev99

    I've seen pix of mainboards submerged in various fluids to help with cooling. I wonder whether it would be possible to set up atomizers / misters to spray over the components to keep them cool.

    1. Alan Brown Silver badge

      Kind of.

      It's known as a swamp cooler...

  7. RegisteredOnTheRegister

    Don't Panic!

    Don't worry, Steve told me (he was getting emails from the temperature sensors) and I shut off the boards and the servers... Unfortunately, the biggest contributor to heating the room is the fans on the Chillers, so when they are not chilling, they are heating. Ideally the chillers would switch off when the water temperature is too high, but they don't. So then someone else went in and shut the power off manually.
