OVH reveals it's scrubbing servers – to get smoke residue off before rebooting

French cloud operator OVH has revealed how it is cleaning every server it thinks can be returned to service in its scorched Strasbourg data centres. Founder and chair Octave Klaba used his Twitter account to show off some of the work being done by the company's clean-up crew. Update March 24, 6:30pm: The cleaning takes time. …

  1. spireite Silver badge

    Worth saying again......

    The cloud is not infallible.......

    It does NOT excuse you of your responsibilities......

    Stop treating it as utopia, where failure never occurs, where an admin can put his feet up and do nothing

    It's a datacenter, cloud is just a fancy name, old tech in new clothes.....

    Repeat the above 50 times.....especially upper management!

    1. Mr Sceptical
      Paris Hilton

      Re: Worth saying again......

      Damn skippy!

      "But, but, but - surely clouds just float around the sky? Did it dry up or something?" - an Executive Team, last week.

      Aaaaaand, that's why you always need someone who actually understands IT sitting on the board at every organisation that uses technology....

      Clouds: someone else's servers (TM)

      Your average user's comprehension of IT icon ----->

      1. John Brown (no body) Silver badge
        Facepalm

        Re: Worth saying again......

        And yet with massive (multi-)regional cloud failures of Office 365 and Google Docs, you'd think people might be a little more aware of it by now. But no, users just think of those incidents as "snow days". Rare but inevitable.

      2. Alan Brown Silver badge

        Re: Worth saying again......

        not to mention the twunts who think everything can be done on a windows desktop pc - and imagine places like OVH to be full of such things

    2. Anonymous Coward
      Anonymous Coward

      Re: Worth saying again......

      ... your data is in the cloud ... of smoke over there!

    3. SuperGeek

      Re: Worth saying again......

      It's the same with backups to HDD. People use just one drive, instead of double cloning.......

      1. Aus Tech

        Re: Worth saying again......

        "It's the same with backups to HDD. People use just one drive, instead of double cloning......."

        Speak for yourself. I use 4 drives, deployed to 3 different locations on the LAN, and I still worry if I have enough redundancy, even with a small but essential lot of files backed up to the cloud, that cannot be replaced otherwise.

  2. Ken Moorhouse Silver badge

    Smut

    When they've cleaned the hard drives they will reclaim a lot of space.

  3. Anonymous Coward
    Anonymous Coward

    Communication

    Nice to see that someone has the decency to try and keep people up to date.

    1. Muppet Boss
      Pint

      Re: Communication

      I like that he's showing hardware components used, like MBs (Super Micro X8DTU series), WD NVMe SSD in a SATA caddy, capacity data, racks layout and density: the sort of data that's very difficult to get from secretive cloud providers, something that High Scalability would be happy to see in their collection. Cheers for sharing this.

      1. Jon Massey

        Re: Communication

        X8DTU - wow that's a seriously vintage board!

  4. Dave Null

    This is very low-rent

    I don't know about you, but I would rather have my data hosted by someone NOT using fire-damaged servers, thank you. You wouldn't catch AWS, GCP or Azure doing this...

    1. tin 2

      Re: This is very low-rent

      I actually thought they were cleaning and getting stuff up and running for people who hadn't got any kind of resilience to have a hope of getting their stuff off?

      Although flip side, if they are cleaning so thoroughly.... is it a fire damaged server?

      1. quartzie

        Re: This is very low-rent

        Having first hand experience with smoke-besotted paraphernalia from a house that *didn't* burn down, I wouldn't power up anything that had been covered with the stuff for more than disaster recovery.

        Insurance providers know why they won't cover that stuff any more.

        1. Anonymous Coward
          Anonymous Coward

          Re: This is very low-rent

          Indeed, I don't think this is a viable strategy for anything other than a short-term data recovery before binning it. Smoke damaged kit is a write off. I wonder what input their insurers have had into this course of action.

          1. herberts ghost

            Re: Magnetic drives (except for He filled) have vent holes

            Except for helium filled drives, magnetic drives have vent holes to equalize the atmospheric pressure. If the drive comes up, replace it soon. Any prior reliability stats are void. Smoke can harm the lubricants used in the spindle and arm, and can also interact with the lubricants used on the disk surface. We won't even talk about heads.

        2. Alan Brown Silver badge

          Re: This is very low-rent

          The only thing which needs recovering is the content of the hard drives.

          The fact that they're cleaning these smoke-damaged systems means they intend to use them in production again

      2. Aus Tech

        Re: This is very low-rent

        "Although flip side, if they are cleaning so thoroughly.... is it a fire damaged server?"

        It's still a fire damaged server. They won't know if it will boot until they try, and if it boots, they won't know how much data is accessible and/or recoverable until they try to access the drives. The servers might be okay, if they've done a thorough clean up of them, but I wouldn't bet a pound, let alone my life on them being operational for an extended period.

    2. Anonymous Coward
      Anonymous Coward

      Re: This is very low-rent

      You make it sound like there isn't a CPU/component shortage all across the world and getting new stuff is easy. Sometimes you have to make do with the cards you are dealt.

      1. Anonymous Coward
        Anonymous Coward

        Re: This is very low-rent

        I get that chip supply is a problem, it's a very valid point, hence the up votes.

        The important thing is that it is OVH's problem. My problem would be failing over to my alternate (and not soot covered) data center in an alternate host. A related problem would be figuring out how to migrate to another provider ASAP. Both seem to be popular options these days.

        They may not be the best or the cheapest, but there are plenty of other clouds in the sky.

    3. Anonymous Coward
      Anonymous Coward

      Re: This is very low-rent

      Nope, because they would keep it a secret and you would never know

  5. Anonymous Coward
    Anonymous Coward

    MTBF

    I wonder how all this shit will affect MTBF, for the kit that has been cleaned.

    No perfect cleaning exists for IT kit after that kind of SNAFU.

    They may as well bin all the kit and install all new.

    1. Anonymous Coward
      Mushroom

      Re: MTBF

      We had an argument with our insurers after a building fire where smoke permeated through the Data Centre, leaving very obvious carbon on all the kit: Our insurers wanted us to have it cleaned and reuse - we pushed back asking for a warranty from them against any future early failures (and consequent losses) - at that point they decided to buy us all new kit. Shysters the lot of them.

      1. Anonymous Coward
        Anonymous Coward

        Re: MTBF

        Insurers are gamblers that don't like to lose....

        If you crash your car, and they pay out - does that mean you won the bet?

        1. My-Handle

          Re: MTBF

          Depends. If the bet was "I bet I won't crash my car", then no.

          If it was "I bet the insurance company will pay me if I crash my car - minus excess and anything else they can wrangle", then probably.

          At best, you're breaking even.

          1. RegGuy1 Silver badge

            Re: MTBF

            No.

            Breaking even is NOT a business model. It's an insurance company.

            You (collectively) lose.

            1. Anonymous Coward
              Anonymous Coward

              Insurance

              The bet is that you will lose much more (or catastrophically) if you don't have insurance than if you do. The carrier will always make you pay more than they do in the long run, and in the event of a car crash, short of outright fraud, you still have to survive the wreck.

              So yeah, insurance is all about picking options on how to lose in the way that bothers you the least.

          2. Missing Semicolon Silver badge

            Re: MTBF

            Worse than that, irrespective of the amount of legal insurance, NCD protection, or whatever, your base premium will always increase.

            They will get their money back.

    2. Anonymous Coward
      Anonymous Coward

      Re: MTBF

      It makes a difference for sure. We had similar smoke damage in our datacentres a few years back. We cleaned everything, but some of the kit failed within weeks. Particularly the systems that were running at the time of the fire and had fans sucking lots of smoke inside. More of the kit failed within a year or two afterwards. The primary reason for cleaning the kit should be to recover any data from it, not to put it back into long term service.

    3. Antron Argaiv Silver badge
      Alert

      Re: MTBF

      Yeah, those electronics are toast. Early failure predicted. Not only is the residue corrosive, it's also conductive. And it gets *everywhere*...even under BGAs and into connectors.

      I would think the only way to refurbish smoke exposed electronics would be a bath in some inert cleaning agent. I'd still expect decreased lifetime. Probably not a good strategy for a datacenter. But, hey, give it a try and maybe it'll prove my suspicions wrong, and we'll all learn something.

      1. Anonymous Coward
        Anonymous Coward

        Re: MTBF

        "But, hey, give it a try and maybe it'll prove my suspicions wrong, and we'll all learn something."

        Well, as someone who is impacted by this incident, I'd prefer to be safe than sorry **again** !

      2. SuperGeek

        Re: MTBF

        "I would think the only way to refurbish smoke exposed electronics would be a bath in some inert cleaning agent."

        A bath in contact cleaner, or isopropyl alcohol may work, they both evaporate quickly. You just need to watch out for certain plastics with IPA, as it can damage them.

        1. Eclectic Man Silver badge
          Joke

          Re: MTBF

          So putting them in the dishwasher on the 79 minute cycle is not recommended?

          What about that nice, safe, cleaning agent carbon-tetra-chloride, would that do?

      3. Alan Brown Silver badge

        Re: MTBF

        " the only way to refurbish smoke exposed electronics would be a bath in some inert cleaning agent. "

        Soapy water, ultrasonic cleaners, IPA baths, etc

        It takes time and rapidly costs more than just replacing the kit once cascade failures are factored in

        One of the BIG problems with insurers is when "loss adjusters" who aren't competent at their jobs end up in the position

        A classic example is a motor scooter I had 35 years ago that got knocked off its side stand and panels cracked. Ex factory they're dipped and treated so they stay the same colour for years.

        Bozo loss adjuster decided the damaged panels could be painted.

        They came back not matching the rest of the bike.

        ALL the panels were then repainted - and because of pearl coating's different behaviour in different light angles, adjacent panels didn't match because they'd all been painted different ways up

        They all went back again - and 3 panels came back damaged - the whole lot had to go back again to be repainted when the replacement panels showed up and they still didn't match

        this went around several times - at one point I got the machine back and rejected it after 3 weeks when the panels all started going different colours under sunlight exposure

        It ended up taking 7 months and costing 4 times as much as a new BIKE ($6000 in 1986) because the adjuster decided to save $400 and "his mate was a painting expert" - who had to repaint the same machine 6 times before he did an acceptable job

        The final result? EVERY SINGLE PANEL was replaced with factory new ones - the bike shop then managed to shatter several smaller ones whilst installing them only to find out they were out of production

        I really would've liked to be a fly on the wall of that insurer, but the loss adjuster was gone shortly afterwards

        1. fajensen

          Re: MTBF

          I had one loss adjuster arguing with me about the costs of replacing 30 cm's of leaky pipe, buried under my kitchen, that they had to take apart and re-assemble, to repair the pipe, versus replacing the entire pipe run of about 3 meters, because that was probably equally rotten!

          That useless guy was *genuinely shocked and appalled* when I paid the 70 EUR extra on the spot, out of my own pocket, rather than arguing the toss, and having the craftspeople drinking tea while doing it!

    4. fajensen

      Re: MTBF

      Maybe it doesn't matter that much. We were replacing perfectly good servers after 2 years of service because of the maximisation of the tax writeoff and because the new ones used less power for more performance.

      If they have a "burn rate" that leaves maybe a year on those server boards, it will, more or less, just be business as usual.

  6. Pascal Monett Silver badge

    OVH's experience might be useful

    It seems to me that this whole ordeal would be a good time to write the manual on fire recovery procedures in data centers. I'm sure there is one, but now OVH has first-hand experience.

    It would be really nice if OVH decided to share that experience other than by tweet.

    1. Gene Cash Silver badge

      Re: OVH's experience might be useful

      Yes, it'd be nice to find out how they're supposedly cleaning the equipment, with what solvents and what process. Especially 6 months down the road when it fails... or not.

      1. Victor Ludorum

        Re: OVH's experience might be useful

        Some people have been asking Octave on twitter about the cleaning methods. Apparently it's an outside company with a proprietary 'secret sauce'.

        1. Neil Barnes Silver badge

          Re: OVH's experience might be useful

          Yabbut, it's not very good: the evidence of those two pictures rather suggests that things shrink in the wash!

    2. Corporate Scum

      Re: OVH's experience might be useful

      Only those who have experienced it have the chance to do such a thing.

      It took YEARS for a definitive write up on the acoustic shock issues with whole room Halon systems despite decades of experience with post dump equipment failures. Why? People didn't do a proper post-mortem, or it got buried instead of sharing it with the outside world.

      As it turns out, there was an issue with the dump nozzles, which no one designed to operate at a reasonable pressure level. So the system was as loud as a bomb going off and would slam things like drive heads and other shock sensitive components.

      This fire will expose similar lessons, but they have to be understood, captured, and shared.

    3. Anonymous Coward
      Anonymous Coward

      Re: OVH's experience might be useful

      "It seems to me that this whole ordeal would be a good time to write the manual on fire recovery procedures in data centers. I'm sure there is one, but now OVH has first-hand experience."

      After a fire this massive, I'd go this way:

      - activate DRP to another DC (all in)

      - bin all the systems

      - thoroughly clean up the rooms

      - rebuild the shared infra (cabling, networks, power)

      - install new kits

      Frankly, given the cost, it would be better to write proactive manuals, rather than post mortem recovery ones. That would mean automated extinction, regular reviews by external consultants, procedures, etc ...

      "It would be really nice if OVH decided to share that experience other than by tweet."

      I would bet good money whatever post-mortem is performed, it will be top secret.

      So far, information about containers, lack of automatic extinction systems was only leaked to the French press by Police and Fire brigade sources. OVH stays tight lipped. Except for blaming the poor dude who serviced the UPS the day before.

  7. Anonymous Coward
    Anonymous Coward

    No word on what they're going to do to try and ensure it doesn't happen again...

    Then again, if there was it would probably be boilerplate "We're taking steps" BS so....

    1. Spiz

      To be fair

      You might want to give them just a teensy bit of time to sort this lot out first. I have no doubt that analysing what happened here and future mitigation is a big priority for them.

      1. Anonymous Coward
        Anonymous Coward

        Re: To be fair

        "You might want to give them just a teensy bit of time to sort this lot out first. I have no doubt that analysing what happened here and future mitigation is a big priority for them."

        Or not at all.

        There is no way, in any DC I've approached during the last 10 years, anything like this could happen, even if you're bringing in 10 liters of Molotov Cocktail inside one room (that is, if you could make it through the sas).

        OVH has been criminally negligent, here, and the only good news is no-one was harmed (could have been otherwise), and they are trying to spin it like it was bad luck. It was not.

        And FFS, this was no DC, but a pile of containers !

        1. Victor Ludorum

          Re: To be fair

          The 'container' data center wasn't damaged by the fire, it was the regular building which was built 10 years ago. I guess the relevant regulations and best practice have changed a fair bit in the intervening years.

          1. Alan Brown Silver badge

            Re: To be fair

            even ten years ago nobody in their right mind put UPS systems adjacent to working server farms (or even in the same building if avoidable) due to smoke damage risk

            It's not as if there haven't been a number of spectacular UPS fires to draw examples from

    2. Claverhouse Silver badge

      Your data is very important to us.

      1. SuperGeek

        "Your data is very important to us."

        All your data is belong to us, more like, in this day and age of incomprehensible T&C's.

  8. Anon
    Unhappy

    Oh, that sort of cleaning

    I was hoping they were going to clean the malware mongerers off.

  9. sbt
    Alert

    Serious question

    Is there a point at which it's not worth bothering to try to restore the old servers and data? If your systems are down for a couple of weeks, won't you be building replacements elsewhere from scratch?

    1. Anonymous Coward
      Anonymous Coward

      Re: Serious question

      If you have backups or sufficient data to rebuild your server then you'd have done that on day 1. If you don't then although it may not be ideal, you'd want to boot up that server when you can and recover any useful information off of it or at least zero it to ensure you are covered for GDPR, after all who knows where the drive will end up. If you've not got a clue then you are probably just going to wait a couple of weeks.

      You'd be amazed at the number of companies who have moved to VOIP and how much they moan when their internet connection goes down about business losses, for the sake of a simple DR plan.

      1. SuperGeek

        Re: Serious question

        "you'd want to boot up that server when you can and recover any useful information off of it or at least zero it to ensure you are covered for GDPR"

        Run it through an industrial shredder, that will get rid of the server and data securely! Om nom nom! Crunch!

  10. Kevin McMurtrie Silver badge

    Are they dry?

    I bought a vacuum pump and chamber for personal electronics projects because it's so hard to dry out a washed PCB. The fiberglass boards, flux under components, gaps between wire and insulation, and any sealants tend to be absorbent. Lingering water will pick up salts then conduct. Lingering solvents turn things to mush. I can bake a board to over 100C and it will still blow wet flux bubbles in a vacuum.

  11. Sgt_Oddball
    Flame

    And how long do you think...

    All the kit would take to arrive? They all appear to be non-standard servers (I mean the size of the boards in those chassis looks like either a custom job or small boutique servers).

    If the kit can work long enough to be replaced, that's fine but when you need everything up yesterday and no new kit to work with.... What's a BOFH/PFY combo to do?

    Some parts might be perfectly salvageable (like CPU's being covered in stuff to stop them cooking, and maybe the ram..) but anything with a capacitor soldered on can't be trusted.

    But right now, their pressing concern is just getting something, anything back up and working. It can get canned later once the new parts rock up.

    On the flipside, last time I saw a computer that soapy it was on an episode of Red Dwarf involving a certain mechanoid and it didn't end well.. C'est la vie as they say.

    1. Eclectic Man Silver badge

      Re: And how long do you think...

      Sgt_Oddball : "the size of the boards in those chassis looks like either a custom job or small boutique servers"

      Or could be some attempt at a new 'standard' that never got very far and is no longer manufactured making replacements difficult to find.

    2. Anonymous Coward
      Anonymous Coward

      Re: And how long do you think...

      They seem to have mastered the art of distraction. If they look busy, perhaps we won't notice their lack of an effective DC DRP.

  12. Kev99 Silver badge

    Once more, everyone.

    A cloud is just a bunch of holes held together with vapor.

    A net is just a bunch of holes held together with string.

    Anyone who is stupid enough to store anything more valuable than the recipe for a peanut butter sandwich deserves the pain caused by the failure or theft of a cloud or net.

    1. David 132 Silver badge
      Happy

      But what if it's, like, a really, really, good peanut-butter sandwich recipe? Like, ambrosia-of-the-Gods, secret-family-recipe-passed-down-through-the-generations good? Your guidance is confusing.

      1. Anonymous Coward
        Anonymous Coward

        Dark bread, not white bread

        Thinnest layer of actual butter on the bread to keep it from getting soggy

        Actual peanut butter, preferably the peanuts and salt kind that you have to stir up once in a while, not the kind with 18 ingredients and oil from 5 other plants. If peanuts don't need it neither does the peanut butter for this sandwich. Chunky or smooth is between you and your god.

        Jam or jelly of choice, but I'd either go for the classic Strawberry or a nice Marmalade.

        As an alternate, use honey and about half a sliced banana.

        Oh, wait, posting in a random news article's comments section counts as a "Cloud Backup" right?

        If not, the rest of the Post Pub Nosh lot may want to grouse about its heretical bread choices for old times' sake. Raise a glass for the old man as you get chased back by Spanish customs. He'd have done it for you.

  13. Anonymous Coward
    Anonymous Coward

    Usually the problem is the magic smoke gets let *out* of the electronics, not that it gets put in!
