Whoooooa, this node is on fire! Forget Ceph, try the forgotten OpenStack storage release 'Crispy'

Friday has arrived once again with a tale from the smouldering world of On Call. Today's remembrance comes from "Phil", who a few short years ago found himself supporting an unnamed public cloud vendor that decided to base its product on OpenStack Grizzly. It's safe to say that it wasn't a pleasant experience. "OpenStack," …

  1. Bronek Kozicki Silver badge
    Devil

    Back when OpenStack was launched with NASA, you literally ...

    ... had to be a rocket scientist to run it

    He wasn't trolling you, was he?

    1. Anonymous Coward

      Re: Back when OpenStack was launched with NASA, you literally ...

      Rocket Science...

      ...it's not brain surgery is it?

      1. chivo243 Silver badge

        Re: Back when OpenStack was launched with NASA, you literally ...

        According to McCoy in Spock's Brain, brain surgery is child's play... until he forgot

      2. lowjik

        Re: Back when OpenStack was launched with NASA, you literally ...

        Rocket surgery?

        1. A.P. Veening Silver badge

          Re: Back when OpenStack was launched with NASA, you literally ...

          "Rocket surgery?"

          "Would you care to operate on a torpedo?"

          1. The Oncoming Scorn Silver badge
            Pint

            Re: Back when OpenStack was launched with NASA, you literally ...

            Well they did in ST: The Undiscovered Country.

            1. A.P. Veening Silver badge

              Re: Back when OpenStack was launched with NASA, you literally ...

              "Well they did in ST: The Undiscovered Country."

              Why do you think I used quotation marks? The answer of Bones McCoy is even better (and I will leave that as an opportunity for the student ;) ).

          2. Anonymous Coward

            Re: Back when OpenStack was launched with NASA, you literally ...

            "Would you care to operate on a torpedo?"

            That's what *I* said. Then she slapped me...

        2. J. Cook Silver badge

          Re: Back when OpenStack was launched with NASA, you literally ...

          BRAIN SCIENCE!

      3. Terry 6 Silver badge
        Alien

        Re: Back when OpenStack was launched with NASA, you literally ...

        Somebody down voted that.

        I can not believe somebody down voted that.

        1. shedied

          Re: Back when OpenStack was launched with NASA, you literally ...

          I'd be tempted to call him a NASA-hol* if I knew who the downvoter was

        2. Psmo Silver badge
          Meh

          Re: Back when OpenStack was launched with NASA, you literally ...

          It seems that some get offended by silly single-entendres and are unable to recognise that there are some who occasionally appreciate a wallow in gutter-level humour.

          They may well want to shut down most comedy and stand-up shows too.

        3. Anonymous Coward

          Re: Back when OpenStack was launched with NASA, you literally ...

          Maybe his own little torpedo was prone to misfires?

    2. El blissett

      Re: Back when OpenStack was launched with NASA, you literally ...

      My uncle was permanently drunk from his early teens to his sad early end and still managed to fit in an unrealistically successful career in rocket science - software programming for ballistic missiles.

      1. Antron Argaiv Silver badge
        Mushroom

        Re: Back when OpenStack was launched with NASA, you literally ...

        I think if my job consisted of Assuring Mutual Destruction (at least a significant part of it), I might be doing a lot of drinking as well.

        1. Anonymous Coward

          Re: Back when OpenStack was launched with NASA, you literally ...

          That might explain why the office automation job I applied for at the end of my degree course started with "are you prepared to sign the official secrets act" and it quickly became obvious the interviewer was from the military end of the company.

          Finding out about my copious university bar time makes more sense than a degree almost designed to turn out weapon guidance system engineers!

          1. swm Silver badge

            Re: Back when OpenStack was launched with NASA, you literally ...

            I once asked my computer science class if it was ethical to write software for nuclear missiles. Most of the class said, "No!"

            Then I asked if it was more ethical to let poorer software developers write the software.

      2. juice Silver badge

        Re: Back when OpenStack was launched with NASA, you literally ...

        > My uncle was permanently drunk from his early teens to his sad early end and still managed to fit in an unrealistically successful career in rocket science - software programming for ballistic missiles.

        In a galaxy long ago, I was housesharing with someone who'd been writing software for the Eurofighter - something to do with the avionics, IIRC, twenty-something years on. But he'd quit and was slowly working his way down the alcohol food chain - I don't think he was quite at the two-litre cider bottles yet, but he wasn't far off. Equally, I'm not sure if he quit his job because he was an alcoholic, or if the job had pushed him in that direction.

        Either way, an odd chap, especially when you factor in that he had the "charm" gene cranked to 11; he went out pretty much every night and came back with a different lady each time.

        As a young and naive kid straight out of university, I missed most of the overtones. As an older and allegedly wiser person, I wish there had been a way to help steer him away from this spiral...

    3. smudge

      Re: Back when OpenStack was launched with NASA, you literally ...

      ... had to be a rocket scientist to run it

      "For every action, there is an equal and opposite reaction."

      Sounds about right :)

      1. KittenHuffer Silver badge

        Re: Back when OpenStack was launched with NASA, you literally ...

        Back when I worked for the NHS the saying was: "For every action, there's a manager who will stop it from happening"!

  2. Anonymous Coward

    "yeah, whenever" basis,"

    "With replacements from hardware suppliers on a "yeah, whenever" basis,"

    This made my day!

  3. Locky
    Flame

    Thermal Incident

    We had a message one morning from our CoLo DC team that one of our racks had discharged its non-Halon gas overnight.

    After a few calls to HP, the report came back for the warranty replacement kit as "To be fair, C7000 enclosures very rarely catch fire"

    1. jake Silver badge

      Re: Thermal Incident

      "To be fair, C7000 enclosures very rarely catch fire"

      Only true if you're not fully versed in the proper application of thermite.

      1. Aladdin Sane Silver badge

        Re: proper application of thermite

        Liberally and frequently?

        1. Saruman the White

          Re: proper application of thermite

          Both!

          1. Aladdin Sane Silver badge

            Re: proper application of thermite

            Remember kids, thermite and thermal paste are 2 very different things.

            1. big_D Silver badge
              Facepalm

              Re: proper application of thermite

              Now you tell me!

              1. David 132 Silver badge
                Happy

                Re: proper application of thermite

                Next you'll be trying to tell me there's a difference between thermite and Marmite.

                Fancy a sandwich?

                1. jake Silver badge

                  Re: proper application of thermite

                  I've always found that thermite makes for a dry sarnie. On the bright side, there's nothing that enough mustard can't fix (try mayo if you're tasteless, or French).

                2. Psmo Silver badge

                  Re: proper application of thermite

                  To be fair, the chances of me eating a sandwich made of either are the same.

                3. Chris King

                  Re: proper application of thermite

                  Some people hate Marmite so much, they'd probably opt for the Thermite if they had to choose between the two.

    2. big_D Silver badge

      Re: Thermal Incident

      HP kit is pretty resilient.

      I worked at one company that thought the ideal computer room was a top-floor, south-facing room with floor-to-ceiling windows and no AC! In summer, the first person in the building went into the computer room and opened the windows...

      When I started, the first thing I told the CEO was that we needed AC in the room, or to move the computers into the basement. Both were vetoed: the AC budget was exhausted because the CEO needed AC in his office, and the mirrored SQL Server was already in the basement (eggs in one basket and all that)... Plus, the servers had never had problems in the past.

      Yeah, because they were newer and not full of dust!

      I quickly put a thermometer in the middle of the rack. In winter, it was reading over 40°C, with an open window.

      Summer came and, lo and behold, the temperature in the space between the servers exceeded 60°C, but there was still stony silence on the need for AC... Until one of the financial servers went tits-up. I came in one morning to screaming fans (well, screaming a bit more than usual; I have the admin-gene and I could detect it!) :-D

      A quick status check and it was confirmed, the server was not responding. I forced the power off, pulled it out and waited... It eventually cooled down to under 40°C and I took the lid off, thick dust everywhere. With just a can of air, I sprayed the worst out and managed to get our external support company to come in on the weekend with an air-compressor and we went through the whole rack and cleaned all the servers.

      But even so, only one server crashed, even though the room temperature was over 40°C and the rack temperature was over 60°C. I quickly left the company and found another job. But that HP kit is tough!

      1. A____B

        Re: Thermal Incident

        Similar story in my past.

        Moved site to a "new" building (an old factory, gutted and with some newish desks). The secure server room was a windowless box made of bricks in the centre of the building.

        When we were planning the move I added up the power inputs of the servers, networking kit etc and asked for aircon to match (on the grounds of 1kW power in needs 1kW cooling).
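As a rough sketch of that sizing rule (every watt drawn ends up as heat that the aircon has to remove), the sum might look like the following. The device names and wattages are hypothetical and are not taken from the post; only the 1 kW in, 1 kW out rule comes from it.

```python
# Hypothetical inventory: nameplate power draw per device, in watts.
# These figures are illustrative only and are not from the original post.
EQUIPMENT_WATTS = {
    "design_server": 750,
    "database_server": 600,
    "mail_server": 450,
    "network_switches": 300,
}

# Standard unit conversion: 1 W is roughly 3.412 BTU/h.
WATTS_TO_BTU_PER_HOUR = 3.412


def cooling_load_watts(equipment):
    """Total heat to remove, assuming every watt drawn ends up as heat."""
    return sum(equipment.values())


def cooling_load_btu_per_hour(equipment):
    """The same load in BTU/h, the unit many aircon units are rated in."""
    return cooling_load_watts(equipment) * WATTS_TO_BTU_PER_HOUR


if __name__ == "__main__":
    watts = cooling_load_watts(EQUIPMENT_WATTS)
    btu = cooling_load_btu_per_hour(EQUIPMENT_WATTS)
    print(f"Heat load: {watts / 1000:.1f} kW")
    print(f"Cooling needed: {btu:.0f} BTU/h")
```

This deliberately ignores people, lighting, solar gain and any headroom for growth, so a real specification would likely come out higher.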

        A somewhat patronising refusal was given by the finance director [obviously a well respected architect/electrician in his spare time !!]. In his expert opinion, a big aircon was a luxury and a cost that couldn't be justified. Fortunately this was in a meeting and was duly recorded as part of the official minutes.

        Fast forward a couple of months to when we'd moved in, and the inevitable happened. The server room was overheating and the aircon was struggling away, noisily dripping algae-laden condensate (known locally by the charming nickname of 'elephant snot'). As others have reported in their experiences, the main engineering design servers, database server and e-mail systems were too hot to touch (and the threshold of pain is generally reckoned to be 60°C).

        Still no movement by the FD until... somehow in a server room rearrangement the finance server was moved under the aircon (wonder how that happened). After a few drips landed on it (and evaporated away quickly), it was shut down "as a safety preventative measure" (and logged in the safety incidents and risk registers) -- just a few days before the corporate quarterly return, a VAT return and a customer status report were due; strange how coincidences happen.

        The FD was a little upset but on entering "the oven" and seeing the server draped in green gunk did have the grace to admit that "perhaps we did need more air-con" and ask brazenly "how did you let it get into this state?" The old meeting minutes and printouts of his e-mails were presented to him in a folder, which just happened to be at hand. The presence of several witnesses was a great help.

        I believe that the purchase processing for the new aircon broke all company records.

        But yes, HP and Sun Microsystems [remember them?] did make some good kit that survived abuse.

        1. C Yates

          Re: Thermal Incident

          "The old meeting minutes and printouts of his e-mails were presented to him in a folder, which just happened to be at hand. The presence of several witnesses was a great help."

          Beautiful ;)

          1. Mark 85 Silver badge

            Re: Thermal Incident

            "The old meeting minutes and printouts of his e-mails were presented to him in a folder, which just happened to be at hand. The presence of several witnesses was a great help."

            Beautiful ;)

            Always know where the bodies are buried and have documentation. Hallmark of true BOFH.

          2. shedied

            Re: Thermal Incident

            Ah yes, the folder with all the precious minutes happened to be in the same case as the extinguisher, with the words BREAK GLASS IN CASE OF FIRE in bold type

        2. Anonymous Coward

          Re: Thermal Incident

          "The old meeting minutes and printouts of his e-mails were presented to him in a folder, which just happened to be at hand".

          Ah yes, the "I TOLD YOU SO" folder. That has saved my skin many times over the years.

          Remember: If you make a stupid decision, I WILL keep records, and I WILL make very sure that your words are carefully archived. I do this because I expect you will do the same against me anyway.

          1. Anonymous Coward

            Re: Thermal Incident

            Ah, but management sometimes do worse (especially to other, more junior managers). They'll archive comments out of context and bring them out with a nasty twist to the meaning, when required.

            I have had several examples pulled on me over the years.

            I once said to my line manager (respectively head and deputy head), while chatting in the staffroom, that I didn't think we should be micro-managing our highly skilled and professional specialist teaching teams, but that they should be self-managing and we should be setting and monitoring targets for performance etc.

            This was, several years later, pulled out in a performance review as my having said to her that I didn't believe in managing our staff. This despite (or, I'd hazard, because of) the fact that during that same year I'd managed to get rid, in a matter of weeks, of one very poor and incompetent teacher whom she'd failed to remove for many years. How? By setting and monitoring targets for performance! He couldn't meet them, and I could prove that they were a minimum professional standard. Whereas for years they'd micro-managed this specimen, checking every dot and comma of his work for a few weeks, which he duly and temporarily complied with. Pretty easy for him to do, since they'd told him exactly what to do almost hour by hour but never laid down any standards that he should achieve.

      2. Anonymous Coward

        Re: Thermal Incident

        A leisure swimming pool was building their new site and in planning we said "You need to put the server room on the roof space. You're next to the sea and all the water plant equipment is in the basement".

        As always, we were ignored. They put said room in the basement. A month later the kit in there was starting to rust because of the chlorine and sea air seeping into the room. It has been several years, and they are having to replace the kit regularly.

        If only they had listened they wouldn't be consistently pissing money away.

        1. jake Silver badge

          Re: Thermal Incident

          Corrosive air can elude the thought processes of the unwary ... In the mid-80s, I was working for a company that built gear to dynamically allocate bandwidth between voice and data.

          Incredibly Big Monster of a company started getting weird bit errors on their global T1 (E1, T3 etc ... ) network. I was assigned to track down the problem after lower level techs couldn't figure it out.

          Going thru' the data, I discovered that once the problem started occurring at any one site, it gradually became worse ... It was never bad enough to actually take down a connection, but network errors ramped up over time.

          Further review showed that the same team of installers had installed the gear at all the sites with the problem.

          I flew out to Boca and discovered that they had installed punch-down blocks in a janitor's closet ... directly over a mop bucket full of ammonia water. Seems it was the only wall space that was unused almost universally in such spaces.

          Blocks relocated and corroded wire replaced, no more bit-errors ...

      3. Josh 14

        Re: Thermal Incident

        I have a (somewhat) similar story as well, though from the cellular industry.

        Working in equipment installation and maintenance, my team went on site to facilities all over the region that we took care of. One of the sites had a single HVAC unit, with the whole site located on the south facing end of the maintenance floor, next to the elevator room on top of an office tower in the middle of downtown in that city.

        Of course the single point of failure HVAC chose to do just that one summer.

        When we got into the room, ambient inside temperatures were in the 120°F range, and all the coax insulation was melting and dripping off of the cables.

        The single HVAC was promptly replaced with a pair, and operated with a fail-over load balancing controller, which was also able to report failures...

        The disaster recovery of the broadcast equipment was an interesting mess in its own right.

      4. Dippywood

        Re: Thermal Incident

        Pray tell, why is it that, no matter where a box has been, when you de-dust it the smell is often of curry?

        1. jake Silver badge

          Re: Thermal Incident

          Here, they smell of horse. Even the ones in the machineroom/museum/mausoleum/morgue that supposedly breathe properly filtered air.

          1. J. Cook Silver badge

            Re: Thermal Incident

            And you will get a contact nicotine buzz from some of the older servers at [RedactedCo]- it's one of the last places where members of the public can smoke inside, and boy do they ever.

            1. jake Silver badge

              Re: Thermal Incident

              "you will get a contact nicotine buzz"

              No, I will not. I refuse to work on that kind of hazmat, and have done since I first started working on computers. The interior of a smoker's computer is the epitome of narsty ... Several people I know quit smoking when I pointed out that their lungs undoubtedly looked and smelled worse than the mess inside their computers.

              1. Chris King

                Re: Thermal Incident

                I've previously documented one machine that was sat on a carpet of cigarette ash. I had to work on that machine wearing a mask and thick latex gloves, and it was totally yellowed with all the nicotine.

                By "work on", I mean removing the hard disk and cleaning it outside, placing the rest of the machine in a thick bin bag, and sealing it up tight. "The fans are gunked, the processor is totally fried, and this kind of damage voids the warranty. You'll have to buy a new one and it's not coming out of our budget".

  4. Anonymous Coward

    Ahhhh

    Agile DevOps, why bother with a QA team? Don't you just love it.

    1. jake Silver badge

      Re: Ahhhh

      The theory I've heard from clients investigating DevOps is that QA is overhead, so who needs it? To which I respond "So is Janitorial." ... They usually get the message.

      1. Chris King
        IT Angle

        Re: Ahhhh

        Be careful what you wish for... That's right, Doodles !

  5. Pascal Monett Silver badge

    Just out of curiosity

    Could someone explain to me how it is that the motherboard was fried and the box was hot enough to cook a meal, yet the hard disks survived?

    How is that possible?

    1. Tom 38 Silver badge

      Re: Just out of curiosity

      Hard drives are (usually) at the front of a rack server, with fresh/cool air pulled over them by the fans at the back. The other gubbins is all at the back of the server, and heat mostly rises.

      That's assuming it was all in one server rather than the disks being in an external enclosure (I'm guessing it was, otherwise our hero would have just plugged the enclosure into the new server rather than moving disks.)

      1. Anonymous Coward

        Re: Just out of curiosity

        Also, the hottest part goes first usually (or the most delicate :P ). The RAID chips thus being the hottest part, took one for the team.

    2. Jou (Mxyzptlk) Bronze badge

      Re: Just out of curiosity

      Go back a few years and you meet half-height 3.5-inch drives (i.e. double the current standard height) with a normal operating temperature of 60°C when cooled, and 70°C when not. I remember those monster 4 GB SCSI/SCA drives. And those were the "modern silent" ones.

      Newer hard disks have a ton of sensors and adjust themselves to changing conditions, since a difference of a few degrees means the tracks have moved by the width of several tracks from where they were just a few minutes ago.

  6. Anonymous Coward
    Pirate

    OpenStack with just four people

    So when one of them's on vacation and trekking the mountains, and another is having a baby, and another is having an epic session on the booze, and the fourth is stuck on jury service. OK, unlikely four-way coincidence, but ...

    I hope they're not all in one workplace, where any lurgy is likely to spread and knock all of them out. The human equivalent of an overstuffed rack where the hardware fries.

    1. ArrZarr Silver badge
      Unhappy

      Based upon the rest of the working environment, I think we all know what the answer is.

      Personally, this is why I'm trying to get out of an on call role - On a quiet day, it's a nice windfall for being on call, but when you have the bad days and just want to go home and cry yourself to sleep, you don't get to.

      I recognise that's the reason for the big bucks but it just makes the bad days all the worse.

      1. big_D Silver badge

        I used to work for a software company selling systems to the meat processing industry.

        A stoppage of more than 15 minutes was a six-figure loss at even medium-sized companies. That meant 24/7 support (slaughter lines generally started work at midnight and finished around 8-10 in the morning).

        Getting a call for a stopped line at 3 in the morning, with less than 15 minutes to analyse the problem and get it working again, was pretty stressful. I'm glad I'm out of there and have no real on call any more.

    2. SVV Silver badge

      It's for something called "Adobe Advertising Cloud". I had no idea this existed, but had a quick look and it's as horrid as the combination of those three words sounds. After doing so, I'm hoping they are all in one workplace......

      1. Paul Greavy

        Thanks for checking into that for us.

        I think you took one for the team there.

  7. BinkyTheMagicPaperclip

    Not actually that surprised

    I admit I have no knowledge about OpenStack, but I have used RAID.

    RAID is a lot less fussy these days, but even in the 90s/early 2000s it was still possible to transplant a RAID array to another system and bring it up without much hassle, although it might have been necessary to ensure firmware levels were matched. If this RAID controller is decent enough to handle dozens of drives, it probably had better firmware than some of the lower level stuff that could still be coaxed to work.

    If it's software RAID rather than hardware, whilst I'm not the biggest fan of Linux RAID, all the RAID devices are GUID based and bay position does not matter.

    1. defiler Silver badge

      Re: Not actually that surprised

      This.

      I have performed some terrifyingly rude operations on Linux softRAID, switching drives around, failed SATA controller taking out half the disks and just forcing the damn thing to rebuild, changing the Superblock version of an unmounted RAID set, you name it. It just keeps taking the punches!

      A mate of mine used a lot of eBay hardware in an academic environment, and his warranty was basically a pile of servers in a cupboard. He used MD so that he could simply haul drives from one, shove them into another, and they'd always, always boot.

    2. vgrig_us

      Re: Not actually that surprised

      It's just the sign of an experienced ops guy - the only thing that surprises us is "everything just worked".

      Yeah, a RAID controller is supposed to handle a foreign config, but what if the battery backup on the fried one didn't work? What if the heat damaged the drives? What if the drive fw had a bug that only showed up on the new controller?

      All of that has happened to me, resulting in a complete reinstall (I f@#$g hate Exchange!).

    3. l8gravely

      Re: Not actually that surprised

      I've run Netapp kit for 20+ years now, first using them when they still used the DEC StorageWorks containers for 3.5" drives on the old FAS720 systems. Good days! Anyway, we had a ClearCase (version control software) VOBs database on the Netapps and the system crashed with two disk failures in the same RAID group. Back then, backups were to DLT7k tape drives, and would have taken days to restore, and the company was desperate to get things working again without losing data. This was using the Netapp RAID4ish WAFL layout before they went dual parity. We had had two disk failures close enough in a row to lead to data loss.

      Turns out that one drive had crashed the heads on the disk; you could hear it screaming and grinding. The other disk had just lost the on-disk controller board, fried somehow. So with Netapp support on the line, we ended up doing a disk-ectomy: taking the disk with the bad platters out of its StorageWorks container, pulling off its controller board, and putting that board onto the second disk.

      Plugged the now hopefully good disk back into the array, fired it up and damn if it didn't start serving data again and rebuilding onto a spare disk as fast as it could. I was a very happy guy to see that happen.

      1. J. Cook Silver badge
        Go

        Re: Not actually that surprised

        O.O Where I come from, that's known (affectionately) as pulling a Scotty, so named after the Miracle worker of star ship engineering himself.

    4. katrinab Silver badge
      Happy

      Re: Not actually that surprised

      I've moved a zfs pool to another computer a few times, and that always works without any problems, except for one case where it didn't pick up one of the drives due to a faulty cable. Being able to get from pile of computer bits to fully working system in about 10-15 minutes is nice.

  8. chivo243 Silver badge
    Devil

    "ticket after soul-destroying ticket."

    Aren't they all?

    1. ecofeco Silver badge

      Re: "ticket after soul-destroying ticket."

      Right?!

      1. The Oncoming Scorn Silver badge
        Alert

        Re: "ticket after soul-destroying ticket."

        You haven't met my user base, but most recently.......

        Told for months that they have to use the two factor authentication for the VPN, this becomes an issue*:

        Last thing Friday afternoon (2 hour reset process).

        Sometime Friday evening

        Anytime Saturday.

        Last thing Sunday (When something needs to be for first thing Monday).

        The conversation will always include:

        What RSA token\Instructions & similar e-mails?

        Oh, was that what that was about? I never set it up, I didn't think I was required to do anything with it!

        I forgot my easily remembered pin number.

        I can't connect, what am I doing wrong?

        *Especially if they have a recently replaced Windows 10 "On-Call laptop" & didn't bother logging in before leaving branch.

        1. Stevie Silver badge

          Re: two factor authentication

          Our place has set this up using a Microsoft product.

          It does not challenge on activating outlook.

          It does not challenge on remotely accessing the system.

          It does not challenge when connecting a phone to the mail system.

          It DOES challenge randomly about once every three weeks while I am in the middle of my turn at covering the servicenow tickets. If I don't respond on my cell phone (in my pocket and tangled with my keys etc) in thirty seconds it shuts down my email server connection and getting it to come back up is a journey of discovery involving clicking on links, randomly shutting down and restarting outlook and on one particularly desperate occasion clearing off a *very* busy desktop and rebooting the workstation.

          The wonderful chaos that ensues when my password needs changing is a thing of beauty too, as the email is configured to demand a change (by silently disconnecting from the server and hanging) in the middle of the day, whereas the network wants it doing at my convenience but nags me for two weeks. Password aging is the cowpat in the field of computer security. A bread and circuses approach that just makes for people gaming the password vetting algorithm and database.

          Where's the Tylenol?

  9. Smartypantz

    A bit off topic

    But a lesson to be learned is to stay away from those damned black-box RAID controllers! mdadm has saved my ass so many times, just about as many as Dell's damned PERC controllers have burned it ;-)

    1. JulieM Silver badge

      Re: A bit off topic

      Yeah, if you whip a drive out of an md RAID1 array and plug it into another motherboard, it just looks like a normal drive with filesystems on it, even if the partition types appear a bit suspect. Not so if you do the same with a drive out of a hardware RAID1 array.

      Been bitten that way exactly once, when a RAID controller packed up, and not touched another hardware RAID controller since. (It didn't help that it had previously tried to rebuild its array by copying a pre-emptively swapped in, fresh blank drive over the top of the good one ..... Yeah, that thing you joke about. Brown trousers time when it happens for real. I ended up dd'ing the contents of the removed drive onto another new one, like you're supposed not to have to do, so it wouldn't matter if it tried the same stunt.)

      You can also live-swap and grow software RAID1 systems, replacing both disks with bigger ones, with just one reboot (none if the boot drive is not part of the RAID).

      1. vgrig_us

        Re: A bit off topic

        Wait, what? You must've dealt with some really crappy hardware raid - my sympathies.

        I've pulled a RAID 1 drive out of a controller RAID and booted another server with it many times... In fact, that's one of the methods I used to use to install a two-node cluster.

        Nowadays it's easier to v2p from vm image or use proper deployment tools (bad for one's geek street cred, I think).

        1. John Brown (no body) Silver badge

          Re: A bit off topic

          "Wait, what? You must've dealt with some really crappy hardware raid - my sympathies."

          Well, yes, there were a lot of crappy RAID controllers out there back in the day. I was once sent to a Compaq server to replace a Compaq RAID controller. Sounds simple enough, but the caveat was firmware revisions. If the server BIOS was at the "wrong" revision level, the RAID controller firmware had to be upgraded from stock factory level before the server would even boot. Unfortunately, the server was a dev machine at the dev's house where he worked most days and of course the server BIOS revision level was at the "wrong" level. He had a couple of desktops, but they didn't have the right type of expansion slots (E-ISA? Microchannel?). Eventually I bit the bullet and did an enforced downgrade of the server BIOS, upgraded the RAID firmware, then put the server BIOS back to the original (and latest) BIOS. BIOS upgrades were sphincter-tightening in those days and downgrades were always a last resort as it might not boot up afterwards.

          1. vgrig_us

            Re: A bit off topic

            Yeah, I remember those - and you couldn't download the right version off the Compaq site either - just had to hope one of the CDs that came with each server had the right version. You couldn't even get to the BIOS without a bootable CD on those - circa 1999-2000 ProLiants.

        2. Smartypantz

          Re: A bit off topic

          With hardware RAID you have to trust the handful (if lucky, probably just one guy) who really understand the "firmware". If you are in real trouble, late at night, that is not a good place to be! I have forced individual sectors to "act" as good with the help of mdadm and assorted disk tools; I have rescued a RAID5 array by hand-picking bad sectors and forcing a rebuild!!...... When that shit mounted at boot and was R/W, it was the best moment of my professional life!!! Pheeeewi!! I could NEVER have done that shit with the crippled "firmware UI" from DELL inc!

  10. cdegroot

    "Yeah, whenever"

    I ran a small ISP in the early naughts that had a similar hardware replacement policy. Once, one of our switches broke down - one of these expensive 24 port 19" Cisco thingies. I realized that a) we didn't use all 24 ports, b) we didn't yet use any of its management facilities beyond basic port monitoring, so c) I yanked the cables from the 12 port no-name switch on my home office desk, hopped in the car, swapped the switches (the no-name one wasn't rack-mountable but luckily had magnetic feet so I just attached it to the side of the rack enclosure) and went to bed. Next day, dropped off the Cisco at the office asking my admin to send it in for a warranty repair and picked up a fresh no-name switch for at home at the local PC store.

    A year or so later, I noticed a box with a Cisco brand on it in our office. The admin had forgotten to tell me that the repaired switch had arrived (a mere week or so after the incident) and I forgot that our little ISP was still running an important chunk of traffic on a cheap no-name switch...

    If I ever get dumb enough to start another company that actually has to spend $$$,$$$ on hardware, I'll make sure I have something better than a "yeah, whenever" replacement policy in place :) (I won't. Ever. The smell of data center in your clothes after another 12 hour shift standing behind a tray-mounted keyboard still makes me sick)

  11. J. Cook Silver badge

    I've gotten pretty lucky here at [RedactedCo]- The company has zero issues with throwing Serious Money at hardware and support contracts, and we also have things engineered for redundancy and HA where possible.

    Even so, I've had to pull out the Techno-necromancer's kit a few times before we got to where we currently are. Swapping CPUs around between slightly similar models of PowerEdge 2950s to resurrect a failed server that blew its mainboard during a physical hardware move was interesting, but not as fun and exciting as virtualizing one of the last SQL servers that was in a cluster before the shared disk packed it in, which was slightly hairy.

  12. tygrus.au

    Sometimes like a toddler

    Some computer systems remind me of a toddler: immature, stubborn, repetitive, illogical, hard to calm down...

    Me: What do you mean you want "the pink one", this IS pink!

    Toddler: I want the other pink one!

    Me: If you wore it yesterday, it's in the wash

    Toddler: But I want the pink one!

    Me: It's in the wash, how about this pink shirt

    Toddler: I want the other pink one!

    Me: I give up, I'll get the one from yesterday with the ice cream stain on it.

    1. The Oncoming Scorn Silver badge
      Pint

      Re: Sometimes like a toddler

      Partner's laptop on Thursday, just before a trip (Short version - I tried a few combinations & other tricks during the tantrum).

      OK let's stick a nice 120GB SSD in you & a fresh Windows 10 install & bump up the memory to 8GB from 6GB.

      Don't wanna boot - BSOD!

      Put in original spinning rust - Don't wanna boot - BSOD!

      Refit SSD & original 6GB - OK I'll boot & start installing..... OK I'm installed.

      Bump up memory - Don't wanna boot - BSOD!

      Refit original memory - Don't wanna boot - I WANNA REPAIR NOW!

      Repair (That took longer than a fresh install) - OK then apply usual tweaks, enhancements, software & Office 2019 Pro, all working.

      Time to make backup recovery image on second partition - I don't wanna let you boot from USB or get back into the BIOS to boot off USB (Despite setting the boot menu up earlier).

      Oh fuck you then - I'm going for a beer!

      I'll make an image on its return........

    2. Stevie Silver badge

      Re: Sometimes like a toddler

      Luckily, chances are that if the pink one is in the wash, everything is pink now.

  13. ecofeco Silver badge

    Too true.

  14. Maventi
    Coat

    Two days on and still no Cinder jokes? Missed opportunity you folks (and El Reg)!

  15. Terry 6 Silver badge

    Doesn't seem to matter the industry or field

    Or indeed whether it's public or private.

    There seems to have grown up a general managerial attitude that anyone who knows what they're talking about is an adversary - or more to the point is being adversarial- and has to be resisted.

    Any budget proposed is obviously, therefore, assumed to be inflated.

    Any time scale estimated must be too long.

    Any risk noted too fussy.

    Any scheme of work too elaborate.

    So the cost estimates that go forward are made optimistic to the point of fantasy.

    The time scales are inadequate to even get the preparatory work done.

    Several key components will prove untenable due to unplanned-for problems.

    And the approach (read shortcuts) taken to perform the task will end with some major functions having to be omitted - or postponed to some future horizon.

    And this holds whether it's a computer upgrade programme or building a new school.

Biting the hand that feeds IT © 1998–2020