Late night server rebuild led to 'nightmares about mutilated corpses'

It's Friday, your correspondent is back from summer holidays and it is therefore once again time to welcome you to On-Call, our regular reader-written tales of things that went bump when off-site. This week, reader “RP” tells us of the time he was asked to fix a server just as he was about to knock off for the day. And not any …

  1. Paul Crawford Silver badge
    Gimp

    Fool!

    They should have gone for good pr0n. Much nicer to keep dreaming about for months afterwards :)

    OK, maybe not the S&M sort =>

  2. Alister

    It's Friday, your correspondent is back from summer holidays

    Wait! What??

    Is this a rehashed article, or are you an antipodean?

    1. frank ly

      Simon is a big feather on the Australian wing of El Reg.

      1. Dan 55 Silver badge
        Paris Hilton

        Apart from the whole wrongness of Christmas being in summer in the southern hemisphere, what happens in countries at the equator? Does the northern half of the country call it summer and the southern half winter?

        1. Thecowking

          At the equator you tend to get either wet or dry seasons. Sometimes called monsoon and not monsoon.

          The more temperate seasons take a pass in the tropics.

          1. Stuart Elliott

            Obligatory H2G2 reference.

            Is that how you get "tea and no tea" ?

        2. Ken 16 Silver badge
          Coat

          Hence the poignant lyric

          "Do they know it's Christmas at all?"

          1. I. Aproveofitspendingonspecificprojects
            Paris Hilton

            Re: Hence the poignant lyric WTF?

            How poignant FFS!!! FO!!

            How poignant is a sentence that asks nobody about a nothingness "at all"?

            You ask: "Can the "they" in question have a partial realisation?"

            Something of the order of a borked hard drive do you mean?

            So how poignant is a borked hard drive about a non Christian episode of pagan misappropriation, on a scale of one to one hundred?

            (Damn! Now I am wondering about Existentialism vs Nihilism. Fuck! I have managed to go through the whole week without any reference to that sop David Bowie. Shit!!!)

            Bloody hell swearing, massed exclamations and two Paris dick-heads in one session!!!!!!! Time I went back to bed.

            1. Dan 55 Silver badge

              Re: Hence the poignant lyric WTF?

              And thus we conclude that PUI is never a good idea.

    2. dotdavid

      "Is this a rehashed article, or are you an antipodean?"

      I thought El Reg's leave policy was unusually generous.

  3. Bob H

    Self-inflicted injury, no sympathy.

    I actually started my career in the control room of a large satellite communications operator, there you get to see horrific and pornographic things on a regular basis.

  4. Lozzer292
    Coat

    Who should I cover?

    How about Micro$haft?

    1. J. R. Hartley

      Re: Who should I cover?

      The 90's called, and so forth.

      1. allthecoolshortnamesweretaken

        Re: Who should I cover?

        http://imgs.xkcd.com/comics/2009_called.png

      2. Anonymous Coward
        Anonymous Coward

        Heavy Meta

        @ J. R. Hartley; "The 90's called"

        Irony of ironies... did they ask for their "the [decade] called, they want their [something] back" meme back? (^_^)

    2. PCar

      Re: Who should I cover?

      Ashton Tate & dBase, MicroPro & Wordstar?

      1. John Brown (no body) Silver badge

        Re: Who should I cover?

        Borland

        Avalon Hill

  5. AndrueC Silver badge
    Thumb Up

    Fixing stuff at day's end is risky. I have a personal rule never to push source to the repository after 3pm unless it's to my own branch. Working for several years with Americans on the same team taught me that 'push and go home' was a bad idea. Now I always give myself a couple of hours to deal with issues.

    CI servers help but they don't always catch everything. Especially if (as so often seems to be the case) you're working on a legacy project with minimal to no unit tests.

    And of course with legacy apps you're nearly always working where there is poor unit test coverage. You wouldn't be working there if it was well covered :-/

  6. Anonymous South African Coward Bronze badge

    My motto is never to push out anything on a Thursday or Friday, especially patches or the such to servers. Or do last-minute "adjusting" at COB... that will come back to haunt you big-time!

    That can wait for a Monday. Not in the mood to have my weekend ruined by some arb service going down just because a patch did something naughty and kicked Windows in the ganoonies when it should not have.

    1. Boothy

      Them: Please provide technical approval for this production change.

      Me: When is it for?

      Them: Friday afternoon.

      Me: That's a no then, re-plan it for Mon, Tue or Wed next week, and for the morning. Bye.
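The rule in the exchange above can be written down as a simple check. This is only a minimal sketch: the function name `change_window_ok` and the assumption that "morning" means before 12:00 are illustrative, not anything the commenters specified.

```python
from datetime import datetime

# Assumed policy, paraphrasing the rule above: production changes only on
# Monday, Tuesday or Wednesday, and only in the morning.
ALLOWED_WEEKDAYS = {0, 1, 2}  # datetime.weekday(): Monday is 0
MORNING_ENDS_AT_HOUR = 12     # assumption: "morning" means before noon


def change_window_ok(when: datetime) -> bool:
    """Return True if `when` falls inside the allowed change window."""
    return when.weekday() in ALLOWED_WEEKDAYS and when.hour < MORNING_ENDS_AT_HOUR


if __name__ == "__main__":
    friday_afternoon = datetime(2016, 1, 15, 16, 30)  # a Friday
    tuesday_morning = datetime(2016, 1, 19, 9, 0)     # a Tuesday
    print(change_window_ok(friday_afternoon))
    print(change_window_ok(tuesday_morning))
```

A check like this could sit in front of a change-approval form or a deployment script, with month end and other critical dates added as further exclusions.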

      1. Super Fast Jellyfish

        Users

        You do realise you are there to support your business not get in the way?

        So yes, don't sign off changes at month end or other critical times but otherwise; smile and ask if the project is paying for overnight/weekend overtime.

        1. Anonymous Coward
          Anonymous Coward

          Re: Users

          Stuff that! You might be willing to trade every weekend for money, but others of us have more important things to do. Like having a life, and going to the pub.

  7. Anonymous Coward
    Anonymous Coward

    flashbacks

    Late night server rebuild led to 'nightmares about mutilated corpses'

    Yeah, I've had jobs like that: the bodies, my gods the bodies... <sob>

    Still, if you will use Novell Netware, that's what you get...

    1. I ain't Spartacus Gold badge
      Devil

      Re: flashbacks

      This is why you can never cancel Trident. It's the ultimate deterrent to anyone who suggests installing Lotus Notes...

  8. Yugguy

    Vines

    I am actually Vines certified.

    I used it back in 1996.

    It was actually quite good.

    1. Peter Simpson 1
      Happy

      Re: Vines

      The founder of Banyan used to be my grand-boss. When I was designing comms hardware at Data General, Dave Mahoney ran the group.

  9. TeeCee Gold badge

    Late times.

    One evening I sat down to rejig the disks on a Novell server. It was a PS/2 Model 80 and I was pulling the old ESDI disks in favour of SCSIs. One slight snag was that it already had one SCSI disk and this was remaining in place, meaning that part of the old disk set was being trashed. So, no way back.

    Backup was courtesy of Arcserve and an 8mm helical scan tape drive that had been playing silly buggers. I had that day's backup (which had succeeded) and I took two more full backups for that belt, braces and superglue feeling. As an afterthought I printed off the entire disk tree of the server.

    New disks, format, install and up. Arcserve. Backup #1 goes titsup halfway through. Backup #2 goes titsup 2/3 of the way through. Backup #3 goes titsup 1/4 of the way through. Shit. Cold sweat.

    I have the disk tree. Tape is sequential. Helical scan drives allow you to jump to the bit you want. It must be possible to get everything off this heap of crap and back onto the server. If it turns out that all three are corrupt at the same point anywhere, then someone up there's got it in for me.

    At this point I discover a little-known bug in Arcserve that means that, when trying to restore a specific directory, the "recurse" option doesn't. Bugger. That'll be me restoring individually each and every directory on The. Entire. Fucking. Tree. One. By. One. It was a really, really big directory tree too. The bean counters were addicted to nested levels of shit that went down further than the turtles. They'd replicated the entire group's Nominal Ledger account structure as a directory tree, with a Lotus 123 spreadsheet in each directory...............and cloned that setup each year.

    I wrap up at about 7:30 the next morning, just in time to see the lead analyst walk in:

    "Morning TeeCee, you're early!"

    "No, I'm seriously bloody late.........."

    Highlight later was presenting the overtime bill to the Big Boss, accompanied by my (previously rejected) request for a new 4mm DAT drive, the latter being rather less than the former. It went through this time.

    1. Anonymous Coward
      Anonymous Coward

      Re: Late times.

      Engineer - "One day all of this will go TITSUP unless we spend some money"

      Boss - "I'm not spending money on unnecessary IT like backups, DR, UPS, resilient hardware and new servers..."

      LATER....

      Engineer - "It's all died and there is no way to get it back...."

      Boss - "Take all the money you need and for fuck sake work a miracle..."

      1. CrazyOldCatMan Silver badge

        Re: Late times.

        > Boss - "I'm not spending money on unnecessary IT like backups, DR, UPS, resilient hardware and new servers..."

        > Engineer - "It's all died and there is no way to get it back...."

        Sounds like my last place. Although the last line was very different:

        Boss - "Why the hell didn't you tell me we were so vulnerable?"

        Me - "I did. And here is the email trail. And here is your response"

        Boss - "Oh".

        Me: "Bye-bye.."

        1. Anonymous Coward
          Anonymous Coward

          Re: Late times.

          I have on several occasions dismissed customers because I knew they were heading down a path to destruction and refused to be sitting on that powder keg when it went off. The last thing I want is my name all over a shit-splattering. If a customer is unwilling to take his or her business seriously, then why the fuck should I?

          Except for one customer. There is a reason, and never let friendship influence your business decisions. I put in extra hours and worried in over-time for this customer, even though my recommendations were regularly dismissed, and often even my demands were ignored or just promised and never fulfilled. Until one weekend when a very expensive storage device went down -- for the second weekend in a row. This time in an injurious manner. The following week was tons of pissed off people wanting to know what was going on, when it was going to be fixed, and little concern for the fact that by Thursday I had a grand total of 12 hours of sleep and this entire situation was avoidable.

          That was my breaking point. I made a demand: no excuse, no bullshit, let me build this system correctly or piss off, because my job is to make sure shit runs, not absorb your stress waiting for it to fail and take your abuse when it does. Do it, or I walk, and I promise I will let every consultant I know for a hundred miles around know why I left.

          Almost 12 months later my demand has not been met, they still call and I get to them when I feel like it, if I feel like it. Surprisingly, they have not tried to bring in a replacement, even though I have made it clear I start a new contract in April and they are not on my extended list for support.

      2. AndrueC Silver badge
        Thumb Up

        Re: Late times.

        Boss - "Take all the money you need and for fuck sake work a miracle..."

        Been there, done that. In a former life I was a data recovery engineer ;)

      3. Fatman
        Joke

        Re: Late times.

        You forgot:

        Engineer: "Here is my immediate letter of resignation."

        Boss: "What the fuck?? WHY?????"

        Engineer: "I have had it to here (points to top of head) with your cheap ass!!! Goodbye!!!" (Leaves room, walks to desk, and clears out.)

        Boss: "What the fuck am I going to do now?"

        1. onceuponatime

          Re: Late times.

          Last job I fought for 4 years to upgrade hardware. Just before I left they interviewed and hired a "network administrator to help me"; all the developers and management were included in this process, but I was not. I left a few days later. So this "network admin" comes in, builds a new DC and doesn't wait for it to replicate before killing (not demoting, killing) the old one.

          You can imagine what happens next.

      4. Griffo

        Re: Late times.

        Some time around 2005, I visited a brand new client of the consultancy I was working for. This customer was an architect firm, doing significant projects - I'd estimate that they had around 200 architects, designers, engineers and other chargeable consultants working in the firm. Picture a significant open office environment with tons of CAD machines interspersed with large scale models of resort developments and the like.

        As part of the onboarding of a new customer, we audited all the basics of the environment to look for anything drastically in need of some TLC.

        This particular firm had an "interesting" backup process. The main server ran a single mirrored set of drives. The backup process that had been designed and architected by the owner of the company, and self-certified IT expert, involved removing one of the mirrored drives each night and taking it home. Yep, each and every day he'd degrade the RAID set, and each morning plug the drive back in and wait for it to re-sync.

        After carefully explaining to the owner why this was such a bad idea, he agreed to implement a proper backup system. I went away and priced up a relatively modest solution comprising a single tape drive, controller, backup software etc. Probably about $5k or so of gear + labour. Remember this guy had probably 200 architects, drafters etc working on multi-million dollar development projects.

        Well, it seems he thought that $5k was FAR too much money for backup and refused to go ahead. So we made the call and effectively sacked him as a client.

        Fast forward 6 months to one very very very stressed sounding business owner begging us to come and try to fix his server which had shat itself during one morning's drive rebuild. His team of architects, engineers etc were basically sitting around doing no work, and he had lost absolutely every piece of information they had ever created.

        We should have helped, but... I was feeling spiteful that day. I asked him if a $5k backup system seemed cheap now, and hung up.

      5. Anonymous Coward
        Anonymous Coward

        UPS power conditioning

        I was managing a mainframe site in East Anglia. The mains supply could best be described as 'dodgy', not ideal when you have 7 mainframes in a data centre. The situation had been getting progressively worse and we were experiencing disk failures regularly because of the variation in current / frequency. I'd been begging for funding for a data centre UPS, had done the research with the big guys in the industry and had come up with an £80,000 business case. It wasn't just rejected, I was literally laughed out of the management meeting.

        A month later we had a catastrophic mains incident where the power dropped out and came back on several times in a couple of minutes. By the time I got from my desk into the data centre and hit the emergency power off, the machines had lost and regained power half a dozen times. The ops shift leader was stood in the middle of the room, mesmerised by the flashing lights.

        I finally got the go-ahead to procure a UPS from my budget but was told the procurement would have to be managed by the electrical engineer in the Property Department as he was 'the expert'. They came back crowing about how stupid I had been, how oversized the UPS I had been conned into looking at was, and how they had saved £20,000. The install of the UPS went well but required a downtime day while the PDUs were rewired to the new UPS - no bypass switch for us. Sure enough, mains was restored and we started the IPL sequences. About 15 minutes in, the UPS tripped out, we lost power, then it tripped in again, then I managed to hit the EPO button; once again everyone else seemed in a stupor. In the end we had to wait for the PDUs to be put back to raw mains before we could start IPLing the mainframes and work out how much damage had been done, then place the emergency calls to hardware maintainers and spend the rest of the night working with engineers to restore services for Monday morning.

        End result? A crisis meeting with the property team, acknowledgement that they had undersized the UPS, and a statement that they wouldn't take the hit as I obviously had a bigger budget. Their suggestion was that I start up the machines in sequence, which would have meant a DC restart would have taken over 7 hours rather than the normal 2. In the end I spent almost twice my original budget to get the larger UPS I had specified in the first place, with bypass switch and more batteries to give us the chance to shut the machines down cleanly.

    2. Disko
      Thumb Up

      Re: Late times.

      "that belt, braces and superglue feeling"... that is a great description of the kind of Murphy-defying dedication to (pre)caution needed when dealing with other people's antiques...Made my afternoon!

  10. Disko
    Coat

    "Let's get on the Internet"

    Aaaaaand your life is ruined.

    Mine's the one with the etherkiller in the left pocket...

  11. Amorous Cowherder
    Devil

    I distinctly remember myself and a lot of mates who played the original DOOM when it came out, started having very worrying dreams. Nothing horrendous but it was such a novelty at the time so we played it a lot, we even played it at work after hours. Despite it being very pixelated, your mind tends to smooth images and fill in the gaps and it did turn a lot of us off gaming for quite a while. The mind is a very sensitive machine and should be looked after carefully.

    1. Loyal Commenter Silver badge

      I'd recommend against playing Dead Space then. If Doom got to you, that will have you waking up at 3am screaming like a small child.

      1. This post has been deleted by its author

      2. Dan 55 Silver badge

        Dead Space for the Wii, if you can find it, has to be one of the most atmospheric on-rails shooters there is, which unfortunately didn't sell because it wasn't for the PS360. To be played in the dark with a zapper gun for maximum effect.

    2. I ain't Spartacus Gold badge
      Happy

      Ah Doom. I know that it wasn't all that good, but it was so much of a leap on what we'd had before. Add in my first go on my brother's brand new 33MHz 486 DX and SVGA graphics (swish bastard! my 386 was dead to me now) and his Creative soundcard and 2.1 speakers, well this was the best technology ever! My first time using a subwoofer too.

      I guess being in a house I didn't know, and having not turned the lights on helped. The first thing I noticed was the lovely satisfying sub-woofery boom, as I decapitated something nasty with my shotgun.

      The next thing was the sound of a door opening. Behind me! And something stealthily creeping up! Rather than using the keys, I physically turned round, and was thus not in a position to avoid getting my character eaten by a giant pig-creature.

      Happy days. When you could fit a top of the line game, plus Windows, on a 40MB hard disk.

    3. Old Handle
      Devil

      That bit, think it's in one of the earlier levels, where you're going through cramped winding corridors with demons growling around you... Pretty sure it fueled at least one nightmare.

    4. Adam 1
      1. Rampant Spaniel

        Iddqd ftw

    5. Anonymous Coward
      Anonymous Coward

      Why I avoid games..

      'I distinctly remember myself and a lot of mates who played the original DOOM when it came out, started having very worrying dreams.'

      Dreams?, can beat you there..I had the misfortune to be working in London at the time the game came out, played it waay too much, realised I may be overdoing it a wee bit when wandering around Holborn one lunchtime, I kept seeing the shotgun appear every time I saw those suited Thatcherite bastards appear in the crowds...

      A few years back, got heavily into Black on the XBox, gave that up as it was getting a bit disconcerting to start seeing 'little red dots' appear on various people's heads both on my way to, and, more predominately, at work.

    6. This post has been deleted by its author

  12. Anonymous Coward
    Anonymous Coward

    Now this reminds me of a job that a manufacturer's representative told me about. They'd been requested to try and recover some data from a server which they had supplied. The owner for some bizarre reason had some software which looked for files which hadn't been used for a while and deleted them to increase available space.

    Why such software exists and why you'd ever use it I have no idea, however they spent large amounts of money trying to recover the data which they'd been told wasn't recoverable.

    1. Crazy Operations Guy

      Worked with some software like that (boss bought it because he wanted to bang the sales lady). The software at one point went through and decided "No one has touched this /etc directory in a long time, must not be important". It also clobbered the /backup directory I had created on the file server (since it had the same 'last modified' time as the /etc directory). Spent all night with duplicates of that disk trying to rebuild fstab (the system had 40-odd mount points, most of which had similar filesystem contents) and its Sendmail config...

      Lucky for me, it wasn't too long after that happened that the sales lady finally, and completely, shot down my boss, so he became embittered and wanted any software that she sold to be deleted (he was a very emotional guy. The type that would buy an attractive woman a house if she went on a date with him).

      1. Doctor Syntax Silver badge

        "boss bought it because he wanted to bang the sales lady"

        Maybe someone should have told him buying it was OK but don't install it.

    2. David Roberts
      Windows

      Delete unused files?

      Well, yes, that used to be standard practice when men were men and storage was expensive.

      However it was customary to back the files up to tape first (low level file store) so that they could be restored if/when required. Obviously you had to keep track of which files you had removed, and where you had put them.

      Clever systems such as George 3 managed it all automatically, parking a job whilst the operator loaded one or more tapes to bring the files back to disc. Ah, the joys of Tape To Tape Processing where you copied all the files which hadn't been deleted (that is, no longer required ever) over to a new tape and left all the trash behind.

      1. Number6

        Re: Delete unused files?

        It makes you wonder about the de-duplication technology. Sometimes having made a temporary copy last week is the one thing that saves the day when everything else goes wrong. If it was de-duplicated then trashing those sectors on the physical media would take out all copies.

        Or is that not how it works?
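A toy model makes the worry above concrete. This is a minimal sketch of content-addressed de-duplication in general, not of any particular product; the class `DedupStore` and the file names are hypothetical. Two logical files with identical content end up pointing at one stored block, so damage to that block hits both names.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blocks = {}  # digest -> the single physical copy of the bytes
        self.files = {}   # file name -> digest of its content

    def write(self, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # store only if not already present
        self.files[name] = digest

    def read(self, name: str) -> bytes:
        return self.blocks[self.files[name]]

    def corrupt_block_of(self, name: str) -> None:
        """Simulate damage to the physical block backing `name`."""
        del self.blocks[self.files[name]]


store = DedupStore()
store.write("ledger.wk1", b"nominal ledger 1996")
store.write("ledger-temp-copy.wk1", b"nominal ledger 1996")  # the "temporary copy"
```

After the two writes, `store.blocks` holds a single entry. Calling `store.corrupt_block_of("ledger.wk1")` makes `store.read("ledger-temp-copy.wk1")` fail as well, which is why a copy on the same de-duplicated store is not an independent backup.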

        1. Adam 1

          Re: Delete unused files?

          If you are deduping multiple backups together or together with production then "you're doing it wrong".

      2. Anonymous Coward
        Anonymous Coward

        Re: Delete unused files?

        VME was even cleverer, as the File Management System option would even go through the old tapes and re-merge them without expired files. To do this you would of course ideally need 3 tape decks, but a minimum of 2. It was a technical recommendation never to sell a system with fewer than 2 decks, for minimal resilience. I lost count of the number of customer calls I took where they couldn't run the optimisation process because they only had 1 (normally slow and low-cost) tape drive. There was bugger all commission for the sales guys on tape decks as they were a necessity, so some fly-by-nights would sell the FMS option (big commission) without the capacity to back up all the data. They were driving Porsches; I was driving an Astra. They had normally taken their commission and left by the time this was discovered, and the customer was usually refunded the cost of the FMS licence but left with a huge risk of the tape deck failing.

    3. Alan Newbury

      We had a sysadmin like that - if a file hadn't been accessed for 3 months he'd rename it. If no-one complained after a further 3 months it was deleted (personal data files only, no .exes or source code).
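The first half of that policy, finding files nobody has accessed for three months, might look something like the sketch below. It is a dry run that only reports candidates and renames or deletes nothing. The function name `find_stale_files` and the 90-day approximation of "3 months" are assumptions for illustration. It also assumes last-access times are meaningful, which is not the case on filesystems mounted with options such as noatime.

```python
import os
import time
from typing import List, Optional

SECONDS_PER_DAY = 86400
STALE_AFTER_DAYS = 90  # assumption: "3 months" approximated as 90 days


def find_stale_files(root: str, now: Optional[float] = None) -> List[str]:
    """Return paths under `root` whose last-access time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - STALE_AFTER_DAYS * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # st_atime is the last-access timestamp recorded by the filesystem
            if os.stat(path).st_atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Reporting first and acting later mirrors the rename-then-wait step in the comment: the list gives people a chance to complain before anything is lost.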

  13. Alistair
    Windows

    Backups

    Security dude, long time ago decided that he was gonna be smart and make his backups "safe" and "unreadable by hackers".

    backed up the /dev/VGNAME/LVNAME objects. - the file system was reiser.

    Secdude, on phone, 4am: " Hey, we need to restore our working repository on $SERVERNAME, can you help us out?"

    -- 45 minutes later I figure out what the twat had done, and point out to him that a restore will be destructive and that the likelihood of recovery was damn near zero, since the backup of the device was done during working hours, and ran over 45 minutes, and had a 'warning' on completion about changed states while backing up. Ended up doing a btree rebuild and found about 90 to 95% of the data still on the drive. Strangely he departed about 2 months later ...

  14. Anonymous South African Coward Bronze badge

    In the heady days of Novell 3.12, DOS 5.xx and POS software that ran on DOS...

    ...client of mine got a new server with a VLB IDE controller card (ooh, fancy, fancy), dual IDE HDD's and, of course, Novell 3.12

    VLB was supposed to be faster with SFTII setups (disk mirroring).

    Set up Novell, restore data successfully from a backup of their old server, configure all, and ship it off to the client on a Friday afternoon.

    Saturday I got woken rudely by my pager doing its beepery stuff. Phoned the client, heard server shitted itself. Went there, rebooted, and got the dreaded Non-system disk or disk error message.

    Took it to the office with their backup. Reinstall Novell, throw all their shit back on the server and took it back.

    30 minutes later the client rang again. Same story. Took it back to the office, ripped out the fancy VLB IDE card, punched in a standard IDE controller card, reinstall all the shit, and everybody was happy thereafter.

    Chucked said VLB IDE controller card into a common heap for somebody else's delight...

    (Luckily for me there was no internet or internet access at that time, access being with archie, telnet and FTP... )

    1. TeeCee Gold badge

      Well that's daft!

      With Novell, SCSI disks pissed over IDE. Novell runs an elevator queue for each disk and satisfies all the read/write requests therein in one pass across the disk. This works very well with disks like SCSI where the physical layout and sector addressing are linked. It also runs an elevator for each disk, so the more physical disks in a set the better (assuming you have sufficient memory for the elevators, address hash tables and cache for all, with enough left over to run the OS).

      With IDE, which disguises the geometry as something an old BIOS is likely to recognise, the drive magically translates this single pass into frantic disk thrashing.

      I'd see Novell servers built with one, large IDE disk and weep........
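The elevator idea described above can be sketched in a few lines. This is a generic illustration of elevator-style request ordering (the variant that sweeps upward from the current head position and then back down), not Novell's actual implementation; the function names and the request numbers are made up for the example.

```python
from typing import List


def elevator_order(pending: List[int], head: int) -> List[int]:
    """Order block requests as one upward sweep from `head`, then one downward sweep."""
    upward = sorted(b for b in pending if b >= head)
    downward = sorted((b for b in pending if b < head), reverse=True)
    return upward + downward


def head_travel(order: List[int], head: int) -> int:
    """Total distance the head moves when servicing `order` starting at `head`."""
    travel = 0
    for block in order:
        travel += abs(block - head)
        head = block
    return travel


# Hypothetical queue of block addresses and a starting head position
requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53
```

Comparing `head_travel(requests, start)` with `head_travel(elevator_order(requests, start), start)` shows how much seeking the sorted pass saves. The saving only holds when block numbers reflect physical position, which is the point made above about drives that translate their geometry.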

  15. Anonymous Coward
    Anonymous Coward

    PTSD

    We used to have a SysAdmin who, for some reason, on a Friday used to test the content filter at the company with whatever nasty thing he'd find.

    When us developers were heads down and you heard "Here lads, look at this..." and looked up to see him with a huge smile on his face you'd know it would be something really NSFW.

    *shudder* I still get flashbacks.

  16. Anonymous Coward
    Linux

    The pair spent the whole night re-installing all the software

    Why didn't they restore from the daily backups?

    1. Anonymous Coward
      Anonymous Coward

      Re: The pair spent the whole night re-installing all the software

      One can only assume. What sodding back ups? Or did they get hosed as well?

    2. Doctor Syntax Silver badge

      Re: The pair spent the whole night re-installing all the software

      In a situation like that you can take it for granted that there's only one tape for all the backups and that the daily backup had been taken post-corruption and pre-discovery. Mr Murphy will explain why.

  17. Doctor Syntax Silver badge

    RP should stay well clear of forensic science as a career change.

  18. WibbleMe

    A week personality. Anyone with balls between their legs would have said to their boss unless they stay their not either.

    Its also ok to loose a client if your not paid different if they go.

    1. John Brown (no body) Silver badge
      Happy

      $deity!! Was that a deliberate troll for the grammar nazis or did you really just manage to get so much wrong in a single post?

  19. xeroks

    I do hope the Server Tycoon devs are reading these comments. It's exactly the kind of thing they should emulate.

    "Your stroppy engineer has said you need to spend $1M on an optional upgrade."

    "Your Server Farm has stopped working, and your clients are complaining."

    "Your stroppy engineer is hallucinating after spending 48hrs manually retyping the entire contents of a key disk drive. Your clients are calling their lawyers"

    Etc

  20. Samizdata
    WTF?

    I remember doing an initial Netware build on a server I had no documentation on at all, with a version of Netware I had never used. Finished about 0400, called in later that day, got chewed up one side and down the other by the boss.

    The next day I had to set up backup infrastructure on an unnamed tape drive, again with no documentation...

  21. HurdImpropriety

    Was it a Banyan VINES CNS or the newer ONS ?? Idiots ran Banyan...
