They should have gone for good pr0n. Much nicer to keep dreaming about for months afterwards :)
OK, maybe not the S&M sort =>
It's Friday, your correspondent is back from summer holidays and it is therefore once again time to welcome you to On-Call, our regular reader-written tales of things that went bump when off-site. This week, reader “RP” tells us of the time he was asked to fix a server just as he was about to knock off for the day. And not any …
How poignant FFS!!! FO!!
How poignant is a sentence that asks nobody about a nothingness "at all"?
You ask: "Can the "they" in question have a partial realisation?"
Something of the order of a borked hard drive do you mean?
So how poignant is a borked hard drive about a non Christian episode of pagan misappropriation, on a scale of one to one hundred?
(Damn! Now I am wondering about Existentialism vs Nihilism. Fuck! I have managed to go through the whole week without any reference to that sop David Bowie. Shit!!!)
Bloody hell swearing, massed exclamations and two Paris dick-heads in one session!!!!!!! Time I went back to bed.
Fixing stuff at day's end is risky. I have a personal rule never to push source to the repository after 3pm unless it's to my own branch. Working for several years with Americans on the same team taught me that 'push and go home' was a bad idea. Now I always give myself a couple of hours to deal with issues.
CI servers help but they don't always catch everything. Especially if (as so often seems to be the case) you're working on a legacy project with minimal to no unit tests.
And of course with legacy apps you're nearly always working where there is poor unit test coverage. You wouldn't be working there if it was well covered :-/
My motto is never to push out anything on a Thursday or Friday, especially patches or the such to servers. Or do last-minute "adjusting" at COB... that will come back to haunt you big-time!
That can wait for a Monday. Not in the mood to have my weekend ruined by some arb service going down just because a patch did something naughty and kicked Windows in the ganoonies when it should not have.
One evening I sat down to rejig the disks on a Novell server. It was a PS/2 Model 80 and I was pulling the old ESDI disks in favour of SCSIs. One slight snag was that it already had one SCSI disk and this was remaining in place, meaning that part of the old disk set was being trashed. So, no way back.
Backup was courtesy of Arcserve and an 8mm helical scan tape drive that had been playing silly buggers. I had that day's backup (which had succeeded) and I took two more full backups for that belt, braces and superglue feeling. As an afterthought I printed off the entire disk tree of the server.
New disks, format, install and up. Arcserve. Backup #1 goes titsup halfway through. Backup #2 goes titsup 2/3 of the way through. Backup #3 goes titsup 1/4 of the way through. Shit. Cold sweat.
I have the disk tree. Tape is sequential. Helical scan drives allow you to jump to the bit you want. It must be possible to get everything off this heap of crap and back onto the server. If it turns out that all three are corrupt at the same point anywhere, then someone up there's got it in for me.
At this point I discover a little-known bug in Arcserve that means that, when trying to restore a specific directory, the "recurse" option doesn't. Bugger. That'll be me restoring individually each and every directory on The. Entire. Fucking. Tree. One. By. One. It was a really, really big directory tree too. The bean counters were addicted to nested levels of shit that went down further than the turtles. They'd replicated the entire group's Nominal Ledger account structure as a directory tree, with a Lotus 123 spreadsheet in each directory...............and cloned that setup each year.
I wrap up at about 7:30 the next morning, just in time to see the lead analyst walk in:
"Morning TeeCee, you're early!"
"No, I'm seriously bloody late.........."
Highlight later was presenting the overtime bill to the Big Boss, accompanied by my (previously rejected) request for a new 4mm DAT drive, the latter being rather less than the former. It went through this time.
Engineer - "One day all of this will go TITSUP unless we spend some money"
Boss - "I'm not spending money on unnecessary IT like backups, DR, UPS, resilient hardware and new servers..."
Engineer - "It's all died and there is no way to get it back...."
Boss - "Take all the money you need and for fuck sake work a miracle..."
> Boss - "I'm not spending money on unnecessary IT like backups, DR, UPS, resilient hardware and
> Engineer - "It's all died and there is no way to get it back...."
Sounds like my last place. Although the last line was very different:
Boss - "Why the hell didn't you tell me we were so vulnerable?"
Me - "I did. And here is the email trail. And here is your response"
Boss - "Oh".
I have on several occasions dismissed customers because I knew they were heading down a path to destruction and refused to be sitting on that powder keg when it went off. The last thing I want is my name all over a shit-splattering. If a customer is unwilling to take his or her business seriously, then why the fuck should I?
Except for one customer. There is a reason, and never let friendship influence your business decisions. I put in extra hours and worried in over-time for this customer, even though my recommendations were regularly dismissed, and often even my demands were ignored or just promised and never fulfilled. Until one weekend when a very expensive storage device went down -- for the second weekend in a row. This time in an injurious manner. The following week was tons of pissed off people wanting to know what was going on, when it was going to be fixed, and little concern for the fact that by Thursday I had a grand total of 12 hours of sleep and this entire situation was avoidable.
That was my breaking point. I made a demand: no excuse, no bullshit, let me build this system correctly or piss off, because my job is to make sure shit runs, not absorb your stress waiting for it to fail and take your abuse when it does. Do it, or I walk, and I promise I will let every consultant I know for a hundred miles around know why I left.
Almost 12 months later my demand has not been met, they still call and I get to them when I feel like it, if I feel like it. Surprisingly, they have not tried to bring in a replacement, even though I have made it clear I start a new contract in April and they are not on my extended list for support.
Engineer: "Here is my immediate letter of resignation."
Boss: "What the fuck?? WHY?????"
Engineer: "I have had it to here (points to top of head) with your cheap ass!!! Goodbye!!!" (Leaves room, walks to desk, and clears out.)
Boss: "What the fuck am I going to do now?"
Last job I fought for 4 years to upgrade hardware. Just before I left they interviewed and hired a "network administrator to help me"; even though all the developers and management were included in this process, I was not. I left a few days later. So this "network admin" comes in, builds a new DC and doesn't wait for it to replicate before killing (not demoting, killing) the old one.
You can imagine what happens next.
Some time around 2005, I visited a brand new client of the consultancy I was working for. This customer was an architect firm, doing significant projects - I'd estimate that they had around 200 architects, designers, engineers and other chargeable consultants working in the firm. Picture a significant open office environment with tons of CAD machines interspersed with large scale models of resort developments and the like.
As part of the onboarding of a new customer, we audited all the basics of the environment to look for anything drastically in need of some TLC.
This particular firm had an "interesting" backup process. The main server ran a single mirrored set of drives. The backup process that had been designed and architected by the owner of the company, and self-certified IT expert, involved removing one of the mirrored drives each night and taking it home. Yep, each and every day he'd degrade the RAID set, and each morning plug the drive back in and wait for it to re-sync.
After I carefully explained to the owner why this was such a bad idea, he agreed to implement a proper backup system. I went away and priced up a relatively modest solution comprising a single tape drive, controller, backup software etc. Probably about $5k or so of gear + labour. Remember this guy had probably 200 architects, drafters etc working on multi-million dollar development projects.
Well, it seems he thought that $5k was FAR too much money for backup and refused to go ahead. So we made the call and effectively sacked him as a client.
Fast forward 6 months to one very very very stressed sounding business owner begging us to come and try to fix his server, which had shat itself during one morning's drive rebuild. His team of architects, engineers etc were basically sitting around doing no work, and he had lost absolutely every piece of information they had ever created.
We should have helped, but... I was feeling spiteful that day. I asked him if a $5k backup system seemed cheap now, and hung up.
I was managing a mainframe site in East Anglia. The mains supply could best be described as 'dodgy', not ideal when you have 7 mainframes in a data centre. The situation had been getting progressively worse and we were experiencing disk failure regularly because of the variation in current / frequency. I'd been begging for funding for a data centre UPS, had done the research with the big guys in the industry and had come up with an £80,000 business case. It wasn't just rejected, I was literally laughed out of the management meeting.
A month later we had a catastrophic mains incident where the power dropped out and came back on several times in a couple of minutes. By the time I got from my desk into the data centre and hit the emergency power off, the machines had lost and regained power half a dozen times. The ops shift leader was stood in the middle of the room, mesmerised by the flashing lights.
I finally got the go-ahead to procure a UPS from my budget but was told the procurement would have to be managed by the electrical engineer in the Property Department as he was 'the expert'. They came back crowing about how stupid I had been, how oversized the UPS I had been conned into looking at was, and how they had saved £20,000. The install of the UPS went well but required a downtime day while the PDUs were rewired to the new UPS - no bypass switch for us.
Sure enough, mains was restored and we started the IPL sequences. About 15 minutes in the UPS tripped out, we lost power, then it tripped in again, then I managed to hit the EPO button; once again everyone else seemed in a stupor. In the end we had to wait for the PDUs to be put back to raw mains before we could start IPL'ing the mainframes and work out how much damage had been done, then place the emergency calls to hardware maintainers and spend the rest of the night working with engineers to restore services for Monday morning.
End result? A crisis meeting with the property team, acknowledgement that they had undersized the UPS and a statement that they wouldn't take the hit as I obviously had a bigger budget. Their suggestion was that I start up the machines in sequence, which would have meant a DC restart would have taken over 7 hours rather than the 2 hours normally. In the end I spent almost twice my original budget to get the larger UPS I had specified in the first place, with bypass switch and more batteries to give us the chance to shut the machines down cleanly.
I distinctly remember myself and a lot of mates who played the original DOOM when it came out, started having very worrying dreams. Nothing horrendous but it was such a novelty at the time so we played it a lot, we even played it at work after hours. Despite it being very pixelated, your mind tends to smooth images and fill in the gaps and it did turn a lot of us off gaming for quite a while. The mind is a very sensitive machine and should be looked after carefully.
Ah Doom. I know that it wasn't all that good, but it was so much of a leap on what we'd had before. Add in my first go on my brother's brand new 33MHz 486 DX and SVGA graphics (swish bastard! my 386 was dead to me now) and his Creative soundcard and 2.1 speakers, well this was the best technology ever! My first time using a subwoofer too.
I guess being in a house I didn't know, and having not turned the lights on helped. The first thing I noticed was the lovely satisfying sub-woofery boom, as I decapitated something nasty with my shotgun.
The next thing was the sound of a door opening. Behind me! And something stealthily creeping up! Rather than using the keys, I physically turned round, and was thus not in a position to avoid getting my character eaten by a giant pig-creature.
Happy days. When you could fit a top of the line game, plus Windows, on a 40MB hard disk.
'I distinctly remember myself and a lot of mates who played the original DOOM when it came out, started having very worrying dreams.'
Dreams? Can beat you there... I had the misfortune to be working in London at the time the game came out and played it waay too much. Realised I may be overdoing it a wee bit when, wandering around Holborn one lunchtime, I kept seeing the shotgun appear every time I saw those suited Thatcherite bastards appear in the crowds...
A few years back, got heavily into Black on the XBox, gave that up as it was getting a bit disconcerting to start seeing 'little red dots' appear on various people's heads both on my way to, and, more predominantly, at work.
I got really into Need for Speed: High Stakes while at Uni and, while driving IRL, I caught myself subconsciously reaching for the handbrake as I approached a sharp 90 degree bend. In a 1.0 Micra full of students.
My friend's version of this happened after playing Half Life 2 all night and going to work with no sleep; he started hallucinating some of the enemy NPCs on his screen and kept trying to shoot them with the mouse. Afterwards he had absolutely no idea what he'd actually clicked on while doing data entry for a bank.
Now this reminds me of a job that a manufacturer's representative told me about. They'd been requested to try and recover some data from a server which they had supplied. The owner for some bizarre reason had some software which looked for files which hadn't been used for a while and deleted them to increase available space.
Why such software exists and why you'd ever use it I have no idea, however they spent large amounts of money trying to recover the data which they'd been told wasn't recoverable.
Worked with some software like that (boss bought it because he wanted to bang the sales lady). The software at one point went through and decided "No one has touched this /etc directory in a long time, must not be important." It also clobbered the /backup I created on the file server (since it had the same 'last modified' time as the /etc directory). Spent all night with a duplicate of that disk trying to rebuild fstab (the system had about 40 some-odd mount points, most of which had similar filesystem contents) and its Sendmail config...
Lucky for me, it wasn't too long after that happened that the sales lady finally, and completely, shot down my boss, so he became embittered and wanted any software that she sold to be deleted (he was a very emotional guy. The type that would buy an attractive woman a house if she went on a date with him).
Well, yes, that used to be standard practice when men were men and storage was expensive.
However it was customary to back the files up to tape first (low level file store) so that they could be restored if/when required. Obviously you had to keep track of which files you had removed, and where you had put them.
Clever systems such as George 3 managed it all automatically, parking a job whilst the operator loaded one or more tapes to bring the files back to disc. Ah, the joys of Tape To Tape Processing where you copied all the files which hadn't been deleted (that is, no longer required ever) over to a new tape and left all the trash behind.
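For anyone tempted to do the "unused files" trick today, the archive-first discipline described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how George 3 or any commercial product did it: the function name `archive_stale_files`, the JSON-lines manifest and the use of modification time as the "untouched" test are all made up for the example, and the archive directory is assumed to live outside the tree being cleaned.

```python
import json
import os
import shutil
import time
from pathlib import Path


def archive_stale_files(root, archive_dir, manifest_path, max_age_days, now=None):
    """Copy files untouched for max_age_days to archive_dir, log them, then delete.

    Hypothetical helper: assumes archive_dir is NOT inside root.
    """
    root, archive_dir = Path(root), Path(archive_dir)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    # sorted() materialises the listing first, so deleting files is safe.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue
        target = archive_dir / path.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy (with timestamps) before anything is removed
        if target.stat().st_size != path.stat().st_size:
            raise OSError(f"archive copy of {path} is incomplete, nothing deleted")
        entry = {"original": str(path), "archived": str(target)}
        # Record where the file went BEFORE deleting it, one JSON object per line.
        with open(manifest_path, "a", encoding="utf-8") as manifest:
            manifest.write(json.dumps(entry) + "\n")
        path.unlink()
        moved.append(entry)
    return moved
```

The ordering is the whole point: copy, check, write down where it went, and only then delete. A size check is a weak test of a good copy; a checksum comparison would be the obvious upgrade.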
It makes you wonder about the de-duplication technology. Sometimes having made a temporary copy last week is the one thing that saves the day when everything else goes wrong. If it was de-duplicated then trashing those sectors on the physical media would take out all copies.
Or is that not how it works?
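As far as the basic idea goes, that is roughly how it works, and a toy model shows why. The Python below is a sketch under my own assumptions (a made-up `DedupStore` class, tiny 4-byte blocks, SHA-256 as the block fingerprint), not the design of any real de-duplicating product. Real systems normally put RAID, replication or extra copies of heavily shared blocks underneath precisely because of this.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}  # digest -> bytes: the single "physical" copy
        self.files = {}   # file name -> ordered list of block digests

    def write(self, name, data):
        digests = []
        for i in range(0, len(data), self.block_size):
            chunk = data[i:i + self.block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Stored once, however many files contain this block.
            self.blocks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        # Raises KeyError if any block the file points at is gone.
        return b"".join(self.blocks[d] for d in self.files[name])

    def is_readable(self, name):
        try:
            self.read(name)
            return True
        except KeyError:
            return False


store = DedupStore()
store.write("ledger.wk1", b"ledger data 2004")
store.write("ledger_copy.wk1", b"ledger data 2004")  # last week's "temporary copy"

# Two logical files, but only one set of physical blocks behind them.
shared_blocks = len(store.blocks)

# Simulate a bad sector by losing one physical block.
lost = store.files["ledger.wk1"][0]
del store.blocks[lost]
survivors = [name for name in store.files if store.is_readable(name)]
```

After the simulated bad sector, `survivors` is empty: the "copy" was only ever a second list of pointers to the same blocks, so it dies with the original.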
VME was even cleverer, as the File Management System option would even go through the old tapes and re-merge them without expired files. To do this it would of course ideally need 3 tape decks, but a minimum of 2. It was a technical recommendation never to sell a system with fewer than 2 decks, for minimal resilience. I lost count of the number of customer calls I took where they couldn't run the optimisation process because they only had 1 (normally slow & low cost) tape drive. There was bugger all commission for the sales guys on tape decks as they were a necessity, so some fly-by-nights would sell the FMS option (big commission) without the capacity to back up all the data. They were driving Porsches; I was driving an Astra. They had normally taken their commission and left by the time this was discovered, and the customer was usually refunded for the costs of the FMS licence but left with a huge risk of the tape deck failing.
Security dude, long time ago decided that he was gonna be smart and make his backups "safe" and "unreadable by hackers".
Backed up the /dev/VGNAME/LVNAME objects - the file system was reiser.
Secdude, on phone, 4am: "Hey, we need to restore our working repository on $SERVERNAME, can you help us out?"
-- 45 minutes later I figure out what the twat had done, and point out to him that a restore will be destructive and that the likelihood of recovery was damn near zero, since the backup of the device was done during working hours, and ran over 45 minutes, and had a 'warning' on completion about changed states while backing up. Ended up doing a btree rebuild and found about 90 to 95% of the data still on the drive. Strangely he departed about 2 months later ...
In the heady days of Novell 3.12, DOS 5.xx and POS software that ran on DOS...
...client of mine got a new server with a VLB IDE controller card (ooh, fancy, fancy), dual IDE HDDs and, of course, Novell 3.12.
VLB was supposed to be faster with SFTII setups (disk mirroring).
Set up Novell, restore data successfully from a backup of their old server, configure all, and ship it off to the client on a Friday afternoon.
Saturday I got woken rudely by my pager doing its beepery stuff. Phoned the client, heard server shitted itself. Went there, rebooted, and got the dreaded Non-system disk or disk error message.
Took it to the office with their backup. Reinstall Novell, throw all their shit back on the server and took it back.
30 minutes later the client rang again. Same story. Took it back to the office, ripped out the fancy VLB IDE card, punched in a standard IDE controller card, reinstall all the shit, and everybody was happy thereafter.
Chucked said VLB IDE controller card into a common heap for somebody else's delight...
(Luckily for me there was no internet or internet access at that time, access being with archie, telnet and FTP... )
Well that's daft!
With Novell, SCSI disks pissed over IDE. Novell runs an elevator queue for each disk and satisfies all the read/write requests therein in one pass across the disk. This works very well with disks like SCSI where the physical layout and sector addressing are linked. It also runs an elevator for each disk, so the more physical disks in a set the better (assuming you have sufficient memory for the elevators, address hash tables and cache for all, with enough left over to run the OS).
With IDE, which disguises the geometry as something an old BIOS is likely to recognise, the drive magically translates this single pass into frantic disk thrashing.
I'd see Novell servers built with one, large IDE disk and weep........
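For those who never met an elevator queue, here is a rough Python sketch of the idea. It is not NetWare's actual code: the function names and the request numbers are made up, and the variant shown reverses at the last pending request rather than running to the edge of the disk. It also assumes that block numbers map straight onto physical head position, which is exactly the assumption that translated IDE geometry breaks.

```python
def elevator_order(pending, head):
    """Sweep upwards from the current head position, then come back down."""
    ahead = sorted(r for r in pending if r >= head)
    behind = sorted((r for r in pending if r < head), reverse=True)
    return ahead + behind


def head_travel(order, head):
    """Total distance the head moves when servicing requests in this order."""
    travel = 0
    for block in order:
        travel += abs(block - head)
        head = block
    return travel


# Hypothetical queue of block numbers, in the order the requests arrived.
pending = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

fifo_cost = head_travel(pending, start)
elevator_cost = head_travel(elevator_order(pending, start), start)
```

With these made-up numbers, servicing requests in arrival order costs 640 units of head travel against 299 for the sweep. The saving only exists if "higher block number" really means "further across the platter"; once the drive remaps the geometry behind your back, the neatly sorted queue can turn into the thrashing described above.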
We used to have a SysAdmin who, for some reason, on a Friday used to test the content filter at the company with whatever nasty thing he'd find.
When us developers were heads down and you heard "Here lads, look at this..." and looked up to see him with a huge smile on his face you'd know it would be something really NSFW.
*shudder* I still get flashbacks.
I do hope the Server Tycoon devs are reading these comments. It's exactly the kind of thing they should emulate.
"Your stroppy engineer has said you need to spend $1M on an optional upgrade."
"Your Server Farm has stopped working, and your clients are complaining."
"Your stroppy engineer is hallucinating after spending 48hrs manually retyping the entire contents of a key disk drive. Your clients are calling their lawyers"
I remember doing an initial Netware build on a server I had no documentation on at all, with a version of Netware I had never used. Finished about 0400, called in later that day, got chewed up one side and down the other by the boss.
The next day I had to set up backup infrastructure on an unnamed tape drive, again with no documentation...
Biting the hand that feeds IT © 1998–2022