
# Hyphens of mass destruction: When a clumsy finger meant the end for hundreds of jobs

Welcome back to Who, Me?, The Register's weekly dip into the suspiciously bulging mailbag of reader confessions. Today's tale of mainframe madness comes roaring in from the 1970s, courtesy of a reader whom we shall call "Jed". Jed was a computer operator at what was then a large automotive and aerospace component manufacturer …

## COMMENTS

1. #### One way to prevent accidents

#:(){ :|:& };:

Read twice and if I am sure, [home][delete][enter]

1. #### Re: One way to prevent accidents

Interesting comment.

2. #### Re: One way to prevent accidents

And if you set your shell prompt to start with a comment symbol (appropriate to your particular shell), it makes copy-and-paste a whole lot safer.

Ideally set it in a different colour (or a different background colour) so you realise that it's not part of the command too.
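A minimal sketch of the comment-prefixed prompt idea, assuming a bash-style shell where `#` starts a comment at an interactive prompt (bash enables this by default; other shells may need an option). The user name, host name, and colour choice are hypothetical.

```shell
# Assumed bash-style prompt: a leading "# " in yellow, then user@host:dir.
# \[ and \] tell bash that the enclosed escape codes take up no width.
PS1='\[\e[33m\]# \u@\h:\w\[\e[0m\] '

# Simulate pasting a copied line that still carries the prompt text.
demo=$(mktemp -d)
mkdir "$demo/scratch"
pasted="# jed@prod:/ rm -rf $demo/scratch"
eval "$pasted"    # the whole line parses as a comment, so rm never runs
ls "$demo"        # the scratch directory is still listed
```

The pasted command is simply ignored rather than executed, which is the safer failure mode when the copy accidentally includes the prompt.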

3. #### Re: One way to prevent accidents

: Invalid function name

I know, bash ignores that restriction that functions start with only specific characters (alphabetic, but may be alphanumeric). This is one of many reasons why you should never write shell scripts to run in bash. Use a scripting shell.

4. #### Re: One way to prevent accidents

This is an argument for programming languages to enforce a style guide and refuse to run programs that are well-formed but in poor style. (Ideally, one rule of this guide would be "Stop taking the piss and use some fucking letters you muppet!".)

5. #### Re: One way to prevent accidents

A paraphrase of the eternal carpentry rote: measure twice, cut once.

6. #### Re: One way to prevent accidents

[home][delete][enter]

Bah. Esc-0-x-Enter.

This religious war was brought to you by the letters V and I.

1. #### Re: One way to prevent accidents

Or even v and i, for the pedants among us.

2. #### SCO Unix

I worked at a major bank and was in the area that supported the staff superannuation fund (billions of dollars under management). We had inherited a SCO Unix system from a takeover. One of the operators was a little less careful than he should have been and on more than one occasion "inadvertently" entered an incorrect command. As this system was only used by a few staff, and had no access outside the operations room, the login management was a "little slack". Mostly the operators logged in as root. The instructions were to periodically clean out the temp folder using the "rm * -r" command (of course after changing the current working directory to the temp folder). One day the fat-fingered operator was logged in as root and entered the command (as he had done frequently in the past). The only problem was he had forgotten to change the working directory and was in the root folder.

The only good thing to come out of this was that the complete system backup, taken just before he deleted everything, had been successful, and we were able to restore the system (DR procedures tested - Check. No DR test required that year).

We changed the root password after the system was restored and wouldn't give him the new password.
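One way to make that cleanup instruction less dependent on the operator's current directory could look like the sketch below. `clean_dir` is a hypothetical helper, not a standard command, and the checks are illustrative rather than exhaustive.

```shell
# clean_dir is a hypothetical helper: empty the named directory, and only that one.
clean_dir() {
    target=$1
    # Refuse an empty argument, the root directory, or a relative path.
    case $target in
        ''|/) echo "refusing to clean '$target'" >&2; return 1 ;;
        /*)   ;;
        *)    echo "need an absolute path" >&2; return 1 ;;
    esac
    # "&&" means rm only runs if cd succeeded; the subshell keeps the
    # caller's working directory unchanged. Note ./* skips dotfiles.
    ( cd -- "$target" && rm -rf -- ./* )
}

demo=$(mktemp -d)
mkdir "$demo/temp"
touch "$demo/temp/a" "$demo/temp/b" "$demo/keep"
clean_dir "$demo/temp"
clean_dir "$demo/missing" 2>/dev/null || echo "cd failed, nothing deleted"
```

The point of the design is that the delete is tied to a successful `cd` into an explicitly named directory, so being in the wrong place makes the command fail instead of succeeding somewhere else.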

1. #### Re: SCO Unix

I'm fairly certain that everyone who has ever used *nix in anger has an inadvertent recursive delete story of one sort or another. It's almost a rite of passage.

You DO have current backups, and an on-going backup plan, right? Yes, you ... I'm talkin' to you, the guy who just thought "he can't possibly mean me!".

1. #### Re: SCO Unix

Yeah, I guess the most common one is "Oh, I'll just delete all of these hidden files" (which all begin with a "." under Linux, and BSD afaik, possibly on real unixes {or unices??} as well).

Finding out that the pattern ".*" also matches ".." the hard way...
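A commonly used workaround is a pair of patterns that cannot expand to "." or "..". The sketch below runs in a throwaway directory; the file names are made up for illustration.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .hidden ..odd visible

# ".[!.]*" needs a second character that is not a dot, so it skips "." and "..".
# "..?*" picks up names that start with two dots followed by at least one more character.
matches=$(printf '%s\n' .[!.]* ..?*)
printf '%s\n' "$matches"
```

If either pattern matches nothing, most shells leave it as a literal string, so check what the patterns expand to (as above) before handing them to `rm`.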

1. #### Re: SCO Unix

Of course it's that way on "real Unix". Where do you think Linux got it from? I suppose that a bit of pottering about in the future will change it sooner or later; everything else seems under threat.

1. #### Re: SCO Unix

Wow, just realised I read that as "suppose that a bit of Poettering about in the future will change it sooner or later" as it still made sense....

1. #### Re: SCO Unix

I assumed it was an intentional pun.

2. #### Re: SCO Unix

I assumed, since it's true, that it was an intentional put-down.

2. #### Re: SCO Unix

What about trying rm *.o, but leaving the shift key down for slightly too long after the "*"?

It becomes rm *>o - definitely not what was wanted. I ended up left with just an empty file called "o".

1. #### Re: SCO Unix

That reminds me of the Unix/shell section in ye (verie) olde classic "How to shoot yourself in the foot using any programming language" :

    % ls
    foot.c foot.h foot.o toe.c toe.o
    % rm * .o
    rm: .o: No such file or directory
    % ls
    %

2. #### Re: SCO Unix

In my case it was mv rather than rm but with much the same effect. The fly in the ointment (apart from the fact that it was my client's production box) was that the vendor had installed the SCO OS and included a non-standard driver. I can't remember whether it was for the multi-port serial card or the disks. Whatever it was, we didn't have a copy of it, we couldn't reinstall without it, and we spent much of the next day waiting for one to be emailed. Once we got that it only took a short time to get up and running again.

3. #### Aliased rm

I'm fairly certain that everyone who has ever used *nix in anger has an inadvertent recursive delete story of one sort or another. It's almost a rite of passage.

Where I worked for much of the '90s, our sysop knew better. He aliased various 'dangerous' system commands to protect users from ourselves. Hence "rm" became "rm -i".

Whether that saved anyone from a nasty accident is not recorded. My suspicion is it's more likely to have caused accidents, when someone who has learned on the job that rm asks for confirmation finds out the hard way that that was non-standard. But that wouldn't be on the BOFH-in-question's turf.

For those of us who already knew the standard rm, it was just infuriating. I just overrode all such aliases in my .rc. If I wanted an alias, I'd use something that wasn't a standard command name.
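The "use a name that isn't a standard command" approach might look like the sketch below. `rmi` is a hypothetical name, and a shell function is used instead of an alias so that it also works in scripts.

```shell
# rmi is a hypothetical name; plain rm keeps its standard behaviour.
rmi() {
    rm -i -- "$@"
}

demo=$(mktemp -d)
touch "$demo/report.txt"

# Answer "n" at the prompt: the file survives.
printf 'n\n' | rmi "$demo/report.txt" 2>/dev/null
ls "$demo"
```

Because `rm` itself is untouched, habits learned on this machine carry over to systems without the helper: typing `rmi` there fails loudly instead of deleting silently.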

1. #### Re: Aliased rm

It's something that used to be advised back......was it really that long ago? Where have the years gone? I want them back.

1. #### Re: Aliased rm

Where have the years gone? I want them back.

Do you speak Arabic by any chance? You're dangerously close to having quoted an Arabic 80's hit.

2. #### Re: Aliased rm

"someone who has learned on the job that rm asks for confirmation finds out the hard way that that was non-standard."

That's one of the reasons I don't overly customise my own home systems. I like to stay conversant with the "normal" way of doing things. It's too easy to get into habits on one system and find they don't work, work differently or even do dangerous things on other people's systems.

I'll usually set the usual -i aliases (mv, cp, rm) for myself and try to make sure it's in our standard profile. Many of the people on our system are irregular users following someone else's instructions. And sure, backups, but it's not a great use of people's time to be restoring things and work lost = $people * ${time since last backup} can be large even for frequent backups if $people is big. Personally I use enough systems that don't have it that I don't rely on it, it's just a safety line, if I really want -i I'll do it explicitly and if I really don't want it I'll \rm. I think the real hazard of -i aliases is that people who don't know what they're doing get into the habit of using -f to turn it off instead, usually -rf even when they don't need it, because that's what somebody else has shown them. Finding instructions that routinely tell people to use -rf and wildcards... it's best to find other ways for them to do it. The trick is to make sure you get to these people and rm -rf the -rf culture before it spreads.

4. #### Re: SCO Unix

"I'm fairly certain that everyone who has ever used *nix in anger"

*** REDUNDANT STATEMENT AT LINE 1 ***

Anyone who's ever used Unix has used it in anger.

1. #### Re: SCO Unix

"Anyone who's ever used Unix has used it in anger."

Nonsense. For example, see the vast majority of Mac users. Or MeDearOldMum & Great Aunt, both happy Slackware users. Many other examples exist.

1. #### Re: SCO Unix

"the vast majority of Mac users"

From what one hears of falling H/W & S/W standards maybe a lot do use Macs in anger.

2. #### Re: SCO Unix

I don't know about that, jake. My wife and daughter are die-hard Mac fans, as were many of the academics I knew back in the day. I'm pretty sure I've heard each of them cussing out the machine once in a while.

Fact is, pretty much any non-trivial tool used often enough will eventually get on the user's nerves, deservedly or not. And fond though I am of UNIX,1 it certainly has its infelicities.
1Though not of MacOS. Whenever someone asks me to help them with something on a Mac, the first thing I do is open Terminal so I can use the OS the way God intended. 1. #### Re: SCO Unix I take it you've never heard the expression "fired a shot in anger"? In this context, "in anger" roughly means "a professional doing their job, no matter how distasteful". 5. #### Re: SCO Unix I was using a work station with a *nix os on it and my somewhat accident prone friend on the one next to me recursively deleted the root directory and entertained me by screaming at it to stop for about 10 minutes before revealing what he was wanting to stop. A running club at lunch he shot passed me and round a corner, I came round the corner to see a large pine tree rocking backwards and forwards having flung him into a ditch. After a shower and needle removal we retired to the canteen where he managed to collect his large lunch (we ran a long way even with me laughing) and somehow slid the tray under the cashiers till depositing his lunch all over the floor. Once we'd finally finished and were returning to our offices as we approached his I commented on his appalling luck today and he turned to me to speak and smacked into a great big red fire extinguisher hanging on the wall just outside his office.and slid to the floor. As did I and crawled the last 10 yds to my own office on hands and knees completely incapable of anything other than uncontrollable laughter. 1. #### Re: SCO Unix Some days your biggest mistake is getting out of bed. 2. #### Re: SCO Unix Sounds like karmic opposite of the "Woman finds out she won the lottery, 10 minutes after learning she'd beaten cancer" story I read in the news yesterday. Yin, yang.. 6. #### Re: Fat Fingers dd if=/dev/sda1 of=/dev/sda See the error? I learned a lot that Halloween, on how one should never try to manually clone partitions to a new SSD... 
In my defense, it was wrangling an NTFS that was "corrupt/dirty" according to all the linux tools I had access to (clonezilla, gparted live), and that windows utilities said "nah, wrong size mate" to. It only cost me the one true working copy of that authentec fingerprint reader driver for Windows 7... 1. #### Re: Fat Fingers On the upside... A colleague made a related dd error during cutover to a new drbd cluster, copying block device mount points rather than file system mount points. The resulting growingly peculiar behaviour culminating in carnage, led to me twigging that drbd is just a tee and hence me cracking the entire installation&datacentres out of its 1x2 constraint into Nx2 infinitosity. The IT version of accidentally dropping mouldy bread crumbs into a petrie dish... 2. #### Re: SCO Unix Did everyone else use the new password afterwards? 3. #### Re: SCO Unix In my experience, "rm * -r" was a standard upgrade command for the versions of SCO UNIX inflicted upon me. Even when it wasn't awful, it managed to find hardware faults no other OS at the time cared about. Or maybe it was just terrible drivers. And all this pre-dated SCO's legal shenanigans. 1. #### Re: SCO Unix Someone (who lost access very quickly after) managed to do a chown -r from / Took quite a bit of work to go and undo all the damage - compared to an equivalent system. Unfortunately, it was a customer facing server so couldn't do a full restore from backup. Fortunately, it wasn't me that had to do it (or was the one that caused it). 1. This post has been deleted by its author 4. #### Re: SCO Unix There will be lots of stories like this from the *nix types. Mine was on a solaris box in my early sysadmin days. I was adding some disk capacity and in the process of putting a file system on typed newfs /dev/dsk/c0t0d0s3 rather than c1t0d0s3 and so watched my /usr go down the plughole. 
Fortunately it was not yet in production but it was an interesting experience watching it slowly fall in a heap before I rebuilt it. I learn't to use disksuite after that. 1. #### Re: SCO Unix SDS didn't make you immune to stupid mistake syndrome. I had beautifully defined all of my meta devices and was mirroring root. metattach d0 d2 d1 was my downfall. d1 was from the build and it was dutifully wiped clean with the end result being a rebuild. Thankfully, preproduction and it was just an afternoon's work. 2. #### End of life care I worked for Sun at the time they died. Shortly after, Oracle and I parted company, and I had to return my chunky Sun workstation. But first, remove all private ssh and pgp keys that had been used on it. Hack up a utility to zero a file before deleting it, and run with recursive find on sensitive directories. And on the whole of /home for good measure. Oh, yeah, better do /var/ as well. And ... did I ever put anything under /root/ ? Of course it had been running zfs, so that wasn't enough. Ho, hum. Boot from another medium and zap the filesystem from low level with dd to the device; ship it back with a fresh bare-bones install on a repartition-and-newfs (which from memory was not OpenSolaris but FreeBSD - a minor exercise of the inner BOFH). Feel a low-level bereavement for the workstation. Now even if it falls into the hands of someone evil, I'm not a high-enough-value target to merit searching for the ghost of any residual data. 3. #### Re: SCO Unix I learn't to use disksuite after that. And now work a's a greengroc'er? 5. #### Re: SCO Unix Ah, rm -rf... jftr: chmod 600 -R (or something similar) will work just as fine. 6. #### I'm kind of surprised No one ever thought to have the rm command check the current working directory, and if it was / ask for confirmation before executing? Those few lines of code would have saved a lot of asses over the years. Linux using /root as root's home directory probably did as well. 1. 
#### Re: I'm kind of surprised

These days you need to provide "--no-preserve-root" for doing that - what you suggest has been in rm for years!

3. #### George 2+

Back in the early '80s I worked for a large company that ran two ICL 29xx mainframes. Supervising these machines we had a Senior Operator who specialised in mixing up the George 2+ console commands "GO 25" and "GO 27". One of these commands restarted a stopped job from the point where it was stopped, the other restarted the stopped job from the beginning. The knock-on effects of doing this near the end of the two-day payroll run were spectacular, to say the least.

1. #### Re: George 2+

Could you fill us in on the difference between the two commands? Are the numbers a job ID or some ICL command?

1. #### Re: George 2+

Flags, I would assume, like "kill -9"

2. #### Re: George 2+

From memories more than 40 years old (and thus subject to neuronic disintegration), there's a missing program name in the quoted commands. "GO #ABCD 25" would cause:

- an external interrupt to be sent to the OS;
- program #ABCD to be suspended;
- the next instruction address to be overwritten by the address identified as 'entry point 25' in program 'ABCD';
- the suspended program would then be restarted.

Like I say, 40-year-old memory. I stand to be corrected.

1. #### Re: George 2+

- GO #ABCD 20 - start the job with paper tape input
- GO #ABCD 21 - Start the job with punch card input
- GO #ABCD 25 & 27 ibid
- GO #ABCD 29 - abort the job (usually) - yes, under George II+ you had to type GO when you wanted things to STOP.

The number after the program name was the address of the starting command in the compiled program. Some programs had a "go type entry block" so you only ever typed GO #ABCD. I've never found out why.

On the 1900 series OU #ABCD 8 was useful too, as word 8 was the program counter. Good for finding forever loops (if the speaker warble wasn't indicative).

Though all those commands were actually EXEC commands as I remember them.
George was a set of modules you could load (or not) to get a job done with minimal interference from the operators. The "steering lines" (aka JCL) were a drop-through action list. Dead simps. So you operator would type something like: FI#XKYE#GEOG 1 (find the input spooler in file GEOG and assign it to unit number 1) blahblahblah INPA R WAS XKYE XKYE HALTED TR 1 FIX (the spooler had renamed itself and stopped pending the loading of a paper tape and the operator pressing the green button on the reader) GO#INPA 20 blahblahblah INPA HALTED HH (the spooler has finished and is ready for the next job) FI#GEMA#GEOG (load the central control module from file GEOG - ours was called GEMA after someone's girlfriend) GO#GEMA ABCD R WAS GEMA (your ABCD program is now running) ABCD HALTED HH (and it just finished) FI#XKZE#GEOG 3 (Load the output spooler and assign it to unit 3 - the line printer in this case) blahblahblah OUTA R WAS XKZE OUTA HALTED LP3 FIX (operator loads paper and presses the green button on the printer) GO OUTA 25 (I think, it has been over 40 years since I did this for real) (EARSPLITTING BANGBANGBANG NOISES FROM THE BARREL PRINTER) OUTA HALTED HH (the print run is finished and the operator, ears bleeding, can tear off the report) If you needed more room for ABCD you would DE#INPA before you GO#GEMA'd. There was something of an art to juggling the memory. I remember one consultant we had who would lobotomize EXEC on the fly from the console so they could finish in only 3/4 the memory they needed. Ah the happy sound of the print drum of the console Westrex slamming into its lid so it snapped like and angry crocodile when the machine "went illegal" because someone had typed DE#ABCD when there was less than two kilowords of core memory unused. Ah the happy sounds of the cursing operator typing like mad when a stupid compilation mistake did the same from a program test run. Worse days. 1. 
#### Re: George 2+ "GO #ABCD 29 - abort the job (usually) - yes, under George II+ you had to type GO when you wanted things to STOP." So that's what inspired the designers of a certain operating system to hide 'Shut down' under the Start button! Thanks for enligthening us! Deserves a -> 2. #### Re: George 2+ Wow. I think you've definitely earned your Platinum 1K Lifetime Elite geek card there. (either that, or just know how to use Wikipedia/Google, but I'm going to give you the benefit of the doubt!) 1. #### Re: George 2+ There's a Wikipedia page on George/Exec? In the name of ICL, why? Do they have "DA ERROR [" listed? Somewhere I still have my Apps Manager Programmer's Card. It is pink. 2. #### Re: George 2+ I once lost the whole week's rents run output with an incorrect ON OUTA 27 command.. 3. #### Re: George 2+ "The knock-on effects of doing this near the end of the two day payroll run, were spectacular to say the least." Did everyone get paid twice? Did you consider letting the result stand? 4. Asking for trouble when a programming language can interpret an operator in two different ways - as both 'Minus' and 'To'... 1. Don't two minuses make a plus? 1. You're thinking of two wrongs making a right or three lefts making a right.... 1. Two wrongs don't make a right, but two Wrights made an aeroplane! 2. Three lefts only make a right in a boringly flat geometry. 3. #### Sopwith Camel user interface > three lefts making a right A tad off-topic, but the opposite of this was true for the Sopwith Camel: three rights made a left. Whilst on observation patrol, (many) pilots turning left 90deg would instead turn 270deg right. Took the same amount of time and they improved their horizon/surrounds scanning markedly. 1. #### Re: Sopwith Camel user interface A tad off-topic, but the opposite of this was true for the Sopwith Camel: three rights made a left.[...] Took the same amount of time Interesting. Torque effects from the propellor rotation? 1. 
#### Re: Sopwith Camel user interface "Torque effects from the propellor rotation?" It's called "gyroscopic couple". It's not just the prop, the rotating mass included the cylinders/pistons, valves, etc. Essentially, the crank was fixed and the engine+prop rotated around it. Fascinating answer to a design problem, if a trifle on the bizarre side to today's eyes. They don't make machinists the way they used to ... ANYway, yes the Camel rolls into a right easier than a left. But not to the extent in the urban legend, that is likely caricature. In reality, the effect is only around 20% different left to right ... To the experienced pilot, it is almost unconsciously compensated for after a few minutes of flight[0]. When you think about it, if the Camel had been known to not be able to roll left, do you really think it would have become known as one of the best dog fighters of WWI? [0] That is directly from the horse's mouth ... Javier Arango, who owned B6291 (the last surviving flyable Sopwith-built, Gnome powered Camel that saw action in WW1) was a friend of mine. I believe he and a writer for one of the flying magazines (maybe Pete Garrison?) actually wired the plane and put numbers on the flight characteristics, but I don't have a reference handy. 2. >Don't two minuses make a plus? Sum-one will be along soon to say... 3. #### Optional Yes, but you first have to rotate one by ninety degrees, then overlay it on the other. Hmm. As well as a joke alert icon, I think we probably need a bad joke alert. :) 5. #### AS400 issues In a previous incarnation I was a developer at an org where everything ran on a chunky AS400 system. We had two smaller ones for development and deployment /acceptance testing. One day one of the devs submitted a request for a sysadmin to clear down and reinstate one of the business streams on the acceptance server so it was definitely identical to the live server before a significant code release (not an uncommon practice). 
Sysadmin dutifully logged in with the appropriate superpowers, typed the appropriate commands and hit enter (and confirmed anything that needed confirming). About 5 seconds later abject panic descended as they realised they'd done it to the live server and obliterated that entire strand from digital existence. A couple of hours of checking, planning and flappy director wrangling, one restore from the previous evening's backup, plus 10 minutes of reapplying journalled db changes for the processing since the backup and everything was back to working. Procedures were then changed, including the colours of the terminal sessions for live and pre-live boxes, along with which logins had the power to do those things to the live server, and ensuring the passwords for those accounts on acceptance and live were not the same passwords! Nobody lost their job over it, and the financial implications were effectively the cost of 4 mid-level folks for 3 hours, plus maybe a bit of downtime for 20 or so end users. 1. #### Re: AS400 issues More importantly, the improved procedures and color-coding of sessions likely ensured that confusing what they were working on would have much less a chance of happening, thus protecting critical data from untold horrors in the future. 2. #### Re: AS400 issues I would agree with Pascal on this - there was most likely a financial net benefit as by failing fast in this fashion, you probably prevented a much larger disaster later downstream. Assuming the right lessons were learned, of course. 3. #### Re: AS400 issues including the colours of the terminal sessions for live and pre-live boxes, Over the years a lot of colleagues were surprised I did that, but most of them copied it in the end. 1. #### Re: AS400 issues I do it to this day. 1. #### Re: AS400 issues Same here. I also start the prompt with the name of the machine in flashing green text on a black background, then the path in colours to further indicate the machine. 1. 
#### Re: AS400 issues

AS/400 doesn't allow you to change the prompt.

1. #### Re: AS400 issues

Have you filed a bug report or a feature request?

1. #### Re: AS400 issues

Neither; at need I can program around it to change the prompt, but it requires a couple of API calls and a specially written shell program, preferably in MI for speed.

4. #### Re: AS400 issues

Works for Windows too. Apart from setting the BG-Color: I use this cmd_admin_shell_color.cmd on some servers. Save it anywhere, run it. All your future admin shells will be red.

-----------------------------------

    reg add "HKCU\Software\Microsoft\Command Processor" /v "Autorun" /t REG_SZ /f /d "%~dpf0" > NUL
    whoami /groups | find "S-1-16-12288" > nul
    if not errorlevel 1 ( color 4f )

5. #### Re: AS400 issues

flappy director wrangling

Sadly I cannot give you an extra upvote for this alone.

6. Not quite so spectacular, but it certainly produced unseen results:

Back in the early 70s I was still at college (same place now calls itself a University!), and our "hands-on" computer was a PDP-8 - with a whole 4K RAM expansion - running FOCAL. FOCAL was in some ways similar to BASIC, which itself was almost unknown back then. FOCAL enabled up to four teletypes to share the processor, with each having access to 1K of memory. Programs were stored on punch-tape.

Programs were started by running the "GO" command, and as we were all struggling with the basic concepts, frequently resulted in the program either hanging or getting stuck in a loop. One of our lecturers introduced us to the "GO?" command, which produced a step-by-step trace of the program as it ran, accompanied by much clattering from the teletype.

Note the use of the phrase "THE teletype". This was because running "GO?" caused all the other teletypes to come to a grinding halt until the trace was complete!
The first few times we ran the trace command, this led to the other users cursing and swearing, slapping the teletypes, and inspecting their own programs to try and see what they had done wrong. It wasn't long before the more intelligent users realised that ONE teletype was not only still working, but doing so at a frantic pace! This led to a full and frank exchange of views between them and the offending user, and the use of "GO?" was restricted to times when there was only one user on the system! Happy days.....! 1. (same place now calls itself a University!) You make this sound like a name change made on a whim for marketing purposes. A college has to offer a sufficient range and quality of courses, postgraduate and research as well as undergraduate, to qualify and it also has to earn degree-awarding powers. It's a rigorous process of qualification and assessment which usually takes five years or more to attain, even once the minimum academic provision is in place. A successful result culminates in the grant of a Royal Charter and the right to call itself a university 1. All of which was done as part of one of the various pushes by several govts to increase the number of University places, not least under Bliar. In other words, marketing schemes. 2. You make this sound like a name change made on a whim for marketing purposes. A college has to offer a sufficient range and quality of courses, postgraduate and research as well as undergraduate, to qualify and it also has to earn degree-awarding powers. It's a rigorous process of qualification and assessment which usually takes five years or more to attain, even once the minimum academic provision is in place. A successful result culminates in the grant of a Royal Charter and the right to call itself a university That is to say, it's an expensive and difficult change made for marketing purposes. 2. +1 for the FOCAL reference! My first programming language... 7. 
robocopy's /MIR (mirror) command can be very damaging if you dont get the paths right, due to the deleting extra files part. 1. An awesome tool. 20-ish years old, and functionality has stayed pretty much identical bar a few tweaks here and there. It's still my go-to choice for transferring large amounts of data from volume to volume, and holding onto the log files has definitely saved my skin more times than I can remember. ("Some of my files are missing from that data you migrated 3 months ago. They must not have got copied properly...") But yeah - I have got burned once or twice too... 1. Yep still use it myself. It's not restricted to the 255-odd character limit of Windows Explorer either so gets used when moving deeply nested directories around! Also had bad experience of /MIR :( 1. ++ for robocopy. I always keep a formatted command ready to go just for the mirroring issue - too dangerous to not have it all thought out ahead of time. It's all wrapped up in a nice PowerShell script with variables for source/destination directories, logging, etc. I really fell in love with it when I worked at Microsoft years ago as a contractor. One of my fellow workers didn't really understand the limits of the drag and drop of explorer. During a migration of a rather large file server (one that ,no kidding had financials and code from as far back as the 80's), the job kept failing. He kept cursing and trying again, but to no avail. I mentioned how he was doing it wrong and picked it up. One confirmed Robocopy script later and it was well on its way - along with the Z switch. Pretty sure I made an enemy that day, but I got the job done. 2. Robocopy is still the only tool that can handle the recursive appdata "compatibility" links in a userprofile when deleting it. 
Method: robocopy c:\empty-dir c:\users\user-to-kill /s /e /purge /r:0 /w:0 Nowadays the explorer can do it most of the time, but there are still enough cases where only robocopy can kill, like path-too-long-to-delete directories. 8. I had been working on a COBOL program. I took a backup of the original code (file.cobol) and worked on the normal file.cob version. After 2 days of coding and testing, the thing was finished, so I went to delete the backup (file.cobol) with del file.cob;* Hmm, anyone see the problem? I had to redo the 2 days of work again, although it went much quicker the second time, because I knew what I had to do, no more testing different approaches. Another time, I was working on an OLAP system (Arbor/Hyperion Essbase). It had real problems recalculating a hypercube, it was much quicker to export the bottom rows, clear the cube, import the bottom rows and calculate (4 hours as opposed to 48 hours without clearing). So, clear database... Whoops! ARRGH! Luckily we had the previous backup from 6 hours earlier. My colleague told me to just put in the old backup and recalculate and blame the missing data on the users, a real PFY! I went to the head accountant, explained the problem, we then loaded the previous export, replayed the audit log and, in the end, we lost 2 transactions. 9. #### Nostalgia ain't what it used to be... "Things worked more efficiently back then," There's definitely some rose-tinted glasses being applied here ;) To be fair, I was barely a glint in my parent's eyes during the mainframe era - though I did grow up with a ZX Spectrum. And I appreciate that mainframes were mightily powerful within their context (and rightly so, for the amount of money they cost to maintain). But also, by modern standards, they're crude, clanking machines which needed a host of Tech Adepts maintaining them, and which required arcane rituals which must be precisely following to get anything out of them, as this very story highlights. 
And I doubt they'd be able to handle the amount of data we routinely throw about today, or that they'd be able to usefully visualise any of it in anything like realtime. Dipping into the cliche allegory bag, a mainframe is a bit like a 1930's sports car. In theory, they're a match for even a modern motorcar, with a high BHP and the ability to bomb around a racetrack at 130mph or more. In practice, they were mechanically fragile, cost a fortune to build and maintain and have little in the way of modern conveniences like windscreen wipers, power steering, synchromesh gearboxes, automatic chokes and the like... Personally, I'll stick with my modern laptop and (not so) modern Honda Accord ;)

[Monday morning pre-caffeine grumble complete]

1. #### Re: Nostalgia ain't what it used to be...

Mainframes are still around today for large TP loads. On the other hand, think about that "paultry" mainframe back then, then ask yourself, how many hundred users can work at the same time on a modern PC, which has hundreds of times the theoretical power of those ancient machines... Modern servers are based around PC technologies, not high throughput technologies used in mainframes. Which is one reason a lot of banks etc. still use them. They are still faster and more cost effective for some scenarios. Think of it more like a sports car or a big rig. The sports car can get you from A to B faster than the big rig, but if you have to transport several tonnes of data, the big rig will still get there first, whilst your sports car is zipping backwards and forwards transferring small amounts of data at a time.

1. #### Re: Nostalgia ain't what it used to be...

Not just mainframes. In the 1980s I had one Unix box running several terminals on a Z8000 with 768K and then moved to look after another which must have had a similar processor but more disk space. We ran out of space on that one and had to fit a second 168M disk.
Somehow back then a workload could run on H/W resources less than a boot loader would use today. Bloat!

1. #### Re: Nostalgia ain't what it used to be...

And in the 1990s, we used to run up to seven users on an 8086 box.. When 286 became available, we could run a whole office from one. "BOS" operating system, serial dumb terminals, each user could have multiple sessions open. Fun days.

2. #### Re: Nostalgia ain't what it used to be...

I did some back of an envelope math: the Terminal emulator we use for connecting to the mainframe is 24*80 characters in size. If you assume each character has 4 bytes behind it (symbol, colour, etc), the space for "IO buffer" at the mainframe end is about a megabyte per hundred users. The emulator itself is taking up about 5 megs of RAM on my desktop. You really can't exaggerate the sheer degree of difference in focus. Yes, they're both computers, but yon AS400 is there to crunch numbers and a sliver of overhead deciding whose numbers to crunch next, whereas my desktop is using a relative king's ransom worth of RAM making the task bar slightly translucent.

1. #### Re: Nostalgia ain't what it used to be...

> If you assume each character has 4 bytes behind it (symbol, colour, etc), the space for "IO buffer" at the mainframe end is about a megabyte per hundred users.

Make that no more than 2 bytes per character and that only for double byte characters. Normally it is one byte per character and the codes for colour, highlight and underline are in the seemingly blank space before each string of characters. Add another 100 bytes max for indicators and that makes the total less than 2 KB per user (24*80 + 100 = 2,020 bytes), less than double that in case of double byte systems. This number increases a bit when you set the screen size to 27*132 (my preference), but even that is only about 7 KB max per user for double byte systems (27*132*2 + 100 = 7,228 bytes).

2. #### Re: Nostalgia ain't what it used to be...
I'll happily agree that modern mainframes are still several orders of magnitude more powerful than a standard x86 server. OTOH, this article (or at least the person telling the story) was comparing a 1970s mainframe to a modern PC. I think the big rig versus sports car is a decent analogy for a contemporary comparison, but I'm not sure it works as well for a historical versus modern analogy. (I did debate whether a car analogy was best for this. Perhaps a better one would have been a steam engine versus a modern diesel generator. You could drive the steam engine onto a field, and do anything from ploughing a field to threshing grain or running a dynamo to power a fairground ride. OTOH, you'd have to manually reconfigure it for each task - and you'd have to stoke it up to operating temperature and constantly monitor it to make sure oil/coal/water levels keep topped up. Conversely, you'd just deliver a modern diesel generator to where it needs to be, press a button and plug whatever appliances are needed into those nice, modern, IP66 outdoor socket... ) > On the other hand, think about that "paultry" mainframe back then, then ask yourself, how many hundred users can work at the same time on a modern PC, which has hundreds of times the theoretical power of those ancient machines... I don't think it'd be hundreds. I suspect it'd be thousands (possibly tens of thousands) *if* we were dealing with the workloads a 1970s mainframe dealt with :) The key thing here is that workloads have changed. Where a legacy mainframe was a timesharing system which effectively focused on one task per timeslice/user (to vastly oversimplify), a modern server can simultaneously serve thousands of web connections, while dealing with encryption overheads and running other things like databases. 
To pick a fairly arbitrary first-search-result example, here's someone benchmarking web server performance back in 2016, using a 12-core Xeon running two VMs with varying numbers of cores assigned to them.

https://www.rootusers.com/linux-web-server-performance-benchmark-2016-results/

The throughput scaled pretty linearly from 1, 2 and 4 cores, until it topped out somewhere around 6-8 core level, which makes me suspect the box was hitting some physical I/O limits. Either way, it was pushing somewhere around 140,000 requests per second, with up to 1000 concurrent connections. And while that 12-core Xeon is still a fairly beefy bit of kit, a quick rummage on eBay suggests you can pick up something similar from around a thousand pounds.

Finally, I'd note that System 370 emulators exist. And to quote the FAQ...

http://www.hercules-390.org/hercfaq.html

"Classic IBM operating systems (OS/360, MVS 3.8, VM/370) are very light by today's standards and will run satisfactorily on a 300Mhz Pentium with as little as 32MB RAM. Anything more up-to-date, such as Linux/390 or OS/390, requires much more processing power. Hercules is CPU intensive, so you will want to use the fastest processor you can get. A 2GHz Pentium, preferably with hyperthreading, will probably provide acceptable performance for a light workload"

Said FAQ was last updated in 2010; a decade later, it's not unreasonable to assume that a modern consumer-spec laptop running this emulator could handily outperform a real 370 despite the emulation overheads...

1. #### Re: Nostalgia ain't what it used to be...

> I did debate whether a car analogy was best for this. Perhaps a better one would have been a steam engine versus a modern diesel generator.

Was at a tractor pulling event some years ago, for whatever arcane reason. Lots of highly-tuned and impressively powerful trucks and tractors running on diesel.
They pull a sled that has a moveable weight on it and functionally becomes increasingly heavy/difficult load, with the idea being to pull it the longest distance. After watching awhile as each of these heavily modified machines pulled the sled varying distances and occasionally exploded in interesting fashion, someone decided to hook a late 1800's steam tractor to it. While the trucks and tractors made a fair chunk of speed but not infrequently failed to go the full distance, the huffing and puffing steam engine simply ignored the load, putted to the end of the track slowly, and proceeded to turn around and return the sled to the starting point as if there was nothing behind it. I don't think it ever went much above walking pace, but it was a very interesting comparison, moving like God himself could not slow it down. Quite the display of using a cheetah vs an elephant! 2. #### Re: Nostalgia ain't what it used to be... When the Dartmouth time sharing system was being developed ~1963 there was a need for hardware to handle multiple teletype connections. Only the GE DN-30 had the ability to connect ten's of teletypes with standard, off-the-shelf hardware. This was a communications processor and probably had a lot to do with the eventual success of the project. 2. #### clanking machines which needed a host of Tech Adepts "But also, by modern standards, they're crude, clanking machines which needed a host of Tech Adepts maintaining them" There's nothing crude about modern mainframes - the technology under the skin is ahead of the x86 world, modern mainframe CPUs process the vast bulk of COBOL and JAVA code natively in hardware, they also have hardware encryption and compression instructions performing these tasks way, way faster than x86 - they also run at 5+Ghz out the box. As for staff, at the site I currently work at there are fewer than 10 mainframe technical staff - the mainframe provides the bulk of processing for the large organisation. 
There are over 2,000 x86 staff of one form or another. The mainframes consume about 60 kW of power - the combined x86 - well over 5,000 kW just to provide pretty front-ends to the mainframe quietly getting on with delivering the core business with five 9s reliability.

As for "throwing data about": the I/O bandwidth of a single modern IBM zServer is over 800Gb/second. Like all IBM mainframes from the 1960s onwards, I/O is performed off the main CPUs - these days on 5+ GHz assist processors. I/O on most x86 metal interrupts the actual CPUs, wasting cycles of the (already slow) CPUs and damaging cache hit rates by switching threads. With zHyperLink enabled, latency to read the disk subsystems is under 20 microseconds (yes MICROseconds) - roughly 10x faster than good FICON response and smashing anything available on any other platform to atoms. Of course IBM DS (disk) arrays can be all flash with over a terabyte of cache - so even if you need to do actual I/O - it's mindbogglingly fast. If configured correctly, network I/O between z/OS and/or z/VM clusters (operating system images) is done in memory, bypassing the network altogether.

In a nutshell - you don't know what you're talking about.

1. #### Re: clanking machines which needed a host of Tech Adepts

- Hardware memory compression
- Hardware memory encryption
- Memory protection Keys
- RDMA over Converged Ethernet (RoCE)

Every time I want to see what is coming in the x86 world I look at new features as each IBM z mainframe appears!

1. #### Re: clanking machines which needed a host of Tech Adepts

"Every time I want to see what is coming in the x86 world I look at new features as each IBM z mainframe appears!"

I don't suppose you saw Meltdown and Spectre there.

3. #### Re: Nostalgia ain't what it used to be...

To support the same workload today, you'd "need at least an 8-core 64GB server", Jed observed.

So...any reasonably spec'd laptop or desktop. Or, if you want to go nuts, any high spec engineering workstation.
Couple of xeons or epycs, some RAM and off you go. Likely, to support a modern version of these workloads, you need a farm of servers, each handling mail, application loads, file services, etc. Data needs are, naturally, much larger these days. And performance is much faster. Also, one user doesn't have the ability to accidentally screw the rest of the office over due to a mistake. 4. #### Re: Nostalgia ain't what it used to be... The old mainframes were all about efficiency. Those 100 users would be using green screens in block mode - one network transfer of a few hundred bytes to generate a transaction, not using a web interface with the massive overhead of https and all the crappy scripting that the application GUI has been laden down with. Same with IO - no adhoc query capability and direct ISAM access is way faster (for the very specific task required) than an, almost always badly written, generic database application. 1. #### Re: Nostalgia ain't what it used to be... In the late 80s we supported about 2,000 online (CICS and TSO) users plus 10,000+ email users (PROFS) on a uniprocessor IBM 3090 with 32Mb memory - plus all the usual batch, payroll, accounting etc. 2. #### Re: Nostalgia ain't what it used to be... Being involved in data entry, I have long wondered why I could get more performance out of a PDP11/70 than a Sun Enterprise server with 32 virtual CPUs.<p> After a careful investigation I discovered that, while a picture is allegedly worth 1,000 words, it probably consumes something closer to 64k words, and delivers the equivalent of one word. After removing all the Icons from my data entry screens, and replacing them with a single (bold) word, they now go "faster than a speeding PDP11".<p> Icon - no you can't! (Its behind you). 3. #### Re: Nostalgia ain't what it used to be... If you concentrate on the data and not the GUI/'marketing' then its amazing what you can do with stuff. 
I used to help write and run a web site and back end for 30,000 customers and 300 in house PCs on 3 hefty Windows Pentium boxes and a big IBM box for the ERP DB in the late 90s. Out of interest I pissed about on a PiZeroW and found I could run the web and DB on it with similar latency to the Windows/IBM setup for similar user loads, for up to 100 customers using the system, at which point memory started to run out. I don't think we ever had that many customers online at once on the old setup. Not a perfect comparison but it did make me wonder WTF my own desktop is doing with all that CPU it seems to use!

1. #### Re: Nostalgia ain't what it used to be...

Most of the time my home desktop is doing almost nothing, typical workloads being pretty trivial to today's hardware. The last few applications I've written for home use have been severely IO limited, and not CPU or memory limited. The CPU only really gets to stretch its legs when gaming or encoding video, the latter task it does happily at 4k resolution in roughly real-time.

5. #### Re: Nostalgia ain't what it used to be...

"Things worked more efficiently back then," he added.

Would have helped if some of that efficiency had been put towards detecting input errors and range checks. ;)

1. #### Re: Nostalgia ain't what it used to be...

"Would have helped if some of that efficiency had been put towards detecting input errors and range checks. ;)"

It was. That's part of what made it so efficient.

10. This type of error is easily mitigated by Paranoid Programming - code the loop back condition test as <= rather than !=. It's a long time since I did BAL, but I don't think there would be a speed penalty incurred on the comparison.

1. Or perhaps asking confirmation, with an extra indication/warning on the number of jobs that were going to be deleted

11. #### That was a pretty impressive setup

The specs on that machine were pretty awesome for the time. Lucky lads for working on such a beast :).

and <pedant> "Flak" not "Flack".
"Flak" is a German abbreviation for anti-aircraft weapons or fire. "Flack" is a shit UK television show. Your subbie deserves some flak for missing that. </pedant> 1. #### Re: That was a pretty impressive setup Actually flack is perfectly acceptable as an alternative spelling of flak these days according to both the Cambridge and Oxford dictionaries. 1. #### Re: That was a pretty impressive setup Well, they're wrong then. 1. #### Re: That was a pretty impressive setup Ve kould teach zem a lesson vith ze business end ov einer 88mm Ausfuhrung 36... That will teach them. 2. #### Re: That was a pretty impressive setup Yeah, but it is still wrong ;) - for a given value of wrong. 3. #### Re: flack is perfectly acceptable Not to me in this context. Rooted in spelling ignorance. Perfectly acceptable when talking about Roberta (other Flacks are available). And don't get me started on underway or (dis)orientated... 4. #### Re: That was a pretty impressive setup Actually flack is perfectly acceptable as an alternative spelling of flak these days according to both the Cambridge and Oxford dictionaries. Dictionaries are descriptive, not prescriptive. A dictionary may tell you that some word is used with a certain spelling and a certain meaning so that you will be able to understand that word when you encounter it. The dictionary does not thereby endorse that usage, and you should not assume that you can do the same without world+dog thinking you're an idiot ... maybe you can, but there's no guarantee. There is no 'c' in "Flugzeugabwehrkanone" (or "Flugabwehrkanone", or Fliegerabwehrkanone", or any of the other variants you may find); the conventional abbreviation for it is "Flak". A lot of English-speaking people don't know the etymology of Flak, and think it should be spelt "flack". Most people understand them ... that does not in itself make the usage acceptable. "Flack" might be viewed as an Anglicization of the German abbreviation, which might make it OK. 
Ultimately, it's a matter of taste. 1. #### Re: That was a pretty impressive setup and think it should be spelt "flack". Spelt flak or spelled flak? icon --------> Ultimately, it's a matter of taste. I do like a nice spelt loaf :-) 5. #### Re: That was a pretty impressive setup Supposedly, from name of Gene Flack, a movie agent. According to that infallible source, the internet. 12. I think I've missed something here...if the protagonist deleted all the print jobs, why were the printers suddenly going beserk? I thought there would have been a deafening silence? 1. Because back then the logs were dumped to a dedicated printer and it was suddenly killing off thousands of jobs and printing a line for each one on a fast, loud dot matrix or daisywheel printer (I suspect the former). 1. It would have been a line printer, and hell of a lot louder than a dot matrix! 1. An impact printer of some sort. A drum printer, band-printer or chain printer --- daisy wheels were late innovations for low-volume printers, and dot-matrix printers came even later. Early dot-matrix printers were very loud ---- by earlier chain printers were much louder! 2. Yep the console log was usually printed out in real time in those days! 13. #### thin fingers, small brain I remember working with someone on z/VM where they had carefully created synonyms: E for Edit, BR for Browse, and ER for erase. So the sequence was BR to browse the file, it is the right file, retrieve the command and overtype E for edit - whoops the file has gone as the command was now ER .. When I pointed this out to him he said he wondered why half his files kept going missing every day. I helped him remove his ER synonym, and removed his ability to write to common disks. 1. #### Re: thin fingers, small brain Don't mix one and two character synonyms, too likely to screw up (as you just showed). Edit should have ED as synonym and Erase should get either DL (for DeLete) or RM (for ReMove). 2. 
#### Re: thin fingers, small brain

The issue is still around in the crontab command:

crontab -e (edit the crontab)

crontab -r (remove the crontab)

Luckily, I had previously done crontab -l, so I could reconstruct it with copy and paste from the scrollback buffer.

1. #### Re: thin fingers, small brain

I never remove a crontab, I comment out the entry. Because what was needed once is very likely to be needed in the future, so I better keep it around.

1. #### Re: thin fingers, small brain

Or you could keep it under version control.

14. #### I had a narrow escape once

I was working on my BSc thesis project at an Italian observatory in Switzerland, testing a new cryogenic IR spectrograph. First order of business every night was to refill the liquid nitrogen and helium reservoirs in the dewar container of the instrument, before heading off to the observing control room. After that we needed to enter the coordinates of the object to observe (right ascension and declination, or RA and DEC for short). Most of the objects were northern hemisphere objects, and were perfectly safe to observe, but one object was below the equator, and there were limits to which we were allowed to point the telescope downwards towards the horizon, in part due to the mechanics of the instrument, but also due to the fact that we might actually pour all the liquid nitrogen and helium out of the instrument, which wouldn't be a particularly good idea either. The engineer who built the instrument had worked out that this object, at DEC = -6 degrees plus a bit, was just about at the limit of the specs of the instrument, but it should be safe.

I duly entered the coordinates, and the system replied with the coordinates entered, with the sensible question "Is this OK?" Until that fateful southern hemisphere object, I had always checked, found the data to be correct, and entered "Y", to which the system responded with a cheery "Then I go!", and pointed the scope at the desired object.
This time I noticed I had accidentally entered DEC = -16 degrees plus a bit. Therefore I entered "N", and was horrified to get the response "Then I go!". I turned to one of the Italians on duty to ask how the hell I could stop this, and why the hell the program had ignored my "N". He replied that the system would accept essentially all input as "Yes", except Ctrl-D. The only option to stop it after the "Then I go!" was to enter new coordinates and press "Y" (or any key that took your fancy).

We rushed upstairs to the telescope, and found that the extra 10 degree rotation hadn't damaged anything, but I felt rather shaken that I might have trashed a several million guilder (at the time) instrument, let alone several years' work, all because of, let us say, substandard UI design.

1. #### Re: I had a narrow escape once

ALWAYS mount a scratch monkey!

(programming errors are MUCH more impactful in embedded control systems)

2. #### Re: I had a narrow escape once

Good story!

Ah, yes, the UI convention of "Y means Yes, N also means Yes, and a gunshot wound to the thigh means No".

...later rewritten as a result of the "No means No" campaign.

1. #### Re: I had a narrow escape once

With the recent example of "Do you want to upgrade to Windows 10?" showing that this is still the convention...

3. #### Re: I had a narrow escape once

At work we've got a build system used to build the delivery artifacts, like signed msi files which will be delivered to the clients. It's usually a long process, which needs to be supervised since it's notoriously flaky. And then the build process will ask you this: 'File over 250 Mb, do you want to discard [yes/no]'. I think that every team member got burned at least once by answering yes, losing the build result and having to restart.

15. #### A hyphen is not a minus sign.

When will we learn. Support for UFT-8 would have allowed the user to express their intent more precisely and this nonsense wouldn’t have happened. AppleScript gets it right.
<\troll> 1. #### Re: A hyphen is not a minus sign. UTF-8, surely?! Damned sausage finger mistypes... If only there were a way to save us from ourselves?? ;-) 1. #### Re: A hyphen is not a minus sign. This is known as Muphry's Law (with that exact spelling) in some fileds. 2. #### Re: A hyphen is not a minus sign. <\troll> What does &lt; tab "roll" &gt; do? Is it some kind of AppleScript thing? 16. Been on the receiving end of a few such calls. One was someone ringing the support line to say their database wasn't up. I asked them to check the database directory - it was empty. They had seen the disk was a bit full, saw some large files, and did an rm -rf. Poor lady was almost crying down the phone - no backups for 4 days. The other was someone who managed to do an rm -rf from '/' but not as root. Not that it matters, because that can bork a machine just as much as completely deleting everything. Worse still, he did it twice! The first time we left the server (Linux) still running - we didn't dare turn it off or reboot it until we could work out what we were going to do. The second time.... :) 1. #### Back in the day... I do recall a colleague (who, shall we say, was not the most technically inclined of people) getting very confused about the fact that their code was causing a database table to mysteriously empty - after each run, he couldn't find any of his test data in there. Thankfully, this was just in the dev environment, so we could restore his system back to a known good state. Much head-scratching later, it transpired that all the records were still in the table. However, he'd accidentally terminated his SQL command early, so instead of running "UPDATE x SET y WHERE z", he'd just ran "UPDATE x SET y". The result? a few thousand records, all with the same ID.... I've seen it happen since, sometimes with far more of an impact (hello, production system!). 
At least this first lesson in the importance of sanity-checking and transactional integrity was a humorous one!

17. #### booting single user

"The answer was not good: "Even a reboot (IPL in those days) would not fix the problem." The command had been recorded and would resume as soon the computer was restarted. There was no going back."

I suppose this kind of situation is why the Gods of Unix have implemented the concept of single-user boot ... In this case, booting single-user with no batch system would have allowed careful emergency maintenance of the batch system ... But I indeed don't think this is possible on iSeries ...

1. #### Re: booting single user

Single boot on iSeries is possible (with the appropriate login credentials at the terminal) but would not have been necessary in this case as an IPL terminates all jobs. Of course there are autostart jobs which may do a restart of some processing, but those can be killed by any qualified (and authorised) operator.

18. #### I love QNAP

I was called out to a remote site / 4 hour drive from HQ. Super secret stuff being done on said site and no remote access etc. After faffing about because my security clearance for the building and data being processed was OK, but it was inside a building site and I had not done an induction, I spent the night in a (rather shitty) hotel. 4 hour induction the following morning to walk 100 meters along a road with an escort, total overkill. The escort had instructions to stay with me while on site but he didn't have clearance to enter the building so he had to wait outside in the rain. Anyway the issue was that someone stuck a pen in the reset button on the QNAP NAS because they 'couldn't log in to the Web interface and wanted to reset the password'. I just thank the gods that it had external USB HDD backup and backup to one of the several client PCs. No data lost and the reset button was disabled.
I logged it under hardware failure and recommended that all pointy objects were confiscated from our offices just in case. 1. #### Re: I love QNAP "I logged it under hardware failure" This is the sort of situation where callouts should be charged to the department at fault with a full explanation of the reason given. 19. #### JES2 purge commands by (cheap) useless idiot. About 3 or 4 years ago millions of lines of current output disappeared off our JES2 spool (the commands in the article are JES2 purge commands) - upon investigation I tracked the command to an "operator" in a certain sub-continent (the staff who replaced our really good "expensive" operators) I asked what he was doing - apparently he was "practising his JES2 commands" - on a live system operated by a multi billion pound organisation. In particular he was practising the age related purge - he was teaching himself how to purge everything over 2 days old but he clearly couldn't understand the difference between <2 and >2 - that said - what the hell was he doing taking it on himself to purge everything over 2 days old? - We keep 120 days worth of output on the spool! (obviously crucial stuff is spooled off to an archiver) and the spool was only 45% full. He was just playing. Of course, excuses were made and this particular idiot got off scot-free and is still at large, meanwhile many excellent experienced operators sit on the dole or are in forced-retirement (two suicides I know of by dumped 50 somethings who are too old to retrain and can't support their household on minimum wage jobs.) It's only a matter of time before one of these doing-the-needful idiots brings a blue-chip company to its knees. When it happens, will the fat cats hand back their bonuses for "saving money"? Answers on the back of a P45... 1. #### Re: JES2 purge commands by (cheap) useless idiot. 
"Of course, excuses were made and this particular idiot got off scot-free and is still at large" Consider yourself lucky if he ever confessed his sins. I've worked with off-shore countries, where, in similar cases, even being confronted a formal proof, most of the staff would vigorously deny all and would get away with it safely (HR being their country's HR). I may even send a story about that for on call. 20. #### "£PJ12345--12349" I don't know if it was deliberate or not, but on one or my browsers, on one of my machines, that line was actually hyphenated at the hyphen: ~ ~ ~"£PJ12345- -12349" ~ ~ ~ ...which made it even more difficult to understand. 21. "....HASP had been told to purge jobs 12345 to -12349, and was determined to do so." I lol'd at that "...most of the rage was directed at Big Blue, "who had allowed a simple mis-type… to have such big consequences". I lol'd even more at that! Big Badda B00M! ---> 22. At the same way-up-in-northern-Canada data centre where we once blew up and /roasted a ham hock with a 400 kilovolt jolt of electricity, we also ran an obscure operating system for a financial services server system that was a combination of DOS and *nix that had an INTERPRETED scripting language based upon BASIC which we used to perform numerous "Delete Old Batch Job Tasks" on an automated basis. Normally, one should delete a job via it's actual STRING NAME and NOT it's process identifier which was a number from 1025 to 65535 which was re-used when the process IDs eventually wrapped around. AND normally after you delete a job, you put in a pause or delay command giving you enough time time to do a Control-S (Pause) or Control-C (break with prompt) if necessary to break out of a script. 
UNFORTUNATELY, back in the day, this particular OS was prone to race conditions where a SINGLE batch job could take over an entire system, process IDs got re-used, and it would never give up control unless a hard electrical shutdown was completed AND the offending script was deleted and/or the hard drive it was on was REMOVED from the storage system.

SO IN SHORT ..... a QUARTER BILLION DOLLARS ($250 Million US) of financial transactions went to hell for 10 hours because of some batch code much like this:

10: REM Set Process Delay time

20: for i=1 to 2

30: REM loop through and Set job id number

40: for x=1025 to 65536

50: deleteJob( x )

60: REM: Delay in seconds to allow enough reconciliation data flush-to-disk time before deleting next job

70: pause( i )

80: Next

90: Next

100: goto 10

Because instead of the pause being 1 to 20 seconds, or even milliseconds, on THIS SYSTEM the pause command took CLOCK TICKS as its parameter. The outer for-next loop was SUPPOSED to be a simple ASSIGNMENT statement, but it had simply been cut and pasted over from another batch job script file, and someone FORGOT that a one-second delay required a command like: Pause ( 1000000 ) !!! AND there was no final break out of the code, so it looped forever!
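For anyone who wants to see the three fixes side by side, here is a rough Python sketch of what the script was apparently meant to do. It is not the original code and not that OS's API: `delete_job` and `pause` are hypothetical stand-ins passed in by the caller, the `TICKS_PER_SECOND` value is only inferred from the `Pause ( 1000000 )` remark above, and deleting by job name rather than by reusable process ID follows the advice earlier in this comment.

```python
# Assumption: one second is about 1,000,000 clock ticks, inferred from
# the "Pause ( 1000000 )" remark. The real tick rate is not known.
TICKS_PER_SECOND = 1_000_000


def seconds_to_ticks(seconds):
    """Convert a delay in seconds to the tick count the pause call expects."""
    return int(round(seconds * TICKS_PER_SECOND))


def delete_old_jobs(job_names, delete_job, pause, delay_seconds=1.0):
    """Delete each named job exactly once, pausing between deletions.

    delete_job and pause are hypothetical callables standing in for the
    OS calls; pause receives its delay in clock ticks.
    """
    # Fix 1: the delay is a plain assignment, not an outer loop.
    delay_ticks = seconds_to_ticks(delay_seconds)
    deleted = []
    # Fix 2: jobs are addressed by name, not by a reusable process ID.
    for name in job_names:
        delete_job(name)
        deleted.append(name)
        pause(delay_ticks)
    # Fix 3: a single bounded pass, with no jump back to the top.
    return deleted


# Dry run with recording stubs instead of real OS calls.
delete_log, pause_log = [], []
result = delete_old_jobs(["RECON-0001", "RECON-0002"],
                         delete_log.append, pause_log.append)
```

Passing the two calls in as parameters means the whole thing can be dry-run against stubs before it is ever pointed at a live job queue.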

Now imagine 64000+ financial stock trade transaction reconciliations being deleted in a race condition AND looped around again and again WITH NO WAY to interrupt it because NO OTHER keyboard or other input (i.e. Control-C break or pause command) could be interpreted because the time delay was in clock ticks (i.e. waaaay too short of a delay) AND the priority of the task was set at such a high level the system became bogged down and even a hardware reset would still cause a RELOAD of the script and run it at OS System-level priority and continue and continue forever!

Because the delay between job deletes was way too short, and the set-process-ID job code looped forever, the data flush-to-disk functions were never able to be called on the internal account reconciliation code, so MANY hundreds of millions of dollars of trades were never properly recorded and sent to New York, London, HK and Toronto for payout. Because the margins on such trades were time-sensitive, any delay in account reconciliation was profit-destroying!

In the end, we had to actually CUT (not unplug!) the ribbon cable from the system to the offending hard drive to get a proper system shutdown and go get an offsite SYSTEM OS IMAGE BACKUP which was a 4 hour drive away into town from a remote wilderness data centre location, get over to a locked-up safe to retrieve the second system OS disk, drive all the way back another 4 hours, re-install and turn the systems back on to re-run the job with the batch code fixed properly to then find out later the next morning that ten hour delay in reconciliations cost over a Quarter Billion US dollars to the companies involved.

I should note that NOBODY got fired in the end, even after a thorough investigation by external parties because it was deemed a HARDWARE and OPERATING SYSTEM failure point rather than the "System Operator Error" it actually was!

Still, to be part of a team that was kinda the CAUSE of a Quarter of a Billion U.S. Dollars of someone else's money being flushed down the toilet IS STILL AWE INSPIRING !!!!

23. Solaris "killall" and linux "killall" behave in vastly different ways. Solaris takes the meaning of the command quite literally. Found out the hard way.

1. Try killall5 on Linux. Or just read the man page.

2. I used to have a nifty little task management utility in Windows. It was also potentially dangerous, as I found out one day when I accidentally used it to kill winlogon.exe on my own NT box, and watched as the machine bluescreened without saving the document I'd been working on.

I've learned from that, and now have a machine dedicated to testing. It's handy to have a machine I can wipe without worrying about lost data.

24. Way back in the early 90s, when I'd just left school, I decided to follow in my father's footsteps and be a Freight Forwarding clerk. The company I worked for was a small company, and, TBH, it was a boring job.

HMRC had just switched to a then state of the art electronic submission system for paperwork that required we use specially designed software on the one IBM AT in the office. Unfortunately, as we had no one with real IT experience working for the company (I was a keen hobbyist, but I had only been out of school a couple of years), we had a backup system, but it had not been tested, probably ever.

Every Friday, my boss cleared old jobs off the system to free up space. The software required the user to enter the record number manually, but could work on a range of records. So, you could enter the start record as 12000, and the end record as 12010. My boss, one day, entered the start and end the wrong way round. The software didn't check for that, and happily deleted over a year's worth of records. HMRC required we keep the last 5 years of data on the system. They had dial in access to the computer, and would check it regularly.
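The missing check is small. Below is a minimal sketch in Python, under stated assumptions: the function name and the `max_span` safety limit are hypothetical, and nothing here reflects how the real software was written or how it interpreted a reversed range. It only shows the kind of validation that refuses a range before anything is deleted.

```python
def records_to_delete(start, end, max_span=100):
    """Return the record numbers in [start, end], refusing suspicious ranges.

    max_span is a hypothetical safety limit, not a feature of the original software.
    """
    if start < 0 or end < 0:
        raise ValueError("record numbers must be non-negative")
    if start > end:
        # The case from the story: start and end entered the wrong way round.
        raise ValueError(f"start record {start} is after end record {end}")
    span = end - start + 1
    if span > max_span:
        raise ValueError(f"range covers {span} records; limit is {max_span}")
    return list(range(start, end + 1))
```

With this in place, `records_to_delete(12000, 12010)` yields eleven record numbers, while `records_to_delete(12010, 12000)` raises an error instead of deleting anything.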

So, my boss noticed all these records had gone missing, and, knowing I was a keen computer hobbyist, asked me to fix it. I tried the backup, and it didn't work. He phoned the supplier, and spent a couple of hours on the phone shouting at some poor tech support guy on the other end of the phone. Nothing they suggested worked.

The final solution? I had to go through every single shipment we had exported in the year that was lost, and re-enter the data from our paper files. It took about 2 months. Thankfully, for me, they made me redundant (although sadly I hadn't been there long enough for a payout), and the company itself didn't last long after I left. Obviously, I am not really thankful for that. People I liked lost their jobs, and that is never good. It was good for me, because while I had various other jobs that I didn't really enjoy and didn't last long, it ultimately got me thinking, and persuaded me to go to University (which is something I was resisting), and enabled me to start the career in tech support I have, which is something I enjoy.

25. #### Database woes...

A long(ish) time ago, in the late 90s/early 00s I was working for a company where I had spent a fair few years doing coding on a PC based Retail Point of Sale system.

We also provided servicedesk services for a variety of retail chains in the UK and used a fairly low-to-mid-range system, called Heat, to log calls for all the different customers.

I was given the task of seeing if we could make it multi-tenant so that calls could be easily logged and managed for each customer using categories, SLAs, and stuff like that tailored for each customer and to make it easy to switch between different customers within the system when taking calls.

After a bit of time playing with the out-of-the-box customisation options, which were mostly around tweaking fields on forms and setting up system-wide categories, SLAs and escalation routes, I ended up implementing a whole load of functionality using triggers and stored procedures on the back end database, which was SQL Server 7.0.

I had never even clapped eyes on SQL Server before this - The PoS system I had worked on used a flat file back end database and I had no SQL Server training (of course!) so I pretty much learned on the hoof and had no support from the people who had originally installed the system.

To cut a long story short, I accidentally truncated the core call reference table that everything else hung off pretty much at the busiest time of the day! There were no Pk/Fk relationships within the database to stop that happening - all the relationships were handled within the software itself.
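To illustrate the point about missing Pk/Fk relationships: SQL Server refuses `TRUNCATE TABLE` on a table that is referenced by a foreign key constraint. The sketch below shows the same kind of protection using Python's built-in `sqlite3` module as a stand-in. This is an analogy, not the Heat schema: the table names are hypothetical, and SQLite has no `TRUNCATE`, so an unqualified `DELETE` plays that role.

```python
import sqlite3


def try_wipe_parent():
    """Try to empty a parent table that child rows still reference."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical tables: a core call table and a child table hanging off it.
    conn.execute("CREATE TABLE calls (call_ref INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE call_notes ("
        " note_id INTEGER PRIMARY KEY,"
        " call_ref INTEGER NOT NULL REFERENCES calls(call_ref))"
    )
    conn.execute("INSERT INTO calls VALUES (1)")
    conn.execute("INSERT INTO call_notes VALUES (10, 1)")
    try:
        # SQLite has no TRUNCATE; an unqualified DELETE is the nearest analogue.
        conn.execute("DELETE FROM calls")
        return "deleted"
    except sqlite3.IntegrityError as exc:
        return f"blocked: {exc}"
    finally:
        conn.close()
```

With the constraint declared, the wipe is rejected by the database itself rather than relying on the application to keep the relationships intact.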

I couldn't just restore the database without kicking everyone out of the system and causing a fair amount of TITSUP* so, after about 30 seconds of wild panic, I ended up restoring to another database everything up to the last transaction log backup, which I had thankfully enabled a couple of weeks previously after discovering the woeful backup strategy that had been left by the original installers, and copied the restored data back over to the live database.

All done in 15 minutes flat! Hardly anyone even noticed there had been a problem and I found a dark corner to go and recover my sanity!

Needless to say, I was much more careful about using the truncate command after that....

On the plus side, I moved to a better paid DBA job in the finance sector a year or so later off the back of that project!

* Total Inability To Support Usual PoS Problems

1. #### Re: Database woes...

Why, when I know it stands for "Point of Sale" do I read "PoS" as "Piece of Shit"?

Anyhoo, as I've said before, we used to maintain our own system that listed all the equipment we had. We designed the various interfaces (web and Java) and backend service, and database.

All of a sudden one day, the system refused to do anything. We couldn't loan out equipment, check it back in, run any reports or maintain the inventory. Everything just generated errors. My colleague took the system offline, and started going through the logs. Apparently, the system's central transactions table (which it used as an audit trail, as well as to store what was checked out to users) had vanished.

Rather worried, my colleague loaded up SQL Server Management Studio and started looking for the table. Somehow, the table had been renamed to a full stop. Now, there were four people who had access to that database at the time: me, my two colleagues who helped design the system (all three of us designed different aspects) and the department DBA. Everyone denied renaming the transaction table, but I suspect my other colleague. I didn't do it (and had been busy elsewhere all morning, so had an alibi). The DBA wouldn't have done it, because he had no reason to be doing anything other than looking at the table. My colleague who fixed it would have been the one called to fix it, so I doubt he'd be looking to do something that generates work for him, which leaves my other colleague.

26. #### In many languages, --1 == +1

With Python (and probably quite a few other languages), '--' means '+':

$ python3

>>> 1 -- 1

2

$ tex \\relax

This is TeX, Version 3.14159265 (TeX Live 2019) (preloaded format=tex)

*\count0 = --1

*\showthe\count0

> 1.
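The Python half of this can be checked by looking at how the expression is parsed. Python has no `--` operator: the first hyphen is a binary minus and the second is a unary minus applied to the right-hand operand. The helper below is a small sketch using the standard `ast` module; the function name `describe` is made up for the illustration.

```python
import ast


def describe(expr):
    """Return (binary operator, right operand node type, value) for an expression."""
    node = ast.parse(expr, mode="eval").body
    # eval is acceptable here only because the inputs are fixed literals.
    return type(node.op).__name__, type(node.right).__name__, eval(expr)


print(describe("1 -- 1"))   # binary Sub whose right operand is a unary minus
print(describe("1 - -1"))   # same parse, just spaced differently
```

Both calls report a `Sub` whose right-hand side is a `UnaryOp`, which is why the result is 2 rather than an error: much the same reading that turned "12345--12349" into a range ending at -12349.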

Biting the hand that feeds IT © 1998–2021