Config cockup leaves Reg reader reaching for the phone

Facebook went down and Twitch flashed its privates last week thanks to alleged config cockups. However, who among us has not suffered the stomach-dropping fear that follows the ill-advised submission of a seemingly innocuous command? Welcome back to Who, Me?, The Reg's weekly column that recounts personal tales of catastrophe …

  1. tip pc Silver badge

    Cisco commands are live too

    The SSR differed from Cisco hardware in a number of ways. Perhaps the most important was how configuration worked. "When changes were applied to the command line they were applied directly," explained Kildare, "rather than being held in a buffer and applied with a second command."

    Cisco commands go live once entered.

    A good routine is to build your config in a text editor

    Then save the running config on the device and a copy on your pc

    Then reload in [x minutes]

    Then paste your commands

    If you make a mistake and lock yourself out then it’ll reboot and get you back to where you were before you started.
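
The routine above amounts to arming an automatic rollback before making the change, and disarming it only once access has been confirmed. A minimal Python sketch of that pattern follows. It is not Cisco tooling: `apply_change`, `rollback`, `still_reachable` and the 300-second default are all illustrative assumptions.

```python
import threading


def guarded_change(apply_change, rollback, still_reachable, timeout_s=300.0):
    """Apply a change with an automatic rollback already pending.

    The timer is armed *before* the change is made, in the same spirit as
    scheduling a reload before pasting commands. All three callables are
    hypothetical stand-ins supplied by the caller, and rollback should be
    safe to call more than once.
    """
    timer = threading.Timer(timeout_s, rollback)
    timer.daemon = True
    timer.start()            # arm the safety net first
    apply_change()           # the risky step
    if still_reachable():    # if this hangs, the timer fires on its own
        timer.cancel()       # confirmed: keep the change
        return True
    timer.cancel()           # known bad: do not wait, roll back now
    rollback()
    return False
```

If the check never returns because the operator is locked out, the timer still fires in the background, which is the whole point of setting it first.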

    An out of band network (dial up, mobile or xdsl) is invaluable when working on remote systems.

    We all (ok most of us) learn from our squeaky moments!!

    1. Anonymous Coward

      Re: Cisco commands are live too

      Yep - Cisco IOS, CatOS etc were all live for configuration.

      Perhaps the author has only used IOS XR on Cisco devices, where, as on JunOS, config is entered and then committed later, giving you a chance to check things before you do.

      Forgetting to use commit confirmed on JunOS or IOS XR can still lead to the same embarrassment with locked out devices etc. With the added trick that on one OS you give the delay before reverting in seconds and on the other in minutes: commit confirmed 300 - what do you mean it will revert in 3 hours!

      Memories of an issue with a commit on JunOS and the ensuing BGP chaos are probably one for my own Who Me (no this didn't relate to Facebook).

      1. PerlyKing Silver badge
        Coat

        Re: commit confirmed 300 - what do you mean it will revert in 3 hours!

        These will be the new decimal hours then?

    2. The Oncoming Scorn Silver badge
      Pint

      Re: Cisco commands are live too

      Had a series of machines that needed to flush dns, reset winsock & release the ip address, then reboot.

      I had to drop a created batch file on the remote machine & execute via PSExec, having learned via a hangover (Icon) that morning, that the moment I released the IP, the machine was off the network.

      1. emfiliane

        Re: Cisco commands are live too

        Oof. I did something similar, and not even remotely: it was a script to disable and re-enable the network interface, because AT&T's USB dongle is crap and would regularly stop passing traffic until it was reinitialized. At the critical line, I put "$d | disable & enable". But while it worked on mine, somehow, they did not like being chained at all on the remote system. And it turned out that this shut it off permanently, too.

        Onsite IT was just a guy who knew how to reboot a PC, so, cue up a 150 mile drive each way, all for want of a "$d | enable."

    3. Nick Ryan Silver badge

      Re: Cisco commands are live too

      The scenarios that can bite a long time after are those that can really be hard to track down as a result.

      For example, the difference between live and stored configuration. Make changes to the live configuration and test to make sure that it works. Then forget to save the running configuration. Nothing bad happens until an unspecified period of time later and the system is restarted for whatever reason and it reverts back to the previous, stored configuration and things stop working.
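
One way to catch the forgotten save is to compare the running and stored configurations before walking away. The sketch below assumes both have already been fetched as plain text by some means not shown here; the function name and the sample config lines are made up for illustration.

```python
import difflib


def unsaved_changes(running, stored):
    """Return unified-diff lines for what is running but not yet stored.

    An empty list means a restart would bring back the same configuration.
    """
    return list(difflib.unified_diff(
        stored.splitlines(), running.splitlines(),
        fromfile="stored", tofile="running", lineterm=""))


if __name__ == "__main__":
    stored = "hostname r1\ninterface e0\n"
    running = "hostname r1\ninterface e0\n ip address 10.0.0.1\n"
    for line in unsaved_changes(running, stored):
        print(line)
```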

  2. Roger Kynaston Silver badge
    Facepalm

    Sun kit

    newfs /dev/c1t0d0s4

    But I actually typed

    newfs /dev/c0t0d0s4

    Cue the /usr slice disappearing and the server keeling over in slow motion.

    Of course, I was being a twit with dividing the os into so many partitions when it didn't need it as well but I thought I was being clever.

    1. Bruce Ordway

      Re: Sun kit

      "but I actually typed.... "

      "....learn(ed) to re-read everything VERY carefully before hitting the enter key"

      Yes, my first cockup was many years ago and made a big impression on me.

      Enough that I will slow "way down" and re-read commands multiple times before hitting that enter key.

      I should probably review my emails/comments more before committing too.

      As I'm routinely surprised by the differences between what I wrote and what I intended.

      1. Mark 85 Silver badge

        Re: Sun kit

        As I'm routinely surprised by the differences between what I wrote and what I intended.

        I think that's normal. Happens to everyone whether writing commands, school papers, or even comments to El Reg. Fat fingering a keyboard and not catching it is pretty normal. I learned to have someone take a look at anything in the way of scripts or programs that involved the OS.

      2. Roger Kynaston Silver badge

        Re: Sun kit

        Too true. On rereading this I see that what I wrote here would not have worked anyway.

        /dev/c1t0d0s4 would never have done anything

        /dev/rdsk/c1t0d0s4 on the other hand ...

        What will it be like when we have a direct interface between our brain and the machine? A random quick thought (sod this for a game of monkeys - get rid of the bloody thing) and the shiny new is blown away.

    2. Alan Brown Silver badge

      Re: Sun kit

      "Of course, I was being a twit with dividing the os into so many partitions when it didn't need it as well but I thought I was being clever."

      It was still recommended action long after drives with automagic sector remapping were the norm - everyone had their own magic incantation

      Telling greybeards not to do it would get you treated as if you'd blasphemed

  3. GlenP Silver badge

    Me too...

    This hack's moment of arse-swooping horror involved carelessly highlighting an UPDATE SQL statement and missing the WHERE clause before hitting Ctrl-e.

    Been there, done that, recovered database from backup and crafted a new update statement.

    And that is why we almost never, ever, use UPDATE on the production databases.

    1. LDS Silver badge

      Re: Me too...

      Transactions are your friends.... and never use autocommit.... and always wait a little to type commit <enter> - verify everything is as expected.

      1. W.S.Gosset Silver badge

        Re: Me too...

        Yup, plus:

        Write EVERY predicate as a Select. Only when you're happy with the result, do you twiddle a word or 2 to make it Update or Delete.

        Another good belts&braceses move: after the Select's settled but before you twiddle it to DML, rerun the Select prepended with "CREATE TABLE TMPSafety_timestamp_tblname AS ". Any catastrophe, or even just concerns? You've got a hot backup right there.

        1. W.S.Gosset Silver badge

          Re: Me too...

          (Looking at some other posts below, looks like people are talking about prepping the where clause first, in some sort of gui window.

          To be clear, I mean you really need to actually DO the Select. You need to run that query and you need to grovel that data. You may discover semantic mistakes, you may realise oversight (*forehead slap*), you can even discover bugs in the engine's code (Sybase would intermittently hit queryplan code that threw cartesian products, for example, needing you to alter the order of your query's tablelist). Whatever. But you need to actually physically look at that dataset before moving on to modifying it w/Update or Delete.

          )
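
The advice in this sub-thread (run the predicate as a SELECT, keep a copy of the affected rows, then change them inside a transaction) can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the helper name, the `tmpsafety_` naming scheme and the row-count check are assumptions, and the SQL fragments are expected to be hand-written by the operator, never taken from user input.

```python
import sqlite3
import time


def careful_update(conn, table, set_sql, where_sql):
    """Select first, snapshot the matching rows, then update and verify.

    Returns the rows that matched and the name of the backup table.
    """
    stamp = time.strftime("%Y%m%d%H%M%S")
    backup = f"tmpsafety_{stamp}_{table}"
    # 1. Run the predicate as a SELECT. In real use, stop here and read it.
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE {where_sql}").fetchall()
    # 2. Keep a copy of exactly those rows before touching them.
    conn.execute(
        f"CREATE TABLE {backup} AS SELECT * FROM {table} WHERE {where_sql}")
    # 3. The UPDATE opens a transaction implicitly in sqlite3's default
    #    mode; commit only if it touched as many rows as the SELECT found.
    cur = conn.execute(f"UPDATE {table} SET {set_sql} WHERE {where_sql}")
    if cur.rowcount != len(rows):
        conn.rollback()
        raise RuntimeError("row count mismatch, update rolled back")
    conn.commit()
    return rows, backup
```

The backup table is created before the transaction starts, so it survives a rollback of the update.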

    2. dr john

      Re: Me too...

      Missing the WHERE clause - me too...

      Every member of a club could no longer access their online accounts. Apart from Anne whose new data had replaced everyone else's login data....

      After the first few had reported the same problem, I had to search through LOTS of old emails from every member to get their correct details back in place. Not a five minute job. And I hadn't activated the database backup option on the web server. Did that very quickly afterwards.

      1. jmch Silver badge
        Thumb Up

        Re: Me too...

        One of the reasons I love RedGate, among the many other nifty SQL shortcuts and aids, is a popup that intercepts UPDATE / DELETE statements without WHERE clauses and asks 'are you sure?'

        1. I Am Spartacus
          Thumb Up

          Upvote for Redgate

          Saved me more than once

      2. Robert Carnegie Silver badge

        Re: Me too...

        Do you currently work for a security training company which gave me a personal account with, as it turns out, a user name of (Female colleague's forename) (My surname)?

        If so, I was told to wait for it to be fixed, so I'd like it to be fixed, thanks.

        I'm suspecting that I have not been given (Female colleague's forename) especially, probably everybody has been named (Female colleague's surname).

    3. Dave Pickles

      Re: Me too...

      ISTR Informix wouldn't allow UPDATE or DELETE without a WHERE clause. Never saw it happen but I can imagine muscle-memory adding WHERE 1=1...
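
A check of the kind described in this sub-thread can be approximated in a few lines. The sketch below is deliberately naive and is not how any particular product does it: it only looks for the WHERE keyword, so it cannot tell a useful predicate from WHERE 1=1. The function name is invented for the example.

```python
import re

_DML = re.compile(r"^\s*(update|delete)\b", re.IGNORECASE)
_WHERE = re.compile(r"\bwhere\b", re.IGNORECASE)


def needs_confirmation(statement):
    """True for an UPDATE or DELETE with no WHERE keyword anywhere in it.

    This does not parse SQL: a WHERE inside a string literal or subquery
    will fool it, and a muscle-memory WHERE 1=1 passes untouched.
    """
    return bool(_DML.match(statement)) and not _WHERE.search(statement)
```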

    4. teebie

      Re: Me too...

      Write the where clause first.

      Maybe run a select command with it first.

      Then add the update/delete to the query.

      1. MiguelC Silver badge

        Re: Me too...

        carefully craft the WHERE clause....then select only the UPDATE clause and execute it (I did it once, fortunately caught it and rolled back)

      2. I Am Spartacus

        Re: Me too...

        Well, we all know that is what you SHOULD do.

        But do us BOFHs really do this all the time?

        Let he who is without sin .... you know the rest!

        1. Trygve Henriksen

          Re: Me too...

          Run his COMMIT first...

    5. martinusher Silver badge

      Re: Me too...

      Then there's the classic:-

      $rm -rf *.lst

      I did this back in the late 1980s on a System V machine. From the top of the file system. I needed to get rid of a whole load of developer listfiles that were clagging up the disk. Seemed like a good idea except that a) I was root and b) I accidentally put a space between the asterisk and the '.lst'. I stopped the command within a second or so but too late, it had already deleted critical system files. So the system had to be kept running for about six weeks hoping that there would be no power glitch.

      Lesson learned. Both about typos and doing stuff as root.
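
Why one space matters can be shown with Python's `shlex` module, which splits a command line into words the way a POSIX shell does. It performs word splitting only, not glob expansion, so the comments describe what the shell would go on to do with each word.

```python
import shlex

intended = shlex.split("rm -rf *.lst")
typed = shlex.split("rm -rf * .lst")

# One pattern: the shell expands it to names ending in .lst only.
print(intended)  # ['rm', '-rf', '*.lst']

# Two words: a bare '*', which the shell expands to every non-hidden name
# in the current directory, plus a separate file called '.lst'.
print(typed)     # ['rm', '-rf', '*', '.lst']
```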

  4. HkraM
    Facepalm

    I'm waiting for the "Who, Me?" from the tech at a certain now-dead ISP in Finchley who tried to reinitialise the offline RAID array on the mail server, but entered the command on the live RAID array instead.

    1. Solviva
      Devil

      That'd be the ISP whose access numbers most often ended in 666

      1. TonyJ Silver badge

        I took a whispered call from a field engineer in the mid-90's.

        He was holed up in a server room, hiding.

        This was when Compaq's RAID controllers and hot-swappable disks were still fairly new.

        For some reason, he'd been tasked with replacing a disk in a RAID5 array but had had literally zero training in servers in any way, shape, or form.

        He did, however, have some desktop experience, so he booted to his DOS boot floppy and typed FORMAT C:

        Of course, he hit Y because, well... because this is what he'd done on a fair few desktop computers when he'd been tasked with replacing the HDD in it.

        Cue mayhem.

      2. Jusme

        I thought that was plusnet...

        https://www.theregister.com/2006/07/11/plusnet_email_fiasco/

        I'm sure ex demons have a few tales to tell though (floor SWL exceptions and ice lollies spring to mind...)

        Good times, long gone :(

    2. I Am Spartacus

      Finchley fin and games

      Would that be the one where a network admin power cycled a whole cabinet of modems when one was locked, with that infamous comment of "well, what do the users expect for a tenner a month"?

      1. Cederic Silver badge

        Re: Finchley fin and games

        We forgive you. Although it was a tenner a month and a penny a minute on the phone bill.

    3. mark4155
      Trollface

      Come out, come out wherever you are! Make a clean breast of it, uncle Reg will be forgiving and understanding.... unlike the readers. :-)

  5. Jay 2
    Facepalm

    I needed to update the SSH config on many boxes (in the days before ansible) so a quick bit of command line fun and it was all done (change config file, restart sshd for effect). Then I realised I couldn't SSH into any of the servers. Turns out I'd made a typo! I was then punished for this, by myself, with having to go onto the remote console of each server (which thankfully all worked, some iDRAC/iLO can be a bit flakey) and sort out the problem.

    Lessons of the day being don't be so blase about such changes, double check the config, and test on one box first!
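
"Test on one box first" generalises to a rollout that stops at the first host failing a check. A minimal sketch, assuming hypothetical `apply_to(host)` and `verify(host)` callables supplied by the caller. For an sshd change, `verify` might open a fresh SSH session, and running `sshd -t` before restarting the daemon is one way to catch a config typo earlier still.

```python
def staged_rollout(hosts, apply_to, verify):
    """Apply a change host by host, stopping at the first failed check.

    The first host acts as the canary: if it fails verification, no other
    host is touched. Returns the hosts that were changed and verified.
    """
    done = []
    for host in hosts:
        apply_to(host)
        if not verify(host):
            raise RuntimeError(
                f"{host} failed verification; verified so far: {done}")
        done.append(host)
    return done
```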

    Similar to the article I've also been messing about with iptables before and managed to lock myself out of a server via SSH. Again, to the console for fixing.

  6. Andy 68

    When forgetting 2 simple characters (-r) means a long walk

    Shutdown now <enter>

    Oh bugger

    1. Anonymous Coward Silver badge
      Thumb Up

      Re: When forgetting 2 simple characters (-r) means a long walk

      -r doesn't always mean recursive.

      `rm -r` is more dangerous than `rm`

      `shutdown -r` is safer than `shutdown`

      1. John Brown (no body) Silver badge

        Re: When forgetting 2 simple characters (-r) means a long walk

        "`shutdown -r` is safer than `shutdown`"

        Except when it's a remote box with no remote management so it's either the grovelling phone call to someone on-site or jump in the car and drive there.

    2. Anonymous South African Coward Silver badge

      Re: When forgetting 2 simple characters (-r) means a long walk

      haha, rite of passage.

      Happened to me as well.

    3. dhawkshaw
      Linux

      Re: When forgetting 2 simple characters (-r) means a long walk

      At least nowadays on OpenSUSE we have 'reboot' as well as 'shutdown -r'

      I tend to only use shutdown when I know I really mean it as 'shutdown -h now'

      I'm sure 'reboot' is a common alternative on other distros too.

      But yeah -- definitely as a whippersnapper I tripped myself up with it.

      1. phuzz Silver badge

        Re: When forgetting 2 simple characters (-r) means a long walk

        It's available on Ubuntu too.

    4. dak

      Re: When forgetting 2 simple characters (-r) means a long walk

      Three characters, Shirley?

  7. Marco van de Voort

    A student club's IT was a mix of eclectic architectures, one older than the other (and they prided themselves on its heterogeneous old nature).

    Anyway, as a favor I did a netbsd crossbuild from PPC(*)->to->m68k, and then ran crossinstall to prepare a dir to prepare the dist for moving to the 68k.

    Unfortunately I forgot the option to set an alternative root dir, so it installed over my PPC distro. Needless to say, the commandline experience after that was "interesting" as the PPC tried to execute m68k binaries.

  8. WanderingHaggis
    Facepalm

    Been there, done that, where did I put the tee shirt?

    Updated a FortiGate firewall in the data centre and watched it disappear. Mad rush to centre (2 hours away by car) to reboot it. I then implemented a script that reboots the firewall 5 minutes after a configuration is changed unless you save the running config. (I strongly recommend this to any budding firewall manager.) If you can save it, you're good; if you can't, you need the reboot to fall back. But remember don't get distracted after you made the changes as going for coffee or chatting to a colleague has resulted in many a reboot.

    1. Anonymous South African Coward Silver badge

      Re: Been there, done that, where did I put the tee shirt?

      Reminds me of the New Microsoft Frog 2000.

      It goes Reboot! Reboot! Reboot!

    2. Anonymous Coward

      Re: Been there, done that, where did I put the tee shirt?

      "But remember don't get distracted after you made the changes as going for coffee or chatting to a colleague has resulted in many a reboot."

      For network kit, you only deserve a reboot or worst case, a trip to the DC for serial access.

      But breaking the rule of "no distraction" can be a nightmare for DB and generally any type of data changes.

      In a big company, my boss, the DBA, after a day with no backup, decided to reorg the prod Oracle DB during the evening (the reason there was no daily backup yet).

      He was also moving datafiles to new file systems.

      He had 2 xterms open, one for the test server and one for the LIVE one, and was running commands on test, verifying them and then moving them to LIVE.

      An excited colleague from the DEV team entered and demanded some new test FS, ran an excited conversation etc ....

      Then, after the storm, he resumed and ... ran the wrong cmd on LIVE.

      Then spent the whole night going to any Oracle forum to see if there was any solution except data loss. Nothing apart from a restore from one day ago.

      But since he was good, he managed to recover all data files !

    3. Boothy Silver badge

      Re: Been there, done that, where did I put the tee shirt?

      Quote: "But remember don't get distracted after you made the changes as going for coffee or chatting to a colleague has resulted in many a reboot."

      Many years ago, one of my colleagues acquired (stole while drunk, more likely) one of those amber flashing beacons on a tall plastic pole, the type you'd get on road works.

      If someone was in 'don't disturb me mode', such as prod changes, the beacon would be placed next to their desk and turned on.

      Everyone knew to keep clear, and the only exception allowed was to bring more coffee!

      1. DonM.

        Re: Been there, done that, where did I put the tee shirt?

        Used to work with a developer who would put on a hardhat to signify 'do not disturb' mode.

    4. Anonymous Coward

      Re: Been there, done that, where did I put the tee shirt?

      I'm guilty of this. I'll make a change on a Juniper device, issue a commit confirmed, log into some remote server to verify that the change I just committed actually worked, only to forget to return to the Juniper device to prevent the rollback from occurring...only to be asked why the change that was marked completed last night doesn't appear to have resolved the connectivity issue...

  9. Terry 6 Silver badge

    The old days

    Back in those days when most organisations, let alone schools, relied on the techie amateur to get and keep kit working I was relatively well trained. Both through my own school days, proper courses and a bit of apprenticeship with someone a bit more advanced than I ( though equally an amateur - just he'd had his training from the educational computer company and assorted real experts).

    But high on the list of my training were the items;

    1) Don't change anything until you have a copy of the original in place, safely ( floppy disc in those days)

    2) Have a written copy of any changes you need to make. (handwritten ideally, that does make a difference in error avoidance somehow).

    3) Compare what's on the screen with the written copy

    4) Even then. Don't press {enter} until you've read it through- who knows what might have slipped through that you hadn't noticed, or wasn't in your notes

    5) Pray

    And that ladles and jellyspoons was for very simple (in those days) school computers. There was little risk of losing more than a few files, because we didn't have anything very complicated to screw up. Though I guess that made 2) easier.

    1. Anonymous South African Coward Silver badge

      Re: The old days

      And get a second pair of eyeballs. Never underestimate the power of a second pair of eyeballs.

      Friend of mine had issues with his firewall, just would not work. So I shuftie'd over to his place, and spotted the problem - a typo in his firewall script. Once the typo was fixed, firewall worked as advertised.

      1. Anonymous Custard Silver badge
        Headmaster

        Re: The old days

        Personally I always preferred type-written for item 2 than hand written.

        But that says more about my handwriting legibility than anything else, and could easily introduce another source of unwanted variance and typos into the mix.

    2. Mark 85 Silver badge

      Re: The old days

      I think that one of the rites of passage for any IT type is to fat finger the keyboard once. Usually, the boss will accept that but not the second time it's done.

      1. Hazmoid

        Re: The old days

        Having been involved with the IT side of stockmarkets when a company was completely bankrupted because of an operator fat-finger, it was rare that there was not at least a verbal confirmation that what the operator was about to enter into SEATS was correct.

  10. ColinPa

    e er what's the difference

    I had a colleague who set up synonym line commands

    e for edit

    er for erase

    br for browse

    Step 1 browse the file

    Step 2 recall the command and overtype with e

    Step 3 wonder where the file has gone.

    When I pointed out he had typed er filename he said "thank you.. I often wondered why my files disappeared"

    1. J.G.Harston Silver badge

      Re: e er what's the difference

      If you're going for command aliases, surely you'd make them unambiguous.

      e for edit

      d for delete

      b for browse

      1. PerlyKing Silver badge

        Re: e er what's the difference

        Personally I don't create short aliases for destructive commands. Too risky around my fat fingers!

        1. stiine Silver badge

          Re: e er what's the difference

          All of the really destructive commands are 2 letters, just like all of the benign commands... or they used to be...

          1. PerlyKing Silver badge

            Re: e er what's the difference

            I'm thinking about third-party tools like supervisord. I'll alias "supervisorctl status" but not "supervisorctl stop".

        2. Auntie Dix

          Re: e er what's the difference

          This is why I have aliases for cp, mv, and rm that add -i (ask first) to the command line.

  11. DJV Silver badge

    "Kildare"

    In a hospital?

    I see what you did there...

    1. chivo243 Silver badge
      Thumb Up

      Re: "Kildare"

      +1

      I wanted to have verification of this. The name sounded familiar, I was far too busy today for the hospital coin to drop.

  12. Anonymous South African Coward Silver badge
    Mushroom

    Colleague of mine (if you recognize your story, say hi!) went out to site to perform an upgrade to an Informix database.

    Said database was a "warehouse" DB, meaning it collected information from a couple of other Informix DB's for transactional purposes, and passed on transactional history from the one site on to the other.

    Colleague mistakenly entered oninit -iy and Informix promptly re-initialized the master DB. Whoops. He was not happy.

    Took support a couple of days to restore the DB and get things synchronized again.

  13. TeeCee Gold badge

    Fat fingers.

    I found that typing a ">" when I meant to put in a "|" can really bugger up a day...

    1. ActionBeard

      Re: Fat fingers.

      Given how many times I've seen files called "more" or "pg" over the years, you're definitely not the only one.

      I got caught out many years ago by typing "last | reboot" instead of "last reboot". I wondered for a moment why it was taking so long to get back to me....

  14. longtimeReader

    Just last week ...

    I regularly run "rpm -qa | grep [product prefix] | xargs rpm -e" to uninstall development level filesets of my packages, ready to do a clean install of the next build.

    I'm sure there are better ways, but that's the one my fingers have learned. Except last week, they forgot to type the middle clause and various messages made me realise that almost every package installed on the image was being deleted. Killing the job didn't really help as by that point too much had gone.
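
The failure here is a filter that silently became "match everything" once the grep was left out. A small sketch of the guard, using an invented helper that takes a list of installed package names rather than calling rpm itself:

```python
def packages_to_remove(installed, prefix):
    """Return the installed package names that start with prefix.

    An empty or blank prefix is refused, since it would select every
    package: the equivalent of dropping the grep from the pipeline.
    """
    if not prefix.strip():
        raise ValueError("empty prefix would select every package")
    return [name for name in installed if name.startswith(prefix)]
```

Printing the resulting list and reading it before handing it to anything destructive gives the same second look that running a SELECT before an UPDATE does.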

    1. Anonymous Coward

      Re: Just last week ...

      There is a script to bring down the system and requires the hostname as a parameter to match the machine's hostname as confirmation, so clever cloggs commits the following keypresses to muscle memory:

      bring_down_system.sh `hostname`

      My shortcut worked very well until I accidentally typed it into a CAT environment terminal window instead of one of the tens of development environment terminal windows.

      I unlearned that pretty quick.

  15. Zarno Silver badge
    Mushroom

    Disk Destroyer

    One Halloween, I was doing a HDD->SSD conversion on my personal laptop.

    While manually copying the cranky windows7 partition over inside the clonezilla shell (can't remember exactly why it was cranky...): dd if=/dev/sdd1 of=/dev/sda

    Ctrl+c in an ohnosecond, but damage was done.

    Yep. Double trouble. Blew out the MBR of the HDD with the allocated but blank partition on the SSD.

    Only real loss was having to reinstall windows and some programs (data recovery and backups, woo...), and never getting the blasted fingerprint reader to work in windows because the OEM that made the reader thought it was LOVELY to have the actual drivers the installer needed be on a server that no longer existed after they went under.

  16. chivo243 Silver badge
    Headmaster

    write mem?

    Whaddya mean the config we wrote yesterday is gone?

  17. cdegroot

    Nothing new...

    cd /tmp

    rm -rf *

    I always typed that on the machine after completing an installation or upgrade on-site (I worked for a company making mid-office software for a bank and we supplied the stuff including a server and a tape drive for backups - all they needed to do on-site was swap tapes according to the schedule we gave them). Of course, back in the day, your Unix shell (this was Xenix) would just respond with

    #

    Nothing like the modern shells for wussies that will tell you where in the file system you are - I still consider that cheating. Anyway, I lied about what I typed. What I actually typed was

    cd / tmp

    rm -rf *

    Helpfully, the change dir command ignored the extra argument causing me to wipe the whole box. By the time I realized what I had done, it was too late - the box was firmly hosed and my day suddenly became a lot longer reinstalling Xenix, our database package, our systems software, and replaying the tape that, luckily, I used to make a backup of the customer's data just before I started the upgrade. Yes, that "please make a backup of your data before commencing this upgrade prompt" you always ignore? I learned pretty early in my career the value of not doing that :)
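
The underlying problem is a destructive step that trusts the current directory without checking it. A hedged Python sketch of the check that was missing: `clear_cwd` is an invented helper, and the caller must state where they believe they are.

```python
import shutil
from pathlib import Path


def clear_cwd(expected):
    """Delete everything in the current directory, but only if the current
    directory really is `expected`."""
    cwd = Path.cwd().resolve()
    if cwd != Path(expected).resolve():
        raise RuntimeError(f"in {cwd}, not {expected}; nothing deleted")
    for entry in cwd.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real subdirectory: remove recursively
        else:
            entry.unlink()         # file or symlink: remove just the entry
```

Had the directory change silently landed in / instead of /tmp, a call such as `clear_cwd("/tmp")` would raise instead of deleting anything.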

    1. Dave Pickles

      Re: Nothing new...

      'Yes, that "please make a backup of your data before commencing this upgrade prompt" you always ignore? I learned pretty early in my career the value of not doing that :)'

      When running an upgrade on VMS one of the prompts read:

      "Are you satisfied with the backup of your system disk?"

      Somehow those words always had the desired effect.

  18. Ochib

    The Onosecond

    https://youtu.be/X6NJkWbM1xk

  19. Ken Hagan Gold badge

    All the lines between...?

    Had the software naively accepted the input, it might have deleted nothing (since the start is already past the end) or all the lines *except* (signed/unsigned confusion and integer wrap around) but *between* implies that whoever wrote this code took the trouble to spot the error ... and then quietly executed it anyway.

    There's a special place in hell for those folks, managed by a demon with the BOFH cattle-prod on its "supernatural" setting.
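
For contrast, a line-range delete that refuses a reversed range rather than quietly reinterpreting it might look like the sketch below. It illustrates the validation being argued for and is not a reconstruction of the software in the article; the function name and the 1-based inclusive convention are assumptions.

```python
def delete_lines(lines, start, end):
    """Return a copy of lines with the 1-based inclusive range removed.

    A reversed or out-of-bounds range raises instead of being guessed at.
    """
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(
            f"bad range {start}..{end} for {len(lines)} lines")
    return lines[:start - 1] + lines[end:]
```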

    1. A.P. Veening Silver badge

      Re: All the lines between...?

      Had the software naively accepted the input, it might have deleted nothing (since the start is already past the end) or all the lines *except* (signed/unsigned confusion and integer wrap around) but *between* implies that whoever wrote this code took the trouble to spot the error ... and then quietly executed it anyway.

      It's not a bug, it's a feature.

      There's a special place in hell for those folks, managed by a demon with the BOFH cattle-prod on its "supernatural" setting.

      AMEN!


Biting the hand that feeds IT © 1998–2021