Keeping your head as an entire database goes pear-shaped

A reminder of the devastation a simple DROP can do and that backups truly are a DBA's best friend in this morning's "there but for the grace of..." Who, Me? "Stephen" is the author of today's confession and was faced with what should have been a simple case of applying an update to an Estimating and Invoicing system. The …

  1. NJS

    I'm not a DBA...

    but even I got to the line that started DROP and I was squirming.

    1. anothercynic Silver badge

      Re: I'm not a DBA...

      Same here... "Drop the database" elicited an "oh no... surely you meant *stop*" from me.

      Thankfully, we keep the raw logs and transactional stuff as a backup for longer than the actual database, so we can roll through a year's worth of stuff to recover, but I don't ever try the 'DROP' keyword, ever.

    2. heyrick Silver badge

      Re: I'm not a DBA...

      Yeah, I know diddly squat about SQL, but I got to the DROP and was like "isn't that the part of little Bobbie Tables' name that caused all the chaos?".

      Yes, it was. Uh oh.

    3. Plest Silver badge
      Gimp

      Re: I'm not a DBA...

      Oh yes!

      As a 25-year Oracle DBA veteran, seeing the words "drop", "truncate" and even "alter" is enough to automatically take my hands 6" away from the keyboard. I've made my share of "blood draining from face" type screw-ups over the years that deserve the icon shown, and I seem to have trained myself to instinctively take my hands off the keyboard at certain words or situations.

      I did assume "drop the database" meant down the service, not literally DBA speak "drop database...", but there you go, the English language is a funny beast!

      1. el_oscuro
        WTF?

        Re: I'm not a DBA...

        Oracle has you covered:

        RMAN> drop database including backups;

        I have no idea why Oracle would even consider that command to be something that could be issued.

        1. Anonymous Coward

          Re: I'm not a DBA...

          > I have no idea why Oracle would even consider that command to be something that could be issued.

          so that you get to pay for more Oracle support?

    4. Michael H.F. Wilkinson Silver badge

      Re: I'm not a DBA...

      No database expert whatsoever, but even I have gathered DROP can drop the unwary in it, in a very, very serious way.

  2. RockBurner

    I'd say this is an object lesson for companies to ensure that they hire staff specifically for the purpose of controlling, maintaining and managing their hardware, software and data.

    An Administrator of Systems if you will.

    (Yes, I've been caught in the "you're technical, you can handle this" catch22 before now. Never again, thanks).

    1. Arthur the cat Silver badge
      Devil

      I've said it before and will probably be saying it on my death bed, but many (most?) companies only notice system admins when they get something wrong. If you're doing what you should do all you get is "just what do you do all day?" (or even worse "why do we pay your salary?").

      What is needed is some way to ensure a critical system "goes catastrophically wrong" every now and then, only to be fixed (after a suitable delay) by the heroic sysadmin who saves the day, without the users ever catching on to what's really happening.

      1. EvilGardenGnome
        Devil

        Top manager: "Who was that?"

        Mid manager: "Don't know, but they're our best Sys Admin."

        TM: "How do you know that?"

        MM: "Because I haven't the foggiest who they are."

      2. Robert Moore

        I am sure you could script something up. Just bring the critical system down just long enough for people to start coming to you, then shortly after you say "I'm on it." it comes right back up.

        1. Admiral Grace Hopper

          green.bat

          Which was the script that turned the dashboard to All Green when management were on the floor. Reality was restored as soon as they left.

      3. FozzyBear

        "Why do we pay your Salary?"

        Been there done that. My standard reply nowadays is

        The fact that you need to ask that question means I'm doing a bloody good job here.

      4. Montreal Sean

        Powering off a switch was always a good one.

        1. Anonymous Coward

          Powering off a switch

          <pedant>

          Only if it's a motorised one :)

          </pedant>

          1. jake Silver badge

            I have switches that can be powered on and off, depending on whether or not I want them lit up.

      5. Dave314159ggggdffsdds Silver badge

        "What is needed is some way to ensure a critical system "goes catastrophically wrong" every now and then, only to be fixed (after a suitable delay) by the heroic sysadmin who saves the day, without the users ever catching on to what's really happening."

        Yes. A market niche Microsoft have been taking care of for us for a few decades now...

      6. DadeMurphy

        The most ironic aspect is the existence of that one admin we all know. That one who cleverly architects catastrophes such that they never appear to be at fault but somehow are always swooping in at the critical hour to orchestrate a heroic fix.

        Meanwhile, behind the scenes, the best people design and administrate systems such that the fire alarm fire never happens in the first place. And they never get credit because nobody knows that they're doing God's work day in and day out to keep everything nice and boring.

  3. Binraider Silver badge

    DROP...

    Deltree *.*

    Kill "C:\" (a VBA classic)

    rm -rf

    :(){:|:&};: (please don't use this on something that you don't regard as expendable / rebootable).

    Amongst other commands that exist for legitimate reasons but are very, very easy to misuse. Deltree was particularly vicious, seeing as it doesn't operate from the currently selected directory, but rather from the drive selection.

    1. Killfalcon Silver badge

      What's the deal with ":(){:|:&};:" ? Is something missing, or is this a really strange parser thing in a command line?

      1. Dave559 Silver badge

        It looks strange (it certainly confused the hell out of me the first time I saw it mentioned), but it is basically an obfuscated fork bomb (the explanation on Wikipedia of what it actually does makes it much clearer).

        1. BenDwire Silver badge
          Pint

          Thanks for the link - I've learned something new today!

        2. Killfalcon Silver badge
          Pint

          Cheers, consider me Enlightened.

      2. Anonymous Coward

        something about a wide-mouthed frog mounting a toad mounting a frog eating a worm mounting a horned toad mounting...

        1. Claptrap314 Silver badge

          a wide-mouthed frog mounting a toad mounting a frog eating a worm mounting a horned toad mounting...

      3. JulieM Silver badge

        It looks like Bash code that defines a function to start lots of processes, each doing nothing except start two more processes like it and feed the output of one into the input of the other, until the machine runs out of something.

      4. cosmodrome
        Mushroom

        recursive fork bomb

        Actually it might be better readable like this:

        :()

        {

        :|: &

        };

        :

        The mean joke is declaring a function named ":". The rest is rather self explaining. Call ":" from within ":" pipe it through ":", fork all created instances of ":". I'm not completely sure if the function will ever return but whenever it does, each thread will call all the bloody mess again.

        Not much use tracing it because the machine will lock up completely as soon as you hit RETURN.

        1. MacroRodent
          Headmaster

          Re: recursive fork bomb

          It creates processes, not threads (which is even worse, because starting a process is a heavier operation than creating a thread).
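To put a number on "until the machine runs out of something", here is a rough sketch of the doubling described above. It is a simplified model, not a trace of what Bash really does: it assumes every process stays alive and each one starts exactly two more per round. The function names and the process limit are made up for illustration.

```python
def processes_after(generations: int) -> int:
    """Upper bound on live processes after `generations` rounds of doubling.

    Generation n holds 2**n processes; summing the geometric series
    counts every process started so far, assuming none has exited.
    """
    return 2 ** (generations + 1) - 1


def generations_until(limit: int) -> int:
    """Smallest number of doubling rounds whose total exceeds `limit`."""
    generations = 0
    while processes_after(generations) <= limit:
        generations += 1
    return generations


# Hypothetical per-user process limit, chosen only for illustration.
ASSUMED_LIMIT = 32_000
print(generations_until(ASSUMED_LIMIT))
```

Under those assumptions, a limit in the tens of thousands is passed after only about a dozen rounds, which is why the machine appears to lock up almost at once.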

    2. Anonymous South African Coward Bronze badge
      Trollface

      I'm glad El Reg is sanitizing their inputs, otherwise the :(){:|:&};: fork bomb would have hosed the entire site...

      They do sanitize user inputs, do they?

      1. Doctor Syntax Silver badge

        Only a problem if the posts are processed by something which treats them as shell commands.

        1. Prst. V.Jeltz Silver badge

          I've always wondered why that's the case. I can sort of see it with SQL injection,

          but all those cases of "We overloaded 'the buffer' and so this code ran on the remote machine .... "

          it did? wow! how lucky for you !

          amazing to think that cause would lead to that effect .

          1. Killfalcon Silver badge

            Well, the core of it is that in most computers, the buffer is basically a "to do" list for the currently running software. In these attacks, the buffer contents might be something like:

            * An instruction saying, maybe, "print the next thing"

            * then a tag saying the next 16 bytes are a single string,

            * then those 16 bytes,

            * then there will be the next thing to do after you've printed that string.

            The trick is that if you somehow put _17_ bytes in that 16-byte string, it overwrites the next byte as well.

            The computer will then try to do the next instruction... which the attack just replaced, and there's your entry point.

            Not all computers, and not all programs, will actually use the buffer in sequential (or otherwise predictable) order, but it sure is a lot faster if you do, creating the potential risk.
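The 17-bytes-into-16 trick above can be acted out safely in Python with a bytearray standing in for memory. This is only a toy model: real overflows happen in languages without bounds checks, and the layout, constant names and "next step" codes here are invented for illustration.

```python
def unchecked_copy(memory: bytearray, offset: int, data: bytes) -> None:
    """Copy data into memory at offset with no check on the field size (the bug)."""
    for i, byte in enumerate(data):
        memory[offset + i] = byte


def checked_copy(memory: bytearray, offset: int, size: int, data: bytes) -> None:
    """Same copy, but refuse input that does not fit the field."""
    if len(data) > size:
        raise ValueError(f"input is {len(data)} bytes, field holds {size}")
    memory[offset:offset + len(data)] = data


FIELD_SIZE = 16
NEXT_STEP_OK = 0x01      # hypothetical code for "carry on as normal"
NEXT_STEP_EVIL = 0xFF    # hypothetical code for "do what the attacker wants"

# Layout: 16 bytes of string, then one byte saying what to do next.
memory = bytearray(FIELD_SIZE) + bytes([NEXT_STEP_OK])

# The attacker sends 17 bytes; the 17th lands on the "what to do next" byte.
unchecked_copy(memory, 0, b"A" * FIELD_SIZE + bytes([NEXT_STEP_EVIL]))
print(hex(memory[FIELD_SIZE]))
```

With `checked_copy` the same 17-byte input is rejected before anything is written, which is the whole fix in miniature.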

            1. phuzz Silver badge
              Thumb Up

              A good explanation of this is "Smashing The Stack For Fun And Profit" from Phrack 49.

              The technical details are slightly outdated (it's from 1996), but the basic concepts are as relevant as ever.

              1. jake Silver badge

                "smashing the stack" is geek-speak for just one of many types of buffer overflow.

                The tldr version: If you put 10 pounds of sugar into a 5 pound bag, you'll be able to stand on the resulting pile and reach the cookies on top of the fridge.

            2. DadeMurphy

              Then there's the optimal way to do the same thing with a fraction of the effort. The hacker way. Goes a little like this:

              Download a gig of RAM tarbells from the cloud and zip them up. Run netstat on the RAM until the symlinks are immutable. Rinse and repeat. Notice every jpeg on the blockchain has been embedded as HTML bit flips on the virtual machine microservice partition.

              Check your MySpace, and BAM, you have 1 million more friends than you did 5 minutes ago.

              You're welcome.

      2. Ian Johnston Silver badge

        They do sanitize user inputs, do they?

        Of course. Little Bobby Tables is all grown up and works as a consultant for them now ...

      3. JulieM Silver badge

        If you can read the rest of this message after `/bin/poweroff` then yes, they sanitise their input.

        1. jake Silver badge

          $ file /bin/poweroff

          /bin/poweroff: cannot open `/bin/poweroff' (No such file or directory)

          $

    3. Joe W Silver badge

      try

      rm -rf .*

      (better yet: don't try it). The stupid younger me tried to remove some hidden files....

      1. vogon00

        Mind your manners with the physical IDs..

        I deal with all sorts of disks that need re-purposing for test builds, image deployment etc. As a result, I use Microsoft's diskpart.exe to 'Un-initialise' disks quite often...using its 'clean' operation.

        I automate a lot of operations, but sometimes manually is the only way to do something. My code and comments are full of dire warnings about using 'clean', and the pre-execution checks are very robust, erring on the paranoid side of things:-)

        There is an entry in our local 'wiki-ish thing' about re-initialising physical media and it's full of warnings like '[BE EXTRA SPECIAL BLOODY CAREFUL!]' and '[SERIOUSLY, BE REALLY CAREFUL - THERE IS NO WAY BACK FROM THIS]'.

        So far we've been lucky and the warnings have worked...which is a shame from my POV as I'm in favour of some of the 'as fast as possible, ask no questions' types learning a really hard lesson sometimes - my rules are 'You fuck it up, you fix it' (Right up to the point where people fess up to not knowing how, at which point we do actually help - and educate - them. Two lessons for the price of one!).

        Doing a 'detail disk' and reading the output is now more or less muscle memory for me:-) I'm just as cautious with re-initialising media on Linux...blkid is your friend..

        1. An_Old_Dog Silver badge

          $ sudo dd if=./cz.iso of=/dev/sdc bs=16384 ... "ess-dee-SEE?!!" Ctrl-C Ctrl-C Ctrl-C ...

          After overwriting my main system's primary drive (instead of the intended USB stick target) with a Clonezilla ISO image, I put together a tower system which had pop-open hard drive trays, a tape unit, two CD/DVD units, and no internal hard drive.

          This computer's sole purpose in life is to act as a host for data transfers, data recoveries, and hard drive testing. Lacking an internal hard drive, you boot it via CD/DVD or USB stick. Having a brain-fart while using dd, clonezilla, or badblocks on this PC is far less potentially disastrous than doing the same on one of my other computers.

      2. BobTheIntern

        You can live on the edge with a slightly more complex version affectionately known as "bash roulette":

        alias roulette='[ $[ $RANDOM % 6 ] == 0 ] && rm -f $(shuf -n1 -e *) && echo "BOOM" || echo *Click*'

    4. C R Mudgeon Bronze badge

      There's a wonderful feeling of freedom when you've done a test O/S install onto a new box, and so can type such commands at will.

      Also:

      echo "Scratch; can be repurposed" >/dev/sdc

      (after triple-checking, of course, that sdc is in fact the USB stick I have in mind.)
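The triple-checking can be partly automated. Below is a minimal sketch, assuming a Linux box where sysfs exposes a `removable` flag under `/sys/block/<device>/`. The function names are made up, and the flag is only a hint: some USB hard drives report 0, so treat this as one check among several rather than a guarantee.

```python
from pathlib import Path


def is_removable(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True only if sysfs reports the block device as removable."""
    flag = Path(sysfs_root) / Path(device).name / "removable"
    try:
        return flag.read_text().strip() == "1"
    except OSError:
        return False   # unknown device: treat as not safe to overwrite


def confirm_target(device: str, sysfs_root: str = "/sys/block") -> str:
    """Hand the device name back only if it looks like removable media."""
    if not is_removable(device, sysfs_root):
        raise RuntimeError(f"{device} is not flagged removable; refusing")
    return device
```

A script would call `confirm_target("/dev/sdc")` before any write and let the exception stop the run if the name turns out to belong to a fixed disk.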

    5. Scott Wheeler

      RECOVER

      Sounds safe, doesn't it? Just the sort of thing that should be the matching command for BACKUP under MS-DOS. What it actually does is delete any directory information, then reconstruct what files it can as FILE0000 to FILE0127 in the root directory. You had more than 128 files? Well now you haven't.

  4. Anonymous Coward

    Backups

    I'm not a DB person, but as an on-site engineer, I once turned up to a customer site where a DB had gone titsup. My first question was "Do you have a backup?" which was answered by someone proudly holding up a box of tapes. The next question was "How do we restore from those?" This was answered with embarrassed silence and looks of complete confusion.

    Luckily the DB man was on his way and turned up to fix everything without needing the tapes.

    1. Anonymous Custard
      Headmaster

      Re: Backups

      As the old mantra goes - a backup isn't a backup until it's tested and proven to restore...
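As a small illustration of "tested and proven to restore", the sketch below uses Python's built-in sqlite3 module: take a backup, open the backup file as if restoring from it, and compare every table against the live database. The table name, helper names and file name are invented, and a real system would use its own database's backup and restore tooling instead.

```python
import os
import sqlite3
import tempfile


def table_fingerprint(conn: sqlite3.Connection) -> dict:
    """Map each table name to its sorted rows, for comparison."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {name: sorted(conn.execute(f'SELECT * FROM "{name}"').fetchall())
            for name in names}


def backup_restores_cleanly(live: sqlite3.Connection, backup_path: str) -> bool:
    """Take a backup, reopen it from disk, and compare it with the source."""
    target = sqlite3.connect(backup_path)
    try:
        live.backup(target)          # sqlite3's online backup API
    finally:
        target.close()
    restored = sqlite3.connect(backup_path)
    try:
        return table_fingerprint(restored) == table_fingerprint(live)
    finally:
        restored.close()


# Demo with made-up data.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
live.executemany("INSERT INTO invoices (amount) VALUES (?)", [(10.0,), (25.5,)])
live.commit()

with tempfile.TemporaryDirectory() as tmp:
    ok = backup_restores_cleanly(live, os.path.join(tmp, "backup.db"))
print(ok)
```

The point is the second half of the function: the backup only counts once something has read it back and checked the contents.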

      1. MarkB
        Alert

        Re: Backups

        A friend of mine worked as an IT consultant and told of a company he occasionally visited, where the head of IT would pick a random day to turn off a random machine and tell his team to recover, just to prove that their processes were correct. He must have had a lot of confidence (and balls of steel).

        1. Tom Chiverton 1

          Re: Backups

          Doesn't Netflix have a Chaos Monkey script that does just this?

          1. Arthur the cat Silver badge

            Re: Backups

            Chaos Monkey plus bigger and better (worse?) Chaos Simians. IIRC Chaos Kong will take out entire regions.

        2. Stumpy

          Re: Backups

          Back in my early days as a VMS operator, the IT director once came down into Ops central, marched onto the machine floor and boldly flipped the Big Red Switch that switched the power off to the entire data floor.

          Cue clenched sphincters as we waited (and waited ... and waited) for the backup generators to kick in before the UPS died. Then they marched out and simply said, "We've had a power failure. Call DEC and put the disaster recovery plan into action."

          This was, apparently, their way of conducting a full resilience test - no, DEC had not been pre-informed of the test either - as far as they were aware, it was a genuine disaster - and the recovery plan involved them trucking in duplicate hardware for all our key machines on what was effectively a mobile data centre. Must have cost _someone_ a hell of a lot of cash to put that thing into mobilisation.

          1. Doctor Syntax Silver badge

            Re: Backups

            "a hell of a lot of cash"

            That would be a regular payment to DEC for DR cover, presumably with provision for occasional tests.

            1. jake Silver badge

              Re: Backups

              Yes. And as a one-time member of DEC's so-called "flying squad", I'm here to tell you that we were often advised that, although it would be quite functional, and would lead to overall efficiency of IT operations at such sites, we were NOT allowed to throttle the IT Director. Yelling at the twat was allowed, however, which I'm sure helped minimize our blood pressure.

          2. emfiliane

            Re: Backups

            Can you imagine if companies provided the budget for something like this, instead of cutting everything to the bone in the service of Just-In-Time Lean Diets with zero slack space for anything going wrong? For one, the ransomware epidemic would simply not exist, everyone would just shrug and restore from backup like they did every few months.

            1. WanderingHaggis
              Alert

              Re: Backups

              Almost right. I once had to persuade management (and the web developer) after a bad SQL injection attack on our main website that it wasn't enough to simply restore the site but that it had to be offline while the failings in the code were fixed, otherwise we would be online for 15 minutes before crashing again (leaking all sorts of confidential information and yours truly cleaning up the mess again). You can restore the data but you must fix the hole or the ship sinks again. Fortunately I succeeded but only just.
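For anyone wondering what "fixing the hole" usually looks like for SQL injection, here is a minimal sketch using Python's built-in sqlite3 module. The table and function names are made up, and this is the general pattern rather than the code from the site in the story.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students VALUES ('Alice')")


def find_unsafe(name: str) -> list:
    """The hole: user input is pasted into the SQL text itself."""
    return conn.execute(
        f"SELECT name FROM students WHERE name = '{name}'").fetchall()


def find_safe(name: str) -> list:
    """The fix: the input travels as a bound parameter, never as SQL."""
    return conn.execute(
        "SELECT name FROM students WHERE name = ?", (name,)).fetchall()


hostile = "x' OR '1'='1"
print(find_unsafe(hostile))   # the OR clause matches every row
print(find_safe(hostile))     # no student has that literal name
```

Restoring from backup puts the rows back, but only the change from `find_unsafe` to `find_safe` stops the next visitor doing it again.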

            2. l8gravely

              Re: Backups

              Sometimes it helps to ask them why they're paying insurance premiums then, since we're obviously never going to have a fire/flood/catastrophe so it's just wasted money....

              1. Anonymous Coward

                Re: Backups

                And sometimes you find out that they wondered the same thing, and cancelled the policy - right before Hurricane Sharknado blew through the building, flooding the equipment while taking huge bites out of your sammich.

          3. Alumoi Silver badge

            Re: Backups

            Now that's balls to the wall testing!

          4. jake Silver badge

            Re: Backups

            I occasionally build data centers for a living. As a result, I get to test that Big Red Button, pretty much whenever I like (it's in the contract) ... It's not nearly as much fun as you'd think.

        3. Killfalcon Silver badge

          Re: Backups

          I don't know if they still do, but at one point Amazon had a process that would intentionally crash random servers to make sure all the fail-over stuff was done right.

        4. Jenny with the Axe

          Re: Backups

          I used to work at a bank, where there were regular disaster tests. Disaster was simulated by shutting one of the data centres down and checking if all services were still available as they should. Next time they'd take another data centre. Yes, in the daytime (though on a weekend when the securities exchange was closed).

          That's just one of the reasons I felt proud to work there.

        5. C R Mudgeon Bronze badge

          Re: Backups

          On a somewhat related note, I read a paper about a decade ago that advocated _against_ orderly shutdowns. The author's position was that systems should be architected in such a way that an uncontrolled shutdown not do any damage, and that to test that robustness guarantee, one should always shut it down hard -- and to ensure _that_, one shouldn't even create a clean-shutdown facility in the first place.

          E.g. for a desktop O/S, that would mean no "shut down" menu option, command, or whatever; a hard power-off would be the correct, documented way to bring the system down.

          (I've tried searching for that paper again, but without success. Can anyone point me at it, by any chance? I'm pretty sure it was an academic paper, not a blog post or the like.)

          1. An_Old_Dog Silver badge
            Windows

            All Dirty Shutdowns All the Time

            This is a bad idea, because it requires the programmers of every app you run to be perfect.

            A successful clean shutdown requires more than file system consistency. It also requires application state consistency. If halfway through your database update run, the OS force-closes the app's files, then terminates the app, well yay, your filesystem is clean, but your app state is dirty. Yes, I've heard of checkpoints, and well-written apps make use of them, but don't expect the programmers of QuakeCrysisDukeofDuty to put in a superfast emergency gamesave feature. More-mundanely, you also will lose some changes and text in that program you were editing. Going back to the last good save, and trying to remember all the edits you did since then will be a painful and error-prone "recovery" method. Perhaps your editor keeps some sort of edit-transaction-log you can have it replay. Does your editor do that?
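One common way an app can protect its own saved state against a sudden power-off is the write-then-rename pattern, sketched below. The function name is made up, and this only covers a single file; it does nothing for unsaved edits still in memory, which is the harder problem described above.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the bytes to disk before the rename
        os.replace(tmp_path, path)   # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the plug is pulled mid-write, the half-written temporary file is the casualty and the last good save is still there under its original name.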

        6. jake Silver badge

          Re: Backups

          Back when I worked for Bigger Blue (late 1970s), on Fabian Way in Palo Alto, we'd kill the mains power at 3PM on the last Friday of every month to ensure that the battery would carry the load long enough for the genset to warm up enough to take over. In the event of failure, everyone went home early with two hours pay ... This last never happened while I worked there.

          1. Antron Argaiv Silver badge
            Happy

            Re: Backups

            It's important to remember to check the fuel tank on the genny periodically, to make sure all that testing hasn't left it nearly empty...

        7. Peter Gathercole Silver badge

          Re: Backups

          I briefly worked as a relief system admin while they recruited someone permanent in a company where the manager did that. It wasn't just turning a system off, he'd also sometimes just unplug random cables.

          Fortunately, his idea was to test how the environment coped, and also to see how long it took us to identify what he'd done, in order to get it working again, rather than a full system restore test.

      2. Tim99 Silver badge
        Coat

        Re: Backups

        I prefer "There are no back-ups, only restores"...

    2. Anonymous Coward

      Re: Backups

      I seem to vaguely recall that back in the mists of time one of the big backup software houses released a version that produced corrupt backups but this wasn't spotted for months until people started to need to use those backups, only to find they had months of meticulously taken backups, none of which actually worked.

      1. John Brown (no body) Silver badge

        Re: Backups

        I once got called to a customer hardware problem. Job: replace the failed tape drive. Got there, confirmed the drive was faulty, swapped it out, set a test backup going. Failed. Tape beyond expiry date, system refuses to use it. Ask for another one. Same. Check their actual backups. Same. They'd expired about a year earlier.

        Turns out the backup was scheduled to run at 9pm, so someone was tasked with swapping out the tape before leaving for the day. It went in ok, but instead of being ejected after the backup completed, it was rejected and ejected at 9pm when the backup process started and immediately aborted. When I reported this to their head IT admin, it turns out the emails were being sent to an account of someone who left two years ago and it was highly likely this was happening at every one of their 120 remote offices. Oops!

        Let's not even bother with asking if they ever tested their backups :-)
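A story like this is an argument for checking the age of the last good backup rather than trusting that failure emails reach someone. A minimal sketch follows, assuming you can collect a "last successful backup" timestamp per site from somewhere; the function name, site names and threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def stale_backups(last_success: dict, now: datetime, max_age: timedelta) -> list:
    """Return the sites whose last successful backup is missing or too old."""
    stale = []
    for site, finished in sorted(last_success.items()):
        if finished is None or now - finished > max_age:
            stale.append(site)
    return stale


# Made-up example: one healthy site, one a year behind, one never backed up.
now = datetime(2024, 1, 10, 9, 0)
status = {
    "leeds": datetime(2024, 1, 9, 21, 0),
    "york": datetime(2023, 1, 9, 21, 0),
    "hull": None,
}
print(stale_backups(status, now, timedelta(days=2)))
```

Silence from a site counts as a failure here, which is the case the misdirected emails hid.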

        1. An_Old_Dog Silver badge
          Facepalm

          Consequences of irregular maintenance

          Back in the day, I was ribbed by my cohorts at one installation for backing my programs up to paper media (punched tape, punched cards) instead of far-more-convenient magnetic tape. My thinking at the time was that punched tape readers -- every ASR-33 Teletype had one -- and card readers were far-more prevalent and standardized than magnetic tape units.

          But there was an aspect I had not thought of. This installation's budget was largely dependent on government grants, and there were a few lean years during which management deferred what previously had been regularly-scheduled hardware maintenance. Eventually we got another big chunk o' money, and the DEC hardware techies were scheduled to come in over the next few weekends. The first weekend they maintained our four TU77 tape drives, which included head alignment. Come Monday, jobs started failing due to tape read errors ... d'oh!

  5. MiguelC Silver badge
    Facepalm

    At least this guy learned from his own and his employer's mistake

    In the late 90's I was called to rescue an occupational health company that had all of their information (and when I say everything, I really mean everything: clients, contracts, test results, payroll, every last bit of business information they needed) in a single 700 MB Access file. A hardware failure crashed the .mdb and their most recent backup copy was over a month old. Unfortunately (for them), we were unable to restore it properly, only managing to salvage parts of tables' content, unlinked to anything else.

    They ended up losing several contracts over the issue, but do you think they learned the lesson?

    Well no, they rebuilt from the backup copy and manually inputted all the missing information from what we'd recovered and their paperwork, keeping everything else as it was....

  6. ColinPa

    Dont touch grandmother

    In the days of 3340's which you could physically pickup and mount/unmount (and looked a bit like the starship enterprise), spinning disks etc.

    One of our testers, who was an operator in a previous job, had had problems with the disk containing the master database for the bank's customers. He called over the senior operator, who said... we had better try it on a different drive in case the drive is suspect.

    It didn't work there either - so it must be the disk. They got out the mother disk. Yesterday's database is copied to a different disk and the batch update run to make today's database (so Mother database begets today's database).

    That didn't work either, so the senior op got out the Grandmother disk from the manager's cupboard. Mounted it - and it didn't work either.

    So they phoned the manager, who said "that's ok - just do not touch the grandmother disk"... "Ahhh, too late," came the response.

    There had been a head crash on the original disk.

    Mounting it on a different disk drive damaged the heads of the second disk drive.

    The mother disk was corrupted by the damaged heads.

    The grandmother disk was then damaged by the damaged heads.

    Fortunately they had a copy of the database which was only a month old, and could reapply the overnight changes which took about a week to do.

    And that's when the tester decided to join our company where he could do less damage.

    1. John Styles

      Re: Dont touch grandmother

      I remember something very like this happening with PDP-11s in the late 80s - a defective head destroying multiple disks. (I personally did manage to destroy a tape, I have to admit, but didn't lose anything).

      1. C R Mudgeon Bronze badge

        Re: Dont touch grandmother

        Yup, and on the Honeywell mainframe at my university. Two drives and three disk packs later...

        Fortunately, I was nowhere near the machine room at the time.

        Twice-daily tape backups were robust, so I don't suppose much data was lost.

        (Perhaps as a result of that incident, but I'm not sure) one guy made a point of eyeballing disk packs before mounting them, looking for dust etc., and taught me to do the same. Not sure how effective that precaution might actually be, but it couldn't hurt...

    2. Data Mangler

      Re: Dont touch grandmother

      Oh dear. That reminds me of an embarrassing moment in an open plan office. I was on the phone to a customer who managed to dig himself into a hole. It was a bad line and I had to speak loud. Suddenly, the entire office went quiet as everyone, by chance, stopped talking. It was into this silence that I heard myself bellow "Have you mounted your grandmother?".

      It took about three seconds for the first chuckle to be heard, but within 15 seconds the entire office was filled with guffaws.

  7. AMBxx Silver badge
    Stop

    Should have used a cloud DB

    Then someone else could have dropped the database for him!

  8. chivo243 Silver badge
    Thumb Up

    Best practice for fsckups

    Put your finger up, and admit you royally screwed up!

    Not the thumbs up, but usually the index finger, shakily...

    1. rototype

      Re: Best practice for fsckups

      Yup, managed to do this twice where I am now and fessed up immediately. First time I was just a contractor and I reckoned I'd be looking for another job next day. Turns out I told exactly the right person, who then shouted over to another tech "Hey, you know that button in X that we should never press..." and all had a sigh of relief that it wasn't one of them that'd done it. They managed to get most of the data back and the rest was rebuilt over the next few weeks but it was noted when I had my interview to go permanent that my handling of the incident was one of the reasons I was being made permanent.

      My other memory of that interview was there was a poodle sitting in on it (yes, a real one - but that's a story for another time).

      Second time I realised within a few seconds what I'd done (removed the access group instead of removing the user from it) and contacted the one person who I knew could do something about it ASAP - as a result no major fallout and another potential hole removed.

  9. Doctor Syntax Silver badge

    Stephen, whether he liked it or not, was, in effect, the DBA for the system. After all, there was nobody else in that role. That applies to anyone else in that situation.

    In that situation he needed to acquire the two essentials. No, not what you're thinking, definitely not those.

    1. The required level of paranoia. (Extreme)

    2. Detailed knowledge of what he's doing.

    Note the order.

    The database is the equivalent of all the paper records the business might have had otherwise. Operating on it is the equivalent of opening all the filing cabinet drawers and peering into them with a lighted candle in one hand and a jug of petrol in the other.

    1. nintendoeats

      You know, when you put it that way, it makes one think that the world needs databases with an immutable change history that is only destroyed in the case of the storage medium itself being destroyed/wiped.

      1. emfiliane

        That's literally what the transaction log is, and it can be stored over the network rather than locally. Sure, some software keeps an immutable ledger at the application level, but those have no protection at the console level; a transaction log is still the only true immutable history of state at any point at that level.

        Prune a transaction log without a tested restorable backup at your extreme peril.
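A minimal sketch of why the log counts as history, in Python with a toy in-memory log. The `LogEntry` and `replay` names are invented for illustration and do not correspond to any real database's log format; the point is only that replaying entries in order rebuilds the state as of any sequence number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    seq: int                     # monotonically increasing sequence number
    op: str                      # "put" or "delete"
    key: str
    value: Optional[str] = None  # only used by "put"


def replay(log, upto_seq=None):
    """Rebuild state by applying entries in order, optionally stopping at upto_seq."""
    state = {}
    for entry in log:
        if upto_seq is not None and entry.seq > upto_seq:
            break
        if entry.op == "put":
            state[entry.key] = entry.value
        elif entry.op == "delete":
            state.pop(entry.key, None)
        else:
            raise ValueError(f"unknown op {entry.op!r}")
    return state


log = [
    LogEntry(1, "put", "invoice-1", "100.00"),
    LogEntry(2, "put", "invoice-2", "250.00"),
    LogEntry(3, "delete", "invoice-1"),
]

print(replay(log))              # state after the whole log
print(replay(log, upto_seq=2))  # state just before the delete
```

Pruning the log amounts to throwing away the front of that list, which is why it is only safe once a tested backup covers everything the pruned entries would have rebuilt.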

      2. Doctor Syntax Silver badge

        It's an interesting concept. What do you do in the event of a legal requirement to remove some data, for instance a data subject right to be forgotten request?

        It's the potential of destruction of media, or even the entire H/W that needs to be dealt with. Before I moved into IT I'd had the experience of my workplace being bombed (fortunately not very effectively) and burned (rather more effectively) so my subsequent thinking was more in terms of getting regular backups into a fire safe and preferably off-site.

        1. Killfalcon Silver badge

          There's a few approaches.

          First is make sure that your transaction logs include the removal - that way, any restoration done using the logs will comfortably re-delete the data.

          Second depends a lot on the business justification clauses of the relevant laws. Like, "will the Tax Office expect these records to be available" is a question my lot ask themselves, and if the answer is no then you can prune the transactions. If it's yes, then the logs will have to stay for the usual period (varies by what you're doing, but IME it's rare to have any valid use for identifiable customer data older than 7 years or so).

          I've talked with a friend where all customer-identifiable stuff is hashed, and a master table is used to decrypt each individually. To delete a customer, you just kill their line in the hashing table, and bam, every single bit of information that's PII is inaccessible to you. This means you can keep policy numbers or invoice dates/amounts to make your accounts add up, and just lose the customer's name/address/DoB/etc. You do pretty much need to build everything around it, mind, it's not easy to retrofit.
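A rough sketch of that third approach, assuming it boils down to one secret key per customer held in a master table. Everything here is hypothetical: `CustomerVault` is an invented name, and the XOR keystream is a toy stand-in for a properly vetted cipher, so treat it as an illustration of the bookkeeping and not as usable cryptography.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key + counter). Illustration only, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


class CustomerVault:
    def __init__(self):
        self.keys = {}     # the "master table": customer id -> secret key
        self.records = {}  # encrypted PII, as it would sit in tables, logs and backups

    def store(self, customer_id: str, pii: str) -> None:
        key = self.keys.setdefault(customer_id, secrets.token_bytes(32))
        self.records[customer_id] = _xor(pii.encode("utf-8"), key)

    def read(self, customer_id: str) -> str:
        key = self.keys[customer_id]  # raises KeyError once the customer is forgotten
        return _xor(self.records[customer_id], key).decode("utf-8")

    def forget(self, customer_id: str) -> None:
        # Deleting one row in the key table makes every copy of the record unreadable.
        self.keys.pop(customer_id, None)


vault = CustomerVault()
vault.store("c42", "Jane Doe, 1 High Street")
print(vault.read("c42"))
vault.forget("c42")
print("c42" in vault.records)  # the ciphertext is still there, just useless
```

The catch is that the key table itself now needs the careful backup and deletion handling that the rest of the data no longer does.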

          1. nintendoeats

            I think there is an issue with having the removal in the transaction log, since either the data still exists in the transaction log (in which case it has not truly been deleted) or the data is truly destroyed, in which case we are right back where we started with a mutable history.

            There will always be friction between the goal of "make it impossible to accidentally permanently delete things" and the legal requirement of "permanently delete certain things".

            1. Killfalcon Silver badge

              Yeah, approach 1 is debatable. The way the law is written you can retain business necessary info, and right now there's not that much case law. It's possible the ICO would accept "we retained a restore in case of subpoenas, but the data is unavailable to normal processing", it's possible they might not. Likely depends on the nature of the business.

        2. nintendoeats

          See my comment above on your first point.

          Of course destruction of media is not an issue that can be solved in the database itself (well ok, there is stuff like ECC, but that's not a cure-all). I see it more as, let's separate the concerns of database interaction fuckups and hardware fuckups. Defense in-depth and all that.

      3. Binraider Silver badge

        WORM tape backups have their uses.

        Quite popular in the legal profession for this reason in fact.

    2. C R Mudgeon Bronze badge

      That's a suitably sphincter-tightening analogy. I'll be borrowing it, should the need arise.

    3. C R Mudgeon Bronze badge

      What could go wrong?

      "1. The required level of paranoia. (Extreme)"

      Indeed.

      My home backups take place ad hoc, since the backup server doesn't run 24/7. I power up the latter and run the script when I think of it.

      One time I set out to transplant my laptop's drive into a new laptop. Take the several hours to back it up first? "It's a simple thing I'm doing. What could go wrong?" But, being the impatient but paranoid sort I am, "OK, _fine_," I sulked at myself. "I'll do the right thing: run a backup overnight, and move the drive in the morning."

      What could go wrong? Post-move, the drive failed to spin up. Nor when reinstalled in the old laptop. Dead, kaput, pining for the fjords, yada yada.

      No new lesson was learned, but suffice to say, one that had previously been learned the hard way was powerfully reinforced.

      Side note: introductions to computers often talk about the CPU being a computer's brain. Whatever. The HDD (or latterly, SSD) is its soul.

      1. Tim99 Silver badge
        Windows

        Re: What could go wrong?

        I'm retired. This year, I have successfully got all of our IT down to two iPhones, two iPads and one iMac with some stuff in the cloud. The only content stored locally on the iPhones and iPads has been downloaded from the cloud. The iMac has three 16-day rotated Time Machine backups (one in the firesafe), two monthly rotated Carbon Copy Clones (one off-site), and two monthly rotated file copies of the cloud (one in the firesafe), and one disk with everything copied annually off-site. I replace 1-2 disks a year. Total cost about £1,000.00. Paranoid - Me? No, I've just been using IT professionally for 51 years...

      2. Trixr

        Re: What could go wrong?

        Slight modification - its memory.

        The brain may function, but like someone with Alzheimer's, if the storage or connectivity to the storage is no good, things fire up but the inputs are all scrambled/meaningless.

      3. Doctor Syntax Silver badge

        Re: What could go wrong?

        "My home backups take place ad hoc, since the backup server doesn't run 24/7. I power up the latter and run the script when I think of it."

        I recommend a Pi, a large USB drive and NextCloud with the NC client software set up to sync all the directories where you might put things. The server has an area shared with SWMBO so that when I put together her class notes PDFs she can see them to email out to the class.

        I still recommend backing up your /home partition or whatever other location your OS keeps your data on before doing anything drastic.

  10. Anonymous Coward
    Anonymous Coward

    Not a database - the whole system

    I recall, many years ago, taking over IT admin for a small organisation that ran everyone off a single server. When the original IT manager left, the General Manager passed IT admin over to the office manager. She was sent on a three-day course and, on her return, followed advice given and updated all the system passwords. Backup tapes were changed daily and weekly and stored in a fireproof safe - but never actually tested (neither did she look at the backup logs). Following an IT issue she couldn't solve I was handed the IT role (part time as I was actually employed in a non-IT role). After fixing her problem (it must have been straightforward, as I don't recall what it was) I started to delve into system logs and quickly spotted errors reported for backups. Further delving and it transpired she'd not given the backup program the updated passwords and all the backup tapes for the six months she'd been in the IT role were substantially blank. At least I wasn't going to need to buy any new tapes for a while, once I got the backups running properly...

    1. Doctor Syntax Silver badge

      Re: Not a database - the whole system

      We keep coming across these things.

      Backup consisted of an overnight copy to the hot standby at the other end of the site. I don't remember the details - maybe it was a change of permission that allowed/disallowed read access to the backup UID - but the backup would be terminated before the morning shift started. For a very long time the overnight slot hadn't been long enough for a complete copy and nobody had checked...

      Fortunately this was belt and braces - there was also a tape copy but I'm not sure the tape formats were compatible between the two machines.

  11. Anonymous Coward
    Anonymous Coward

    Drop and go

    Anon as I work for the company still and so do the other people.

    There was an issue with an application running on a database where it hadn't imported data for some days, our overseas team that wrote the software gave a bunch of fix steps to run through the first of which involved "drop", the rest of it "worked". The person running the fix steps is not a DBA and does not know what the commands meant, was just told, it would fix it.

    Cue the customer saying, "there's no data".

    Cue the restore and then replaying multiple days worth of data that took a couple of days to complete.

    DBA rights promptly removed that day to everybody that was not in fact an actual DBA.

    Also, any "run this list of commands" steps issued by any other team were reviewed by multiple people to determine what they would actually do.

    Fortunately, I did not get sucked into that maelstrom, just managed to snigger from afar.

    1. Tom 7

      Re: Drop and go

      It's always amazed me the number of people running DBs that don't seem to understand they can pretty much disallow almost anything in their DBs. It's almost as if management and sensible security are somehow at odds with each other.
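On a server database the usual tool for this is roles with GRANT and REVOKE. As a small runnable stand-in, here is a sketch using Python's built-in `sqlite3`, where the nearest equivalent is opening the file read-only through a URI. The `shop.db` file and the `sales` table are made up for the example, and the URI form assumes a POSIX-style path.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shop.db")  # throwaway demo database

# Admin connection: used once, for schema setup, then closed.
admin = sqlite3.connect(path)
admin.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
admin.execute("INSERT INTO sales (amount) VALUES (9.99)")
admin.commit()
admin.close()

# Reporting connection: read-only, so it cannot write, delete or drop anything.
reporting = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
print(reporting.execute("SELECT COUNT(*) FROM sales").fetchone()[0])

try:
    reporting.execute("DROP TABLE sales")
except sqlite3.OperationalError as exc:
    print("refused:", exc)
```

The same shape applies elsewhere: the application and the humans following fix steps get a connection that can only do what their job needs, and the account that can DROP things is kept for the people who know what DROP means.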

      1. Anonymous Coward
        Anonymous Coward

        Re: Drop and go

        One of our customers has a database which stores, well, basically everything, including all the sales for both their website and physical stores.

        It has one, single, user, which has full rights to everything. The PoS software uses this admin to write its sales into the DB, as does their web store, and everything else around the company.

        Fuck knows if it's even being backed up anywhere.

        1. Trixr

          Re: Drop and go

          If it's anything like where I work, yes, it is being backed up. Into a backup which has never been tested, while the backup account is an AD Domain Admin.

          I was actually shocked to find that the DB service didn't use the same cred. No, the "SQLService" account is running ALL the DBs in the entire org, multiple DB farms, scores of servers and applications.

          Yes, I have got copies of the emails where I've pointed this out at length to managers and security team, multiple times.

      2. yetanotheraoc Silver badge

        Re: Drop and go

        "they can pretty much disallow almost anything in their DBs"

        Quite. Stephen failed to back up the DB probably because he was not logged in as a backup operator.

        "Surely I'm smarter than this issue," Stephen thought to himself.... Nope. You don't know what you don't know. Once the software has demonstrated your ignorance to you, it's time to step back and RTFM (or consult the greybeard, call the vendor, etc.). When I was a callow youth I (oh so delicately) brushed the horse fence with the backhoe. The owner was understandably angry, but gave me a valuable piece of advice: "When you are unfamiliar with the equipment, GO SLOW!"

      3. vogon00

        Re: Drop and go

        It's almost as if management and sensible security are somehow at odds with each other.

        That's because they are. It's rare to find a gaffer that sees 'security' as an asset instead of an expense, especially in budget-strapped SME-land.

  12. Anonymous Coward
    Anonymous Coward

    Ahhh.

    Like the time I received a call from a customer telling me our software wouldn't talk to their database. The conversation went like this:

    Me: Can you see if the database is running.

    Customer: It seems to be.

    Me: Hmm. OK. What processes can you see.

    Customer: a couple of ksh processes.

    Me: That's not your database. Can you try and start it.

    Customer: I don't know how. I don't usually do this job.

    Me: OK - cd into this directory....

    Customer: I can't - it's not there.

    Me: Huh? Where is it??

    Customer: Well, we had run out of space so I deleted some files....

    Yep - the customer had deleted the entire database. Not only that, they hadn't had a successful backup for several weeks!

    1. Killfalcon Silver badge

      Not quite as bad, but I had a manager "tidy up" some things - rename the folders to the department's new name, arrange things in a more logical and clear fashion, making things vastly easier to find and navigate... and breaking hundreds of links between documents in the process.

    2. J.G.Harston Silver badge

      "Well, we had run out of space so I deleted some files"

      That is sometimes (well, often in my experience) caused by crappy application software. It does:

      errorhandler: if err=cantsave print "The disk is full - delete something"

      instead of

      errorhandler: reporterror; if err=cantsave print "Couldn't save"

      I had this where the underlying error was "user account run out of allocated space", *NOT* "disk full". The solution was to credit the user with more space, not to go trawling the disk desperately deleting things, and wondering why the damn thing STILL wouldn't work with a disk 99.99% empty.

      1. el_oscuro

        That is just shitty error handling - I fail shit like that in code review all the time. Why not?

        errorhandler: if err=cantsave print $actual_error_message

        Swallowing error messages is probably the biggest source of bugs these days.

        1. J.G.Harston Silver badge

          And get rid of the IF as well. Always report the actual system-generated error message, and then *afterwards*, *if* there is a need to do so, check for any specifics.
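Something along these lines, sketched in Python. The `save` function and its message wording are invented for the example; the errno names are the standard ones, and `EDQUOT` is looked up defensively because not every platform defines it.

```python
import errno


def save(path: str, data: bytes) -> str:
    """Write data to path; on failure, always report the real OS error first."""
    try:
        with open(path, "wb") as fh:
            fh.write(data)
    except OSError as exc:
        code = errno.errorcode.get(exc.errno, str(exc.errno))
        # Report the system-generated error unconditionally...
        message = f"Couldn't save {path}: [{code}] {exc.strerror}"
        # ...and only afterwards add advice for specific, known cases.
        if exc.errno == errno.ENOSPC:
            message += " (the disk is full)"
        elif exc.errno == getattr(errno, "EDQUOT", None):
            message += " (the user's quota is exhausted; freeing disk space will not help)"
        return message
    return f"Saved {len(data)} bytes to {path}"


print(save("/nonexistent-directory/report.txt", b"hello"))
```

With that shape, the "user account run out of allocated space" case gets its own message and nobody goes deleting files from a disk that is nearly empty.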

  13. Anonymous Coward
    Anonymous Coward

    Used to work for a company (a long time ago) that religiously pigeon-holed roles and responsibilities. Even down to scripts that did something as simple as a backup!

    So I wrote a script that backed up data in the following way.

    - Write a header describing the backup contents

    - Write my application files

    - Call a DBA written script to back up the database

    - Write a trailer file to show the backup was complete

    Worked great for months and months. Then one day we had a problem that required a restore. Off I trotted to the tape store, put in the tape, read the content expecting to see a header, my stuff, a DB dump, and a trailer file - only all I could see was the trailer file. Odd! So I go back 3 MONTHS - all the same problem. So I am confused. I changed nothing. I verified it all worked fine during testing and validated for weeks after it went live. I was in for getting a bollocking....

    I did a bit of investigation and finally found that one of the DBAs had made an unauthorized change to their script, and made the tape device a rewind device instead of a no-rewind device. After his DB backup, it rewound the tape to the beginning just for me to write a trailer file!

    1. John Riddoch

      Oh, that old chestnut - many, many people have had useless backups because of a missing "n" in the device file....

    2. el_oscuro

      This is why you practice restores all the time. Use them to create standby databases, or simply restore them to /tmp or somewhere to make sure you can. If you haven't practiced restores recently, you don't have backups.
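A minimal sketch of a backup that tests itself, using Python's built-in `sqlite3` as a stand-in for whatever database you actually run. The `backup_and_verify` function and the `invoices` table are invented for illustration, and the table name is assumed to be a trusted identifier, since it is interpolated into the SQL.

```python
import os
import sqlite3
import tempfile


def backup_and_verify(source: sqlite3.Connection, backup_path: str, table: str) -> bool:
    """Back up `source` to backup_path, then reopen the copy and check it."""
    dest = sqlite3.connect(backup_path)
    with dest:
        source.backup(dest)  # SQLite's online backup API (Python 3.7+)
    dest.close()

    # The restore test proper: open the backup as a fresh database and look inside.
    restored = sqlite3.connect(backup_path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        expected = source.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        actual = restored.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        restored.close()
    return integrity == "ok" and expected == actual


live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")
live.executemany("INSERT INTO invoices (total) VALUES (?)", [(10.0,), (20.0,)])
live.commit()

backup_file = os.path.join(tempfile.mkdtemp(), "backup.db")
print(backup_and_verify(live, backup_file, "invoices"))
```

A row count is a crude check, but it would have caught a tape holding nothing but a trailer file on day one instead of three months later.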

  14. An_Old_Dog Silver badge
    Megaphone

    In-Shop Backups

    I worked at a PC shop where we had a policy of having people sign a waiver stating they'd made a full backup of their system before we'd work on it, or, if they hadn't, we'd make one for them at standard shop rates.

    One day I found myself in a race with a plug-your-ears, shrieking-bearings, dying Miniscribe 3650 hard disc to complete the backup before it froze. I think I won -- the restore to a new hard drive completed, a CHKDSK showed no file system inconsistencies, and the customer didn't call back complaining about missing or corrupted files ...

    (Icon for excessive noise ...)

    1. Anonymous Coward
      Happy

      Re: In-Shop Backups

      Not specifically DB related, but when I was on tech support for Dixons/PC World virtually every computer for a while*** was supplied with an unimaged recovery sector and disk-imaging utility, and no Windows disk. It was a cost-cutting exercise so retail prices could be kept down (or profits kept up, depending on your level of cynicism).

      The first time each machine was powered up it asked you to create the recovery image, which in most cases was burned to CD/DVD. However, you could cancel it. And most did.

      If I remember, it was time-sensitive, and you could only create the image for so long after initially registering Windows.

      I can count on the fingers of one hand (exaggerating) the number of people who'd actually done it, and the ones who had often only did so after they'd called in early on and I warned them to do it pronto (and stick the disk in a sleeve, then in a box and a safe place, but not in the cutlery drawer, because it was rather important). Most didn't.

      And don't get me started on the reliability of the imaging process when run, or of the reliability of the alleged image when used later in many cases. We had to send out replacement master disks regularly.

      *** And in a different phase of machines, the recovery sector simply remained on the hard drive, which was fine unless the disk failed. Worth noting that one of the main parts replaced by us was... hard disks.

      1. Korev Silver badge
        Pint

        Re: In-Shop Backups

        My family actually did this... to floppy disc! There must have been about 50 discs...

        Do I get one of these as a reward -->

        1. This post has been deleted by its author

        2. An_Old_Dog Silver badge
          Angel

          Re: In-Shop Backups

          If you have to back up an MS-DOS PC to floppies, Fifth Generation Systems' "FastBack" program is your friend.

    2. Tom 7

      Re: In-Shop Backups

      We had one dept in a dark corner of the building that ran their own little obscure system and ran their own backups to a tape machine. When called on to help with a problem I was rather surprised to see the absolutely drop dead gorgeous girl in charge of backups pop the tape out and put it back in again. Turned out they'd only got one tape and when the system got big enough to need two tapes they just popped the tape out and put it back in again when the machine asked for Tape 2. Not surprisingly the problem was solved and due to the application of the absolutely drop dead gorgeous girl in charge of backups protocol the whole thing was solved with immense dedication and thoroughness. Indeed the protocol even now slows my typing to a crawl as no PFY or even BOFH wanted to leave her presence faster than necessary or even consider incurring her displeasure. In her 'defence' she was actually very good at her job bar the backups, which were probably handed over to her to prevent PFYs getting lost or malfunctioning for the rest of the week.

  15. Prst. V.Jeltz Silver badge

    This writer well remembers an occasion several decades ago where a DBA was so convinced that a running database meant database files would be locked that a swift DEL *.* could be used to remove redundant files. There were no backups then either.

    Man, that guy had some balls!

    1. Doctor Syntax Silver badge

      But few brains.

      1. Charlie Clark Silver badge

        For a while it was common for the RDBMS to own the file system, some even had their own file systems… While that didn't make backing things up any easier, it also made deleting stuff harder.

        I've hosed a Postgres DB in my time (always able to restore the data) and was surprised at how well I was able to restore from a disk backup.

        But what got me about the story is that the update instructions included a DROP DATABASE. That's a mistake in any language. At the minimum there should be backup and restore steps. Don't touch it if there aren't! But, also, where was the test? No production system should be changed until everything has been shown to work on the test. If manglement won't approve this then it's time to get out because you will get the blame when it inevitably does go wrong.

    2. MiguelC Silver badge

      careless deleting of random stuff

      I remember sometime ago a co-worker learning that deleting records from an in-memory array meant it could also delete them from the database, depending on the parameters used for creating the damn thing.

      Not a fun weekend for him, recovering information from backups and transaction logs but at least he learned, as did others (by example, natch), that everything should be thoroughly tested before deploying in production, even seemingly small changes.

    3. An_Old_Dog Silver badge
      Coat

      Back it Up

      One of our clients had an MSDOS-based computer, with a 40MB streaming tape unit. We showed the PC user, an accountant, how to make backups, and how to test them. He nodded happily, and thanked us for our efforts.

      Three years later, their offices were broken into, and that PC was physically stolen. Turns out, he'd made a grand total of ZERO backups. They fired him, but that didn't fix their problems.

      (icon for "being shown the door")

  16. Prst. V.Jeltz Silver badge

    Being a careful chap, he also took a copy of the directory of the application from the Programs Folder. Just to make sure he had a copy of the data.

    When I first read that I assumed he knew for damned sure the file(s) he was backing up was the file(s) holding the data.

    Seems not. You'd think he'd know by the size if nothing else.

  17. BenDwire Silver badge
    Facepalm

    Needs rebooting ?

    Software that I don't touch save when there's an issue, needs rebooting, etc

    I ran our entire company ERP system off a Postgres database, which never needed rebooting in all the years it was in use under my control. I was a bit perplexed until the mention of the 'Programs Folder' when it twigged that it must have been running on a Windows machine.

    In my experience Postgres is a fantastic bit of software that 'just works', especially if running on top of a stable Linux distro. Backups were automated and restored daily as an almost live 'play' system, allowing authorised people to try things out to see if it was going to work as they might hope. After I retired, the new BOFH also kept the transactional data during the day allowing much finer backups if required (they weren't) and I understand he's now virtualised everything too, without a hiccup.

    I'm no DBA, but have a long chequered past involving dBase III programming, Access & MySQL. I can RTFM. There is simply no excuse for not reading the manuals and working out the basics before putting any valuable data in there. Cost is no excuse either as we even used a Raspberry Pi to host a copy on Postgres while working from home.

    1. Doctor Syntax Silver badge

      Re: Needs rebooting ?

      "kept the transactional data during the day allowing much finer backups if required (they weren't)"

      But this is why we do backups including the transactional logs. You do them in case they're required for production. A restore to a test system is a bonus. You always hope they never will be required but it's knowing you can do a restore up to the last checkpoint or, preferably, up to the last commit that lets you sleep at night.

      1. BenDwire Silver badge
        Thumb Up

        Re: Needs rebooting ?

        It wasn't a criticism at all, and I was happy to see him get round to doing stuff I'd never had time to do. (For the record, I was running the company and tended to the IT side in my 'spare' time). Like many small companies we never had the spare cash for a dedicated BOFH - until the rise of Win10, and I decided that I'd had enough.

        Backups are like most insurance policies: essential, but you hope you never have to use them. And bad backups are like cheap insurance ...

  18. Anonymous Coward
    Anonymous Coward

    Backup

    A smart man creates a backup before doing something dangerous.

    A wise man tests his backup before doing something dangerous.

  19. TeeCee Gold badge

    That's "drop"...

    ...as in "Grand Piano" and "Lift shaft".

  20. Anonymous Coward
    Anonymous Coward

    Saved by my own incompetence

    I was once doing support for a CMS on the side.

    Said CMS was never designed for large numbers of users and got seriously bogged down at one occasion. I found that the performance issues prompted many users to create multiple accounts (they never received a confirmation, but user records were still created), and that further slowed the system down.

    So, I had the bright idea to clear out those half-broken accounts until things would ease up a little.

    Opened up the SQL query tool on the server and prepared a DELETE command.

    Sent the command off and then it dawned on me that I had forgotten the WHERE clause...

    Imagine my relief when the query tool refused to execute the command because it was missing the semicolon at the end.

    1. David Hicklin Bronze badge

      Re: Saved by my own incompetence

      "Opened up the SQL query tool on the server and prepared a DELETE command.

      Sent the command off and then it dawned on me that I had forgotten the WHERE clause..."

      Which is why I always do a select blah to check the data that will be affected before doing the delete.
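That habit can be made mechanical. A sketch in Python's `sqlite3`, where the `users` table and the `guarded_delete` helper are hypothetical: count with the same WHERE clause first, refuse an empty WHERE, and do the delete inside a transaction that rolls back if the numbers don't match.

```python
import sqlite3


def guarded_delete(conn, where_sql, params, expected_max):
    """Delete from `users` only if the same WHERE clause matches a sane number of rows."""
    if not where_sql.strip():
        raise ValueError("refusing to DELETE without a WHERE clause")
    matched = conn.execute(
        f"SELECT COUNT(*) FROM users WHERE {where_sql}", params
    ).fetchone()[0]
    if matched > expected_max:
        raise RuntimeError(f"{matched} rows match, expected at most {expected_max}; nothing deleted")
    with conn:  # commits on success, rolls back if an exception escapes the block
        deleted = conn.execute(f"DELETE FROM users WHERE {where_sql}", params).rowcount
        if deleted != matched:
            raise RuntimeError("row count changed between SELECT and DELETE; rolled back")
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO users (email, confirmed) VALUES (?, ?)",
    [("a@example.com", 1), ("b@example.com", 1), ("c@example.com", 1),
     ("d@example.com", 0), ("e@example.com", 0)],
)
conn.commit()

print(guarded_delete(conn, "confirmed = ?", (0,), expected_max=10))
```

It won't save you from a wrong WHERE clause that happens to match a plausible number of rows, but it does catch the forgotten one.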

  21. aerogems Silver badge
    Facepalm

    Boooo!

    What self-respecting male asks for instructions!? That's grounds for instant termination of one's man-card!

    But I was peripherally involved in a similar situation way back when I was still in school. By virtue of being the offspring of someone involved, I wound up as tech support for a small non-profit (literally two part-time people and me on an as-needed basis). The government program they received funding from required that all the information be entered into this custom DOS-based DB app. It was a real turd as you might imagine from something created by a government organization that didn't specialize in software. IIRC, the entire thing ran off a single floppy drive. Being in my early teens the idea of a backup didn't even occur to me, and this was still very much the era of the 3.5in floppy. The exact series of events eludes me now, but one day I get a call because there's a message on the screen congratulating the user on creating a brand new database, and wanting to know if there was anything that could be done to restore the old data. Even if the raw data was still sitting around somewhere, the software designers apparently never considered the possibility that people might want to load in a database from a file, so many hours of tedious manual entry ultimately had to be redone.

  22. Anonymous Coward
    Anonymous Coward

    WTF

    Even if he wasn't a DBA by trade, the fact the backup failed when tried manually should have raised red flags a mile high. He clearly wasn't "smarter than the issue" and mea culpa or not, he should have been out on his arse for gross negligence after blindly continuing with instructions he thought he vaguely remembered.

  23. Boris the Cockroach Silver badge
    Facepalm

    Every time I see

    'drop' and 'backup', it brings back one of the programmers doing a recreation of the "Odessa steps" sequence, only instead of a pram it's one of the laptops going down the stairs from the conference room.

    Bang

    Bump

    Bang

    Bump

    Crash

    Of course we have backups.... new laptop duly found, CAD software installed, just link it to the server and away we go ....

    "What do you mean? you can't find your models or files? where were they?...... SAVED ON THE DESKTOP OF YOUR OLD LAPTOP???!!!!!!!! FFS you are kidding.. what about the directory on the server where everyone else saves stuff?"

    Cue retrieving dead laptop from skip.. and pulling the HDD for copying

    1. jake Silver badge

      Re: Every time I see

      "Cue retrieving dead laptop from skip.. and pulling the HDD for copying"

      What kind of company throws a laptop in the skip without destroying the HDD?

      1. Anonymous Coward
        Anonymous Coward

        Re: Every time I see

        One with a last-ditch skip-stored data recovery plan, natch.

  24. el_oscuro
    Mushroom

    As a 30 year Oracle DBA

    The very first time I logged into a production Oracle database - was to restore it for a client I had never worked for. It seemed that Oracle support had sent their recovery consultant and he said they would lose 2 years of data and there was nothing that they could do. So the client called me and asked for help. I asked different questions and got a full restore and recovery.

    A year later, I joined Oracle and was at an awards ceremony where that consultant was awarded "fireman of the year".

    Since then, I have restored many other DBA mistakes when they didn't have proper backups. But with Oracle, as long as you have archive logs, you can probably restore. I have never failed to restore a production database.

    That is because I practice restores all the time as part of routine operations. Because, if you haven't practiced restores recently, you don't have backups.

  25. disgruntled yank

    drops

    Perhaps fifteen years ago, I got a call from a co-worker:

    Co-worker: I tried to query IMPORTANT_TABLE and the system says it doesn't exist.

    Me: [After a quick check] It doesn't.

    Co-worker: ???

    Me: [After another quick check] As a matter of fact, you have about three tables left in your schema.

    This was Oracle 8 or 9, though, and it was quick to get them back from the recycle bin.

    Well before that, some co-workers found that ANOTHER_IMPORTANT_TABLE in a different database kept disappearing. I don't know how that happened, but I wrote them a DDL trigger that would raise an error if one tried to drop certain tables.
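The Oracle DDL trigger itself isn't reproduced here. As an analogous sketch in Python's built-in `sqlite3`, the authorizer callback can play the same role of refusing DROP on named tables. The table names and the `deny_protected_drops` function are made up for the example.

```python
import sqlite3

PROTECTED_TABLES = {"important_table"}  # hypothetical names for this sketch


def deny_protected_drops(action, arg1, arg2, db_name, source):
    """Authorizer callback: refuse DROP TABLE on protected tables, allow the rest."""
    # For SQLITE_DROP_TABLE, arg1 carries the table name.
    if action == sqlite3.SQLITE_DROP_TABLE and arg1 in PROTECTED_TABLES:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE important_table (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE scratch (id INTEGER PRIMARY KEY)")
conn.set_authorizer(deny_protected_drops)

conn.execute("DROP TABLE scratch")  # allowed
try:
    conn.execute("DROP TABLE important_table")
except sqlite3.DatabaseError as exc:
    print("blocked:", exc)
```

Unlike a server-side trigger, this only protects connections that install the callback, so it is a guard against accidents in your own code and not a security boundary.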

    1. Killfalcon Silver badge

      Re: drops

      I had a similar issue (IMPORTANT_TABLE gone AWOL) happen once, and was *remarkably* glad that it was actually some arcane failure in our TNS.

      Sure it took a week for tech support to un-pluck that turkey, but the table still existed for everyone else, I just couldn't see it from my machine.

  26. Rtbcomp

    Two Backup Problems

    1) Upgraded a customer to the latest version of MSDOS. Did a backup, installed the new version, did a restore. Only I didn't do a restore because the format of the backups wasn't compatible between the two versions of MSDOS.

    2) Customer running two businesses on 2 MSDOS computers using the same software. Unfortunately he was using the same set of floppies for each, so the only backup he had was No2 machine as that backup had overwritten the No1 machine's backup. Luckily he didn't have to restore No1 machine before I discovered what he was doing. Why did he only use one set of disks? Because he was a tight-fisted accountant.

  27. Anonymous Coward
    Anonymous Coward

    Data snapshot shortcuts

    I knew a VMware admin whose backups were VM disk snapshots. Luckily I was a mistrustful fellow and had 2 months prior requested the vCenter metadata export and offsite+offline full diskdump of the data stores.

    Since I used to do daily test reports I had all infra details to the hour mark ... Come the inevitable VMware patch and subsequent data disk crash ... We spent near 5 weeks restoring config and infra with 2 month old data ... New stuff we copped up to data loss.

    Moved out after and site was running 2 years unattended due to good rotate and cleanup and auto recovery scripts we developed and configured.
