back to article DBA made ten years of data disappear with one misplaced parameter

Welcome once again, dear reader, to Who, Me? – the cathartic corner of The Register wherein, once a week, we hand over to our readers, such as yourself, so that they may unburden themselves about times when things did not quite go according to plan. This week's confession comes from a reader we'll Regomize as "Larry". Some …

  1. A.P. Veening Silver badge

    An Oracle DBA regomized as "Larry"? Somehow that smells suspect.

    1. b0llchit Silver badge
      Coat

      That smell has sailed.

    2. Anonymous Coward
      Anonymous Coward

      thatsthejoke.gif

    3. Jou (Mxyzptlk) Silver badge

      He lost ten years of his life by walking into that dark ally.

  2. frabbledeklatter

    Little Larry Tables

    https://xkcd.com/327

    1. wolfetone Silver badge

      Re: Little Larry Tables

      We've all been Little Larry Tables at some point.

      1. Korev Silver badge
        Coat

        Re: Little Larry Tables

        Dataloss, the SQL...

  3. David Harper 1

    This is why we ALWAYS test new procedures on a COPY of the production database

    Even if we are old and grizzled DBAs. Because anyone can screw up.

    1. Doctor Syntax Silver badge

      Re: This is why we ALWAYS test new procedures on a COPY of the production database

      Not even it but because. We also COUNT the number of rows that will be affected. But is Oracle so slow that two month's worth of rows couldn't be SELECTed?

      1. Prst. V.Jeltz Silver badge

        Re: This is why we ALWAYS test new procedures on a COPY of the production database

        I'm becoming increasingly disgruntled with using the "row" as a unit of data . Its vague at best .

      2. aerogems Silver badge
        Coat

        Re: This is why we ALWAYS test new procedures on a COPY of the production database

        Oh just DROP it already. No one needs to COMMIT to SQL puns to that degree.

    2. Gene Cash Silver badge

      Re: This is why we ALWAYS test new procedures on a COPY of the production database

      This is why we have production instances and development instances.

    3. Jou (Mxyzptlk) Silver badge

      Re: This is why we ALWAYS test new procedures on a COPY of the production database

      > Even if we are old

      No, BECAUSE we are old and experienced.

      1. DropBear
        Devil

        Re: This is why we ALWAYS test new procedures on a COPY of the production database

        No, BECAUSE we are old and experienced rightfully paranoid. FTFY....

    4. Mage Silver badge
      Alert

      Re: on a COPY of the production database

      Ideally on a separate computer that's accessed directly with its own keyboard and mouse (No KVM switch etc) and the network unplugged, because you never know :(

      Thinks, am I accessing the test server in the basement of the office in another country, or the live co-located server in the data centre? I know, i'll copy the data to a VM on my own laptop just to be sure to be sure.

      1. Nick Ryan Silver badge
        Coat

        Re: on a COPY of the production database

        Don't forget to wipe your laptop instead of the VM.... :)

  4. Pascal Monett Silver badge

    "Thankfully [..] there were backups"

    With actual DBA's around, I should hope so. This was not the modern days of "move fast and break things" which, funnily enough, I don't they use too much as far as critical data is concerned.

    At least, I hope so.

    1. phuzz Silver badge
      Thumb Up

      Re: "Thankfully [..] there were backups"

      "If you can't restore from a backup, then your data isn't backed up", so by testing their restore procedures, Larry was actually helping.

      (Tongue partially in cheek, because in my last job, if I had to restore from a backup for someone, I'd also tick off my "tested restores" task for the month)

      1. Shalghar Bronze badge

        Re: "Thankfully [..] there were backups"

        Dont call yourself a jockey or cowboy when (schlock) mercenary rules apply surprisingly well.

        17. The longer everything goes according to plan, the bigger the impending disaster.

        41. “Do you have a backup?” means “I can’t fix this.”

        43. If it's stupid and it works, it's still stupid and you're lucky.

        https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maximally_Effective_Mercenaries

  5. An_Old_Dog Silver badge
    Joke

    Appropriate Mis-quote

    "You keep using that keyword. I do not think it means what you think it means."

    1. Anonymous Custard Silver badge
      Trollface

      Re: Appropriate Mis-quote

      Inconceivable!

  6. Pete 2 Silver badge

    Walking on the live rail

    If testing new procedures on the live database is normal in "Larry's" company, I expect the database backup/restore process gets a lot of use.

    Let's hope for their victims customers sake, someone doesn't live-test a flaky modification to the backup process.

    1. Turkey_Bender

      Re: Walking on the live rail

      Remember- everyone has a test environment. Some people are lucky enough to have a dedicated production environment too!

  7. hh121

    As a wise man once said, if you don't have a test system, what you actually don't have is a production system... Truncate was one of the things that caused a sharp intake of breath whenever I hit the return key. I've done worse though.

  8. Michael H.F. Wilkinson Silver badge

    Ouch!

    One more case of "saved by the backup". Last time I lost some important data (only my own LaTeX sources of slides I use for teaching, not other people's work) was when my laptop was stolen and BOTH recent backup disks failed. I had some older back-ups (missing the more recent additions to the sources), and the PDF files of the slides, but it does go to show you can never have too many backups. I now do my LaTeX editing in Overleaf, make regular copies to my (backed up) work machine, AND on my personal laptop AND copy those to two solid-state disks regularly. I do not doubt I will at some point lose stuff.

    1. ibmalone

      Re: Ouch!

      Last time I lost some important data (only my own LaTeX sources of slides I use for teaching, not other people's work) was when my laptop was stolen and BOTH recent backup disks failed.

      "Once is happenstance. Twice is coincidence. Three times is enemy action." - Ian Fleming

      1. jake Silver badge

        Re: Ouch!

        "We have met the enemy, and he is us." —Walt Kelly

  9. CowHorseFrog Silver badge

    sad thing is the IT world is full of idiots, they might not lose 75G of data on adaily basis but they are stupid in so many ways.

    At my work, theres a non prod environment. Genius keep moving application and clusters around to "save" aws resources. Im not sure how much they "save" but its far less than the "time" they waste because stuff is down, people costing thousands a day are waiting, and i havent mentioend the time wasted because urls and more "change". Naturally they then "change" things again a few weeks later.

    1. Doctor Syntax Silver badge

      If they didn't have all that busy-work to do somebody might notice that they're not really needed.

  10. Bebu Silver badge
    Windows

    Daft?

    《 the annual report ... required .... backing up the most recent two months, truncating said months, running an ad hoc job to repopulate it so a report can be generated, then finally restoring from the backup.》

    The last bit is a tad confused but the gist seems to be that annual report was generated two months after end-of-year.

    So I would guess that the code generating report was repurposed from a specified start-of-year to current report so that removing the last two months is to make "current" two months prior (ie end of year.)

    While not a DBA I can grok enough sql to select .... from .... where rec-date between ... and ... and would wonder why database views of the required tables restricted to required period viz omitting the final two months weren't used.

    Dropping views is slightly less dangerous I would have thought.

    I would it would be damn sight safer to parameterize the report generating script with a period starting/ending argument.

    Any data that has been collected, validated and stored (especially in a dbms) is best treated as read only. Even when amending a record to correct an error it is far more desirable to just supersede the erroneous record in order to keep an obvious audit trail. I imagine modern rdbms do this anyway.

    I have been fortunate in my BOFH exertions not to have anything to do with production databases other than very short stints with Unify 4.2 and then Oracle 5 at the outset. I suspect had it had been otherwise the company carpark would have been full of ex-DBAs not so much fallen from grace as from a convenient open window.

    1. Prst. V.Jeltz Silver badge

      Re: Daft?

      " backing up the most recent two months, truncating said months"

      My eyes glazed over at that point and i decided not to try to digest the exact specifics and just carry on with the story, even though that is currently my exact stock in trade . I import / store present data to the dept next door which makes all the pretty pictures out of it, usually via a "view" . These views can easily get last week / month / year / fin year at any point of the current week / month / year / fin year without all that jiggery pokery in the story .

      Like this for example

      WHERE (CAST(payday AS Date) BETWEEN DATEADD(DAY, 1, EOMONTH(DATEADD(MONTH, - 13, GETDATE()))) AND EOMONTH(DATEADD(MONTH, - 1, GETDATE())))

      I just assumed there must be other factors / complications , and skipped to the "lost 75gb" part

      1. yetanotheraoc Silver badge

        Re: Daft?

        Views are ideal but if the report is slow could just save the monthly reports in a table and query whatever 12 months for the annual report. -- "other factors / complications" -- If the annual report needs extra detail just capture the extra in the monthly process as well.

        Truncate... Whoever devised that works with spreadsheets not databases. Have suffered a few hiccoughs myself and sadly one time was not saved by the backups. In my opinion it's best to keep restore for emergencies and find another way for routine business.

  11. I Am Spartacus
    Mushroom

    SO much or Oracle dos not work as you expect

    I was also working with my team of Oracle DBA's. They had set up a backup programme using the Sequent supplied backup system. We were running on Sequent compuyters with Sequent disks. This is important. This was back in the 1990's

    We also had a MASSIVE table. Gazzilions of rows that grew at a rate of round about a million per day. So we had a regime. Each month we create a new partition which we distributed over the disks. Older months tables were set to READ-ONLY, thus we knew queries would not take a lock. This all ran fine. Backups happened automatically. New partitions were created at month end and everyone was happy.

    Right up until the Sequent engineer came in and wanted to takle a list of disk serial numbers. We thought he would know what he was doing, this being a vitally important (and not cheap) production data warehouse. He staretd at the front, and noted all the disks. We went round the back, opened the cabinet doors and knelt down to take the lower level disk IDs. And that was when he knocked the master power switch.

    The disks all dropped off line, red lights flashed on the consoles, but he very quickly restored the power. The active partition was trashed, but thats OK, because we could restore from tape and roll forward. But the pervious months partitiions were not so lucky. But hey, we had Sequents backup program. Simply restore the damaged data and we are good, right?

    WRONG! Most of our data was missing. It seems that Sequent does not backup a read only partition. We only kept about a months worths of backup because tapes were expensive, so when we restored and rolled forward we got tables spaces with loads of empty space. Major panic ensued because there was a regulatory reason for us having the data.

    Sequent\'s did look into this. They scanned and scoured our backup tapes and eventually came up with a solution. Without any form of a lie, their solution to the missing data was to hand me, formally, an insert to our manual set that stated the backup solution they had sold us "Did not support readonly tablespaces".

    And that was when we threw the Sequents out, ditdched Oracle and moved to DB2 on RS/6000 SP'2.

    1. Gene Cash Silver badge

      Re: SO much or Oracle dos not work as you expect

      > We only kept about a month's worth of backup because tapes were expensive

      I guess tapes don't seem quite so expensive now.

      I've seen people reuse tapes until you could see through them, despite warnings that they didn't have good backups. "What? We're not getting any errors in our backup process"

  12. Anonymous Coward
    Anonymous Coward

    Ah, the shortest time unit in IT: The OhNosecond!

  13. AustinTX
    WTF?

    Why do the reports 2 months after activity has ended?

    Why can't they just do reports monthly but just hold onto them to distribute 2 months later when they're expected?

    Or do they need 2 months to fudge data from previous months?

    1. herman Silver badge

      Re: Why do the reports 2 months after activity has ended?

      The 2 months are required for creative accounting.

    2. David Newall

      Re: Why do the reports 2 months after activity has ended?

      I think the idea was to temporarily replace the most recent two months of data with the most recent two years of data, so that end of month comparison became end of year comparison.

  14. Mike 137 Silver badge

    Fundamental misunderstanding

    "He hadn't been a DBA for very long so, in his words, he was not yet "all knowing or all powerful""

    Anyone (in any discipline) who thinks they're "all knowing or all powerful" isn't by definition. Someone who actually was as close as possible to either wouldn't think like that -- they'd (in the words of Dirty Harry) know their limitations.

  15. dr.k

    Larry - Striving to become consciously competent

    Larry was consciously incompetent, unlike the more dangerous unconsciously incompetent types. He also knew enough to invoke the power of the unconsciously competent DBAs when it was needed. No harm, no foul.

  16. DBA-ONE

    Hundreds of years ago I was a production Oracle and SQL Server DBA. The Oracle side required less work as that was a pending prod system. I spent the majority of my time doing data conversions from SQL Server and what not. I had a process down that would drop existing objects and then import data and recreate things like views, procedures, etc. It all worked very well and was efficient. Until one day. We had dev, test and prod instances of course and while I thought I was in the dev instance I ran a script that dropped all sequence objects. I ran the script as the Sys user in the Prod instance. Once that happens, you cannot login anymore and this means the instance had to be rebuilt, and it was. If that system were live it would have been a resume generating event. After that I modified glogin to show the instance and user and any time I was going to run a script I logged in to a new connection.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Other stories you might like