Oh, sugar! Sysadmin accidentally deletes production database while fixing a fault
An unfortunate sysadmin has deleted the production database of diagramming outfit Gliffy, ironically while attempting to fix a problem with backup systems. An unspecified “issue” was discovered by the company in its backup systems last Thursday. However, on working to resolve the problem, during the scheduled weekend …
COMMENTS
-
-
Wednesday 23rd March 2016 16:04 GMT Alister
What we need is production databases that require 2FA or 2 user auth to run DELETE and DROP commands :p
Or possibly Sysadmins who stop and check, and then check again, before deleting anything, ever.
My thought is that he restored a duff backup over the top of the live database, instead of creating a copy.
-
-
This post has been deleted by its author
-
-
Thursday 24th March 2016 12:56 GMT allthecoolshortnamesweretaken
- "What we need is production databases that require 2FA or 2 user auth to run DELETE and DROP commands :p"
- "Or possibly Sysadmins who stop and check, and then check again, before deleting anything, ever."
True. But you'll have to admit it - a sleek console with a two-key-thingy like in a missile silo would be totally cool.
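For what it's worth, something in the spirit of the two-key idea can be sketched in PostgreSQL with an event trigger. This is only an illustration: the setting name myapp.drop_armed and the function are invented here, and anyone with superuser rights could still disarm it.
CREATE OR REPLACE FUNCTION block_unarmed_drops()
RETURNS event_trigger AS $$
BEGIN
  -- refuse the DROP unless a second admin has armed this session flag
  IF current_setting('myapp.drop_armed', true) IS DISTINCT FROM 'yes' THEN
    RAISE EXCEPTION 'DROP blocked: have a second admin arm myapp.drop_armed first';
  END IF;
END;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER guard_drops
  ON ddl_command_start
  WHEN TAG IN ('DROP TABLE')
  EXECUTE FUNCTION block_unarmed_drops();
Not real two-person auth, but it does turn a reflexive DROP into a deliberate two-step.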
-
-
Wednesday 23rd March 2016 16:16 GMT Nate Amsden
I was at a company once where we had JMS queues backed by an Oracle DB (WebLogic). We had a queue issue that caused a 24h+ outage. The end solution was to truncate the tables with the queue data. I still remember to this day, back in 2004, the Oracle DBA saying something along the lines of "I'm not truncating shit until you get the company president on the conference call to approve". The VP wasn't enough. The president approved and things got back to normal until the next outage, at which point we were more comfortable truncating those tables. So many bugs in WebLogic JMS back then (haven't used it since that job)
-
-
-
Wednesday 23rd March 2016 16:27 GMT zanshin
My team runs our company's incident/problem/change solution. (Yes, the irony burns.) A few years back, we had tables in a QA DB that we no longer needed. DB administration at that level is managed by a dedicated DBA team, not the application team. We sent them a request for the table drops and, knowing our prod and test DBs had nothing to do with one another, thought nothing more of it.
Except, unbeknownst to us, the tool the DBAs used to perform such tasks connected to both test and prod systems alike, and the over-eager person involved issued DROP CASCADE against the tables in question *everywhere*. In the middle of the US morning / EU afternoon.
The only reason this did not completely destroy our production OLTP DB was that there were locks in play because of our level of user concurrency. (Logs later showed that the DBA actually tried the deletes several times when they failed.) Our prod reporting DB instance had no such protection and critical tables were wiped out. Restoring that took a long time because the tables were huge and, at the time, the reporting DB schema was not a 1:1 match with the OLTP system. (You can do that with fancier replication tools.) The reporting instance had to be restored from remote backup, which literally took days. Fortunately, for the duration, we were able to point most of our BAU features that relied on the reporting instance to the OLTP instance instead, accepting the modest risk of OLTP performance impact to keep important things working.
Happily, this event did produce both process and architecture changes in the way the DBA support tools were used and set up. And, probably, at least one staffing change. o_O
-
-
Thursday 24th March 2016 10:17 GMT Anonymous Coward
Re: it's so easy
MySQL disallowing updates or deletes without a where clause is all very well...
But imagine trying to type:
DELETE FROM table WHERE ID >= 1234;
And mishitting the "equals" key to get the key on the left. Hey presto:
DELETE FROM table WHERE ID >- 1234;
That's a whole different clusterf*ck...
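For reference, the MySQL guard mentioned above is the sql_safe_updates session flag. As the poster implies, it wouldn't catch this particular slip, because id > -1234 still counts as a key-based WHERE clause. A quick sketch, with t as a stand-in table:
SET SESSION sql_safe_updates = 1;
DELETE FROM t;                   -- now rejected: no WHERE clause at all
DELETE FROM t WHERE id >- 1234;  -- still runs: parses as id > -1234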
-
-
-
Thursday 24th March 2016 23:46 GMT JLV
Re: it's so easy
As a dev, not a DBA, my approach when writing manual update commands for a live database that other folks will execute tends to be something like:
select *
-- delete
from important_table
where condition ...
The instructions ask you to first execute the whole command and check that the select returns plausible values for what will be nuked. Then you are asked to highlight from just past the -- comment on the second line, exposing the delete proper to execution.
Slightly more complicated to do with updates.
But I've bookmarked the wrap-in-transaction suggestion, which is not mutually exclusive with mine.
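For anyone else bookmarking it, the wrap-in-transaction version looks roughly like this (MySQL syntax shown, assuming InnoDB or another engine that can roll back; important_table and condition are placeholders as above):
START TRANSACTION;
DELETE FROM important_table WHERE condition;
SELECT ROW_COUNT();  -- how many rows the DELETE just touched
COMMIT;              -- only if the count looked plausible; otherwise ROLLBACK;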
-
Thursday 24th March 2016 12:23 GMT Alan Brown
Re: it's so easy
"It's taking a bit longer to restore because they've just discovered the backups have been broken for the last year."
In the case of a NEAX61M telephone exchange, the backups were fine, but what they were backing up was corrupted (which didn't get discovered until the system was rebooted after the y2k updates had been loaded in)
Cue having to go to 2+ year old backups, then replay every transaction from that point up to date into the system.
On a live telephone exchange....
With 50,000 lines on it.....
In the days before mobiles were universal...
The replay took 7 weeks to complete.
The telco wasn't popular for some reason. I wonder why.
-
Thursday 24th March 2016 20:10 GMT Down not across
Re: it's so easy
It's taking a bit longer to restore because they've just discovered the backups have been broken for the last year.
Which is why, if the data is important at all, you do fairly frequent test restores.
Depending on your flavour of database, it may even support some sort of validate without actually having to restore, in which case you do frequent validations (and still the occasional actual real restore to ensure it really can be restored).
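One concrete flavour of that: SQL Server's RESTORE VERIFYONLY, which checks that a backup is complete and readable without restoring it (the path below is illustrative only):
RESTORE VERIFYONLY
FROM DISK = 'E:\backups\prod_full.bak'
WITH CHECKSUM;  -- also re-verify page checksums recorded in the backup
As said above, though, it's no substitute for the occasional real restore.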
-
Wednesday 23rd March 2016 16:18 GMT GlenP
Not quite the same but had a PC engineer kill a 486 SCO Unix server whilst trying to fix the external DAT backup drive.
I was on holiday, I told them not to touch it until I got back but they thought they knew better ("they" in this case being the parent company) so called in their tame maintenance company who'd clearly never worked on a Unix box before. Engineer waltzes in, decides he needs to shut the server down but doesn't have a clue how so just turns it off.
It took me about a week to get it fully back up and running with a mix of file recovery and restores from the last successful backup.
The parent company wondered why I declined to go and work for them.
-
-
Wednesday 23rd March 2016 17:48 GMT phuzz
Re: In a similar vein
Back when I was a young 'un, a friend had lent me a bunch of Amiga games along with a copy of
"White Lightning", a fast disk copier that would work on a lot of copy protected disks. (yes I was pirating them, I was young and knew no better).
My Amiga only had one disk drive, I think you can all guess how this one is going.
The worst part was that I realised halfway through copying some game over the copy program and tried to stop it, but too late.
I somehow tried to make out that it was all my friend's fault for leaving the copy tab open on the White Lightning disk.
-
Wednesday 23rd March 2016 18:05 GMT Andy Non
Re: In a similar vein
Holds hand up, embarrassed. One Friday afternoon, a number of years ago, after pub-o-clock, I did the twice-daily backup of my employer's live production database on a DEC 8250 to a removable disk platter, but somehow mixed up the live database platter with the backup platter and overwrote the files on the live platter. ARGHHHHHH! After an hour or two of sheer horror I restored the live platter back to the close of the day before. Thankfully the files on the live removable platter were more or less static, and no data subsequently appeared to be missing (at least nobody complained of any), which was a huge relief.
Lesson learned: After pub-o-clock, don't do any mission critical stuff on a computer!
-
-
Thursday 24th March 2016 00:24 GMT Crazy Operations Guy
Re: "Lesson learned: After pub-o-clock, don't do any mission critical stuff on a computer!"
I've learned that the hard way so I built a script to connect to the ticketing system to push any changes planned for later than 4 pm or on a Friday to 9 am the following day or Monday morning, as appropriate. I ran a 10,000+ machine dev/test Datacenter so no one actually did anything outside of work hours.
-
-
Wednesday 23rd March 2016 18:30 GMT David 132
Re: In a similar vein
I once Ghosted a blank drive over a client's hard drive instead of the other way around - oops
I did the same, about 10 years ago. I had a gold master HDD full of product demos that I'd spent days carefully assembling, and I had to Ghost it to a blank HDD.
I made exactly the same mistake as you, with two additional enhancements all of my own:
1) My boss was sitting next to me in the lab, watching his "star engineer" at work,
2) I had just said something like, "Now to copy this disk and we can both go home. Wouldn't it be funny if I got the source and destination disks mixed up! Ha, ha, ha!!"
As I recall, Boss just shook his head sadly and left me to it; he knew from experience that I am not a nice person to be around when under stress.
-
Thursday 24th March 2016 07:26 GMT John Robson
Re: In a similar vein
"I once Ghosted a blank drive over a client's hard drive instead of the other way around - oops."
I have seen a RAID controller do that automatically...
Mirrored disks, one fails - alarm goes off, everyone carries on.
Pull it out, all good.
Pop in a new disk, all good
Array starts churning, excellent - copying data from one to the other, tea time.
Erm, where are all the files?
Why do we have two disks with identical unformatted data?
-
Thursday 24th March 2016 07:33 GMT Benno
Re: In a similar vein
Back in the late '90s I had a staff member accidentally run a Ghost multicast onto an entire subnet! Thankfully WOL wasn't implemented, so a bunch didn't start, plus we managed to turn a heap of systems off before they ran the client (thankfully Pentium 233MMXs don't boot that fast!)
It did take a few days to sort out the remaining carnage though!
-
-
Thursday 24th March 2016 15:32 GMT Skoorb
Re: In a similar vein
Back early on in the Windows XP to 7 migration we needed to test the automated deployment tool, so we could see how SCCM would actually deploy the installer image.
So, a simple, near default, Windows 7 build with no software installed was created as a test by the project team to deploy to a couple of systems in a lab.
Unfortunately, it was deployed to the early adopter test group, where all the monthly desktop updates go to be tested before being pushed out organization wide. This is the majority of IT, including the helpdesk.
To get things moving along quickly, it was pushed out as mandatory, immediate and requiring a forced immediate installation including restart.
Thus, the entire helpdesk and most systems in IT were simultaneously trashed in the middle of the working day before someone managed to kill the deployment.
:-/
-
-
Wednesday 23rd March 2016 16:32 GMT Alien8n
Haven't done it in a production environment but have trashed my own web server a few times when the upgrade path wasn't being nice and decided a clean install was a safer bet. Did take a backup of one particular table prior to trashing it though as it contained the guestbook for my brother's memorial page (no way I was losing that data)
-
Wednesday 23rd March 2016 16:34 GMT Anonymous Coward
An ex colleague of mine...
... once accidentally deleted the whole CVS repository. The root partition had filled up, and in an attempt to recover space he obliterated the code repository - actually, the reason the root partition had filled was that he'd incorrectly created the repository there.
Of course his backups weren't working.
-
Wednesday 23rd March 2016 18:36 GMT David 132
Re: An ex colleague of mine...
A friend of mine was responsible (circa 2003) for building drivers for a well-known brand of network card.
He wrote a cron job to compile the latest build overnight, then e-mail him the result log.
He made some elementary mistakes, and what actually happened was:
1) build process failed, repeatedly re-trying and re-failing,
2) his C: drive filled up with an ever-growing log file,
3) eventually, when the disk was full, his script progressed to the next stage: trying to email the log to him.
He came into work the next morning to find a very angry IT dept; he'd crashed the company's Exchange servers by trying to send a 2.1GB attachment.
-
-
Wednesday 23rd March 2016 16:36 GMT Anonymous Coward
Some people are just born lucky
I used to work in support for an IT supplier. Shortly after I left the support team I'd been hauled back in to run some training, and got chatting with an ex-colleague. He'd had a case a bit like this about a week or so before the training, when he got a panicked call from a customer (an institution somewhere in London): their DBA had been working on a problem and decided the best approach would be to dump the whole database out to some files, drop the tables and then reload it all. What could go wrong, eh? The only problem was that this database of all their customers was quite big, so dumping it all to files and then reloading it was going to take time, and the pubs were open. So he wrote a quick script to do it while he sodded off to the pub.
It's really only 3 simple steps after all
Guess which of the 3 steps he got wrong?
It was shortly after this that they realised that the safe was full of useless tapes which didn't contain the database (why doesn't anyone ever check their backups?).
Cue panicking call to my mate in support on Monday morning.
Luckily for the guy in question, my mate had been out on site the week before to investigate some problems on the database and just happened to have a tape sitting on his desk with their entire database (less the last couple of days) on.
Me, I'd have sent the tape to someone at the Bank of England; the directors might have had some explaining to do.
-
-
Wednesday 23rd March 2016 17:41 GMT waldo kitty
Re: Found Out The Rule The Hard Way
That rule says that any data that does not exist IN THREE SEPARATE PLACES
does not exist full stop.
Please define "THREE SEPARATE PLACES".
Izz'at three separate partitions on the same device?
Izz'at three separate partitions or devices in the same machine?
Izz'at three separate partitions or devices in two or more machines?
Izz'at three separate and distinct devices?
Izz'at three separate and distinct devices in three separate and distinct machines?
Izz'at three separate and distinct devices in three separate and distinct machines in three separate and distinct buildings?
Paris because everyone cries when they realize their horrible mistake could cost lives or possibly just millions of $$$...
-
Thursday 24th March 2016 09:30 GMT Dazed and Confused
Re: Found Out The Rule The Hard Way
> Please define "THREE SEPARATE PLACES".
There is a reason why the Veritas Volume Manager supports 32-way mirroring. They had a customer ask for it. Said customer has 8 Data Center's (sic) buried under 8 mountain ranges in 8 corners of their continent. At each data centre you need 4 copies: 2 for mirroring, a 3rd to split off for backup, and a 4th so you can alternate the merge back, so there is always yesterday's data available too.
-
-
Wednesday 23rd March 2016 17:44 GMT Terry 6
Re: Found Out The Rule The Hard Way
Yes, and there should always be a copy (even if it's an extra one) of mission-critical data that is never overwritten or replaced by the next-oldest back-up until the latest back-up has been verified. (Years spent guarding a database of at-risk pupil records)
-
Thursday 24th March 2016 13:23 GMT Anonymous Coward
Re: Found Out The Rule The Hard Way
I don't care about the amount of places, I'll go one further:
BACKUPS ARE USELESS
What you really need are RESTORES. You can have a million backups, but if none of them restore you still have exactly zip. This is the lesson you learn when your backup medium is something that you cannot play back during your recovery, for instance if you've been using a new fancy tape device that you don't have a spare of.
As long as you have not proven that you have a RESTORE (i.e. tested it) you have in my book nada. You'll get at best a 7 for effort, and a 0 for continuity checking.
-
-
Wednesday 23rd March 2016 16:57 GMT Jason Bloomberg
Been there. Got the T-shirt
Back in the day when not all computers had an OS I once saved memory to the system sector of the disk rather than the user sector it should have gone to. An all-nighter ensued restoring that from punched tapes - yes, it really was back in the day.
I am sure I'm not the only one who has used MS-DOS 'COPY' to accidentally move a whole directory into a single file, then deleted that directory before realising. Or more simply deleted the wrong directory. I still occasionally get caught out by "COPY . .\BACKUP" where I don't press the keys hard enough so the first dot is missing and I overwrite the latest from the backup.
With the best will in the world we all do something stupid from time to time. But there's nothing better than being screamed at that things have to be fixed quickly or heads will roll to make things worse than they already were.
-
Wednesday 23rd March 2016 17:05 GMT cd / && rm -rf *
It's easy to take the piss...
... but having come very close to doing the same thing* one day, I can only say "but for the grace of $DEITY there go I".
* on a filesystem containing ten years' worth of scientific data. Yes, there were (proven) backups, but it would still have been highly embarrassing. Many a slip 'twixt brain and finger hovering over the Enter key**.
** ohnosecond, n: the shortest interval of time distinguishable by the human brain. It is thus named because it's exactly the length of time that elapses between hitting the Enter key and saying "Oh no!" (Thanks to Henry Law)
-
Wednesday 23rd March 2016 18:30 GMT Triggerfish
Re: It's easy to take the piss...
There's no way I'd class myself as a sysadmin, but having worked with a few techy types, I came to the conclusion that all decent sysadmins, techs and engineers end up having a moment at some point where they have buggered up something for some daft reason. The decent ones are those who learn from it.
I have certainly had my late-night, alone-in-the-office moment of ooooh fuck, let's phone a friend and see if he can help me keep my job by morning.
-
Wednesday 23rd March 2016 18:47 GMT werdsmith
Re: It's easy to take the piss...
Yes, we've all experienced that feeling when you realise the mistake: you get a kind of sinking feeling, followed by a hot flush, and then the mind goes into overdrive as whatever the brain's equivalent of adrenalin is kicks in.
But in my experience, 9.5 times out of 10 there is a way out, and you've just got to find it, and find it quick if you are in a pre-arranged downtime window (always ask for 5 X more minutes than you think you'll need).
My colleague who re-configured an ODBC DSN on a 64 bit windows system, checked it, double checked it, triple checked it, then ran an upgrade process through it...... with a 32 bit application...... knows that feeling so well.
-
Wednesday 23rd March 2016 19:26 GMT Anonymous Coward
Re: It's easy to take the piss...
I was doing a little work on prod one day when someone asked me to restart the UAT DB. I log into the UAT DB, get distracted for a moment, go back and restart the DB.
After about 30 seconds the support guy turns around and goes "Any reason all the (workflow) engines have gone red"
"Oh dear" in my mind, *looks at the terminals*
"Yes, I just restarted the production database."
An apologetic email and a talk with the head of the production floor later; it was only about 3 minutes of downtime, but heh. Felt bad, man.
This is why all my production databases are now in the dark green terminal with light green font.
-
-
-
Wednesday 23rd March 2016 17:10 GMT Anonymous Coward
Quit drinking !
"An unfortunate sysadmin has deleted the production database of diagramming outfit Gliffy, ironically while attempting to fix a problem with backup systems."
Well, as the subject says, the bloke really needs to quit drinking! When you're a storage admin working on fixing any problem on backup systems, you need to be sharp, not fuzzy.
Geez.
-
Wednesday 23rd March 2016 17:13 GMT BigWomble
Been there done that.
Been there done that.
Playing with MySQL Master-Master replication.
Dropping a table anywhere suddenly became bad news.
We lost a few days of posts on a support forum and got one or two confused customers as the backup I had to hand was older than I realised.
I'll stick with Master Slave for now. And keep better backups.
-
-
Wednesday 23rd March 2016 18:09 GMT Anonymous Coward
Re: I was asked once to resize a virtual machine image.
Poor design on Qemu's part. It should check if data is going to be lost as a result of the command and require a '-force' or something like that. At the very least be smart enough to know that no useful VM can be 160K in size.
Not that this excuses any admin who uses a command he's obviously not familiar with without checking the man page first to be sure of what he's doing.
-
Thursday 24th March 2016 10:36 GMT Anonymous Coward
Re: I was asked once to resize a virtual machine image.
Poor design on Qemu's part. It should check if data is going to be lost as a result of the command and require a '-force' or something like that.
Indeed. However, that might break scripts; another option would be to add a "grow" command that sanity-checks the input. I had read --help at the time, but I was so used to moving between it, Ceph commands and OpenNebula (the latter two specify megabytes as the unit).
Luckily, the VM was not mission critical, and we had an old copy somewhere that we were able to press into service.
-
-
-
Wednesday 23rd March 2016 18:51 GMT R0man
One of my first contracts; I came from desktop support, started to work with servers, other staff away for the day on some course. A problem with the backup server. Hmm, I think I know what it is, I just need this tool from AltaVista... oh oooo... malware knocked out the server's network, and I didn't know enough to just clean it and get it back up. Called the admin... good guy, but I didn't tell him what I did. Said "err, you'll have to rebuild it". Never built a server, didn't even know what SmartStart was... got it all back up the same day and the backups working. Fastest learning experience of my life. Got kudos for rebuilding the backup server in a day, and no one was any the wiser that me downloading dodgy software was the cause. Now a responsible admin. I think we've all got one "oh shit, that was a f*ck up". As long as it's just the one, all good.
-
Wednesday 23rd March 2016 19:13 GMT Throatwarbler Mangrove
Bye, homies
One mistake made as a young sysadmin was, while trying to clear out some hidden directories in a user's home directory, running "rm -rf .*". Did you know that ".." falls into that wildcard? I sure found out quickly, once I realized that the rm was taking longer than expected and killed it. Only lost a few users' home directories and was able to recover them from NetApp snapshot, but it was definitely a brown trousers moment.
-
Wednesday 23rd March 2016 19:25 GMT Midnight
"Hey, guys. We really should test our restore procedures."
"Not now. We're busy."
"No, really. We need to test our restore procedures."
"You already said that. We have too much going on. Maybe we can put aside some time for it around September."
"You're not listening. We really, REALLY need to test the restore procedures. NOW."
"Why is that so important?"
"Because I just had a little accident with the production database. And it's kind of gone now."
-
Wednesday 23rd March 2016 19:31 GMT Tikimon
My so-called supervisor killed an alarm processing server
Working for a company that made wireless backup comms for fire and burglary alarm systems. My supervisor was a guy who thought having worked for a rocket company in the 60's made him smart. Our data lived in an SQL database which held all customer and traffic data for 24/7 alarm monitoring and call dispatching.
One day he's playing with queries in SQL, in spite of knowing almost nothing about it. He types "delete" to clear his query - or so he thought. SQL obligingly deleted the WHOLE DATABASE. When the system crashed, I saw him in the SQL console and knew instantly what had happened.
They had always refused to let me test the existing backup plan, citing the 24/7 thing. So the backups were useless. They had to spend three days rebuilding and reindexing the database from scratch, flying in two people from out of state, and paying an outside SQL expert massive overtime.
Brilliant.
-
Wednesday 23rd March 2016 19:59 GMT Anonymous Coward
Cisco ASA on an older release, a few years ago: sometimes, after dicking about with VPNs and crypto, it gets a bit frustrated, and usually removing crypto from an interface and reapplying it works a treat to clear the issue.
Was working on a customer's VPN one evening from home, had strange crypto issues so thought I'd remove crypto and start again . . . too tired to realise I was accessing the device via a VPN and shut myself and all VPN customers out until data centre staff could restart it for me.
-
Thursday 24th March 2016 21:00 GMT Down not across
That's why I like "reload in X", or "conf t revert timer X" (if on IOS 12.4 or later on a supported device (or better yet use the IOS' archive feature to archive configuration versions))
JunOS of course has "commit confirmed X"
Yes, of course I've locked myself out editing an ACL remotely. Once. Hence the above.
-
-
Wednesday 23rd March 2016 21:20 GMT Herby
Backups, what backups, Oh, that one...
Back in my PFY days (it was the 70's, forgive me), we were going to (eventually) upgrade the OS to the next version. A few months earlier we had ordered some more memory (expensive in its day), and I knew that the old (currently running) OS would, upon seeing this new memory, use it. Since the new OS took more memory (but not as much as the add-on we had just purchased), I decided to patch the current OS to limit its scan of memory and just use the fixed size of the existing memory. Fast forward to the installation of the memory, and I chime in: "see, we need the new OS to use the added memory". I was hoping to get it installed because it had nice features I liked.
Well, the powers that be decided that it would take training to get users up to speed (not really, but OK), so the installation was delayed a while. A little while later, while experimenting, I wiped out the old OS, and being a good guy, and not knowing where the backup system tape was (or if it existed), just loaded up the new OS. Posted a few instructions (actually pretty simple ones) and left for the day (it was late).
Surprise surprise, I come in the next day and tell what happened, and was asked why I hadn't used the backup tape (it was in a filing cabinet, safe keeping and all that). It was handed to me and I looked at its date, which was before I had buggered up, er, patched, er, improved the OS to only accept the old memory limits. OOOPS!! I reloaded the old OS and everyone wondered why they had more memory. I flustered a bit and said, well, the new operating system was MUCH better!!
Eventually we did go to the new OS, but I did a whole lot of dancing that day.
Ah, youth.
-
Wednesday 23rd March 2016 23:14 GMT Mark Exclamation
I remember the time the senior IT person from our parent company used scripts to copy the registry from the AD primary server to all the secondary ones (so they all had exactly the same registry). Once he realised his mistake, his solution? - reboot them all! It was a Friday morning so our network was down all day, and it took us local IT people all weekend to fix it up. Expecting to get a "thank-you for working all weekend" from the IT manager of the parent company, all we got was a "Why did it take you so long to fix up our error?". And the person who made the error? - he's now a senior manager!
-
Wednesday 23rd March 2016 23:52 GMT Kepler
Ouch!
"ironically while attempting to fix a problem with backup systems."
Reminds me of the time in the Summer of 1985 when, while attempting to make a backup copy of a paper I'd already been working on 'round the clock for two full weeks, I accidentally destroyed my only copy and had to start the damn paper all over again.
My goof? Instead of a blank floppy diskette, as I intended, I inserted the disk that already contained my only copy of the paper into my Tandy 1000's B drive, and then typed "Format B:"!
Oy!
-
Wednesday 23rd March 2016 23:53 GMT Anonymous Coward
I think the worst I've managed....
Was emptying the company email unsubscribe list... Thankfully the morning's backup saved everything before the powers that be ever noticed....
And they still don't know to this day. (I have since moved jobs but it still gives me squeaky bum time thinking about it... also makes me very wary of live data.)
Anon to save the innocent and protect the guilty.
-
Thursday 24th March 2016 01:47 GMT Anonymous Coward
Just hit enter
So, the other day an email popped up from our endpoint protection server:
"Malware detected on one of the workstations in your environment...
bla, bla bla,
Malware Name: win32/TesCrypt
Computername: bla bla bla
MalwarePath: c:\users\blablabber
Action: Quarantine; succeeded"
Quarantined. Okay we're good... Wait a sec. Tescrypt?!? Security guy next to me almost has a stroke, calls the user and tells them to pull the power cord. "I know you're not supposed to. Please, do it now!"
Apparently the user had tried to click 'No' on the UAC prompt several times and finally put in a ticket cause it wouldn't go away. Helldesk promptly called back and advised to just click 'Yes'.
"... clickety. My credentials don't work."
"Let me try, it's probably just windows updates."
-
Thursday 24th March 2016 05:28 GMT John R. Macdonald
Things seen in a distant past
Reminds me of a gig I did as a contractor in the mid 1970's. Client company was running a small IBM mainframe (370/135?) with removable disks.
To make things 'easier' for everyone (operations and programming staff) TPTB decided the production and test disk packs would have the same volume serial numbers.
They did until the day the production disks were overwritten during a test run.
-
Thursday 24th March 2016 06:02 GMT Oengus
SCO Unix
Yes, we used to run a SCO Unix server. We had an administrator responsible for the backups who insisted on logging in as root.
One day on the production server he entered
rm -r *
and pressed enter (he was in the / folder).
About 10 minutes later he came out of the computer room and said that he was having a problem with the server.
When I went in and looked at the screen I started laughing and he couldn't work out why... I called a mate in and he laughed as well (we were not in any way responsible for this system). They learned how good their backups were and the administrator was quickly moved on.
-
Thursday 24th March 2016 08:11 GMT Picky
Users can do it as well
In the 80's I installed publishing systems for about 15 local papers. Each site was given a few boxes of 720k 3.5" floppies for backing up after each edition.
Then one day I noticed that, to "speed" things up, the Editors were inserting disk 1, then disk 2, then disk 1, then disk 2, etc. - to save using disks (a full backup needed a whole box)
-
Thursday 24th March 2016 09:24 GMT I Am Spartacus
VAXen - it was easy then too
I watched a guy who had just come back from a VAX/VMS system admin course trying to set up mirrored disk - what DEC called RAID-1.
Before I could stop him, he had mirrored the system disk with a blank disk, but had the blank disk as the master. We watched as the system slowly evaporated and crashed.
Ahh an evening with TU45's loading VMS again.
-
Thursday 24th March 2016 12:08 GMT Alien8n
Crashed macs
Due to an issue with one of our macs I had to completely wipe the hard drive of the mac and re-install.
Cue the issues with re-installing: for whatever reason it would not accept the Apple ID and password to re-install. Solution? Time Machine to a fresh portable drive from the new MacBook I'd just built, then went over the old MacBook with the new Time Machine image. Result! (And as an added bonus it meant I didn't have to wait several hours downloading the company data onto the old machine.)
-
Thursday 24th March 2016 13:10 GMT Anonymous Coward
test system and disk
Reminds me of the time I worked for a bank that used to accumulate trades on one system and transfer them by 5.25" floppy to another system to be executed. Oh, woe was I the day that I used the floppy full of test data. Ah well, 600 million dollars' worth of trades CAN be reversed, but it ain't easy. I felt my career prospects were blunted somewhat. Talk about dead man walking. Thinking about it now, I'm not sure how much was my fault, but somebody had to carry the can.
-
Thursday 24th March 2016 13:31 GMT Anonymous Coward
It's pretty much a required mistake..
.. to get your sysadmin creds, because it lets you experience what an adrenaline rush is, followed by a feeling of dread and panic. When you move on to BOFH stage you learn to ensure you keep the rush and let the users get stuck with the dread and panic part, but I'm getting ahead of myself :).
Experience is something you get AFTER you need it, but in this case I think it ought to be part of anyone's learning stage - better to screw up something that will only get you a chewing out or mass derision for a while than to lack the experience and do this in production.
Mistakes are always made. You can recognise the professional by her/his ability to plan for them, even if all looks well and functional.
-
This post has been deleted by its author
-
Thursday 24th March 2016 21:12 GMT Down not across
@1980s_coder
Are we supposed to laugh at this? Seems like an extreme case of simply not being up to the job or a company hiring idiots instead of professionals to save a few pennies.
Pathetic.
You never made a mistake in your life?
These things happen. They shouldn't, but they do. It's a fairly safe bet he/she isn't likely to repeat that mistake any time soon.
Much more of an issue is how the company dealt with the incident and communicated it.
-
-
Friday 25th March 2016 16:08 GMT Stuart Castle
A few years back, we needed a new equipment logging and tracking system at work. The system needed to interface with the existing inventory system, and offer facilities for booking equipment between given dates and also tracking who has that equipment.
Myself and a couple of colleagues designed a system that would enable us to do this. It was a simple system: a user website that would enable users to book items of equipment; an admin website that (amongst other business admin functions that were nothing to do with equipment) enabled us to print out the bookings, ban users from booking equipment and change its status; and a small utility that would enable us to scan equipment barcodes in and out, and perform stock takes. All of these used a custom-designed SOAP service to access a database on SQL Server. The justification for this is that, with several mission-critical systems accessing the database, my colleague who designed the database thought it a good idea to route all access through the service to prevent problems and manage the accesses correctly. I suspect the real reason was he'd spent a lot of time researching SOAP and wanted a chance to put his research into practice.
A few months after the first version of the system was released, the backend SOAP service fell over and would not start. It took out all the attached websites and the utility, which caused massive problems as, by that time, we relied on it.
My colleague investigated, and after a couple of hours found the fault. Someone had logged into SQL Server Manager, gone to the database holding the tables used by the system, and renamed the transaction table with a full stop.
Now, I have no idea why it took him two hours, but I suspect he couldn't believe someone would be so stupid as to directly access a production database, so checked everything else first. The annoying part (for me) is that there were three of us with access to the tables. I genuinely didn't do anything (partly because I wouldn't anyway, and partly because I didn't think I had access), but when our department head found out, no one would admit to doing it, so all three of us got a bollocking. The two of us who didn't need direct access to the database also lost our direct access (which is how it should have been).
Perhaps the ironic thing is that all three of us have Computer Science degrees, so should know not to make changes directly to production systems. For the record, I don't make changes directly to production systems. I always maintain a development and testing version of a system, then when the changes are properly tested, copy them over to the production system.
-
Thursday 31st March 2016 11:13 GMT Scaffa
I must admit I've done something shameful on a production environment before.
After the initial "you fucking idiot!" calls were out of the way and the issue was fixed, I actually got thanked for owning up.
I got the trust back in the end, but I think people appreciate knowing that they can trust you to own up to a mistake - instead of hiding and praying it isn't discovered.