Re: "especially since nothing is broken."
Sure, you and I would call it broken, but Management sure as hell isn't going to agree. If money is coming into the company at or above the normal rate, that means everything works.
An IT department is pulling its hair out this month after realizing a coworker who died last year was the only person who could log into a crucial network switch. This is according to Dylan, a sysadmin at a small US healthcare company, who today told El Reg a story of how he and his colleagues ended up locked out of the …
No problemo! I would put all this in writing to the management and wait patiently until it becomes really broken. Sometime in the future they will have to approve the maintenance, and it might be at an even worse moment than right now.
but Management sure as hell isn't going to agree
"Err boss - we appear to have had a sudden power outage in part of the data centre and the switch went down for a minute or so.."
"On the good side, I now have a full config of that switch and can log into it"
"No - the two are not connected - why do you ask?"
Pah. Rubbish.
It is always "mission critical" right until it isn't. When the neglected plant blows up, falls apart, seizes up or burns up, the "business decision" is to wait weeks and weeks, holding daily crisis meetings while the replacement trundles through procurement and everything goes to hell.
Since the beancounters cloaked themselves with power, it is always more important to "control costs" than it is to run the business.
Went to add a new Server to a Dell UPS which had already had the software installed and configured by the previous tech but he hadn't documented the Admin password to the software. Called Dell nonHelpdesk to see how I could reset the password to default but no way could it be done. Was told that even uninstalling the software may leave the changed password in the Server.
Not happy Jan!
... if you were to report the make (Dell, apparently), model and serial number (last 4 digits XXXXXed out) here, one of ElReg's esteemed readers would have an avenue of approach to get in. Might involve a soldering iron, but I've rarely been stymied when the chips are down and I have access to the hardware.
How is it that only one person would have the password? All the configs should be backed up to a RANCID server and all the network engineers should have RADIUS/TACACS login credentials. Several senior engineers or managers should be able to grant/restrict access to devices as required. The devices themselves should be restricted so that only a senior engineer or manager can change critical passwords.
"How is it that only one person would have the password?"
The real world isn't always as sensible a place as it should be. In an ideal world, you would use RADIUS/TACACS to manage the device and have a local username/password safely stored away somewhere in case this failed. Or you have a single username/password in a restricted password safe.... Or default credentials that no one ever bothers changing... Or username/password just one person knows... All options...
For recovery, it depends on the model, as Dell has mixed and matched vendors across their switch range. With Cisco and some other vendors, you may be able to use SNMP RW details to back up the config to a TFTP server. Or there may be a web GUI/API that gives you RO access to at least document the config. Even SNMP RO can provide a good chunk of information if you are trying to replicate a config, assuming you use SNMP management tools in your environment.
Otherwise you are stuck with figuring out how it works. It may be easier than you expect, especially if there's some basic documentation to go along with it or you have access to connected switches to get shared information off.
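As a rough sketch of the SNMP RO idea: if a read-only community still works, net-snmp's `snmpwalk` will dump interface names and descriptions, and a few lines of Python turn that into a port list you can document. The sample output, host and community string here are made up for illustration; the only thing assumed about the tooling is the usual `snmpwalk -v2c -c <community> <host> <OID>` invocation.

```python
import re
import subprocess

# Matches lines like: IF-MIB::ifDescr.3 = STRING: GigabitEthernet0/3
LINE_RE = re.compile(r"^IF-MIB::(\w+)\.(\d+) = (?:\w+:\s?)?(.*)$")

def parse_walk(text):
    """Group snmpwalk output by interface index: {index: {column: value}}."""
    ports = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            column, index, value = m.groups()
            ports.setdefault(int(index), {})[column] = value.strip('"')
    return ports

def walk(host, community, oid):
    """Run net-snmp's snmpwalk (must be installed) and return its stdout."""
    cmd = ["snmpwalk", "-v2c", "-c", community, host, oid]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Hypothetical sample of what walking ifDescr and ifAlias might return.
SAMPLE = """\
IF-MIB::ifDescr.1 = STRING: Gi1/0/1
IF-MIB::ifDescr.2 = STRING: Gi1/0/2
IF-MIB::ifAlias.1 = STRING: uplink to core
IF-MIB::ifAlias.2 = STRING:
"""

if __name__ == "__main__":
    for index, cols in sorted(parse_walk(SAMPLE).items()):
        print(index, cols.get("ifDescr"), "|", cols.get("ifAlias"))
```

Against a live device you would feed `walk(host, community, "IF-MIB::ifDescr")` into `parse_walk` instead of the sample text. It only documents what the switch is willing to tell you; it does not get you logged in.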
In a former life as an IT manager with a large company, I ensured key passwords were stored off-site in a secure fire safe. I made it a matter of policy that "anyone can get run over by a bus" so procedures had to be in place to handle such an event. Unfortunately I was subsequently hit by a real bus, which landed me in hospital briefly and wrote off my motorbike. The irony wasn't lost on me.
Several years on, I was working as a contractor / software developer for my own business. I gave all relevant documents to a bookkeeper / accountant who did all the necessaries for submissions to Companies House, HMRC etc, leaving me to get on with developing software. I didn't have much of a clue about bookkeeping and accounting. Then one day I got a letter out of the blue from a debt collection agency. I'd been fined for non-submission of annual accounts to Companies House. This was the first I'd heard, so I got straight onto the phone to my accountant... or rather spoke to his widow. He'd died unexpectedly a few months earlier from cancer. He was only in his thirties with a couple of young kids. His widow helped the best she could and between us we managed to piece together bits of information from random files, documents and spreadsheets on his computer so I could reconstruct my accounting position and make a late submission to Companies House. It was a complete crash course in accounting for me.
It is too easy to have too many eggs in one basket. Anyone can die unexpectedly and potentially leave you in deep doo-doo without adequate procedures in place. It was somewhat ironic that in my former employment I put procedures in place to handle dead-employee scenarios, but when I ran my own business I fell foul of a lack of such procedures. Sigh.
"The irony wasn't lost on me."
I feel your pain! For years and years I used to tell the guys in the workshop to make sure that rope/cables didn't present a trip hazard. You can guess who then tripped over a bit of string they'd thought they'd tied up neatly, but hadn't. Cue several v expensive visits to the dentist :(
As a (very) long time network engineer, I can't accept this at face value. If the only switch they were locked out of was the core switch, then the bulk of the config could be extracted from all of the neighbors. Once you have the information from the neighbors, then some network events debugging enabled on the neighbors would enable the bulk of the remaining missing information to be derived.
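To make the "extract it from the neighbors" point concrete, here is a minimal sketch. It assumes you have already pulled, from each neighbor you can still log into, which local port faces the locked switch, which far-end port LLDP/CDP reports, and which VLANs the uplink carries. The record layout and device names are hypothetical, not any vendor's output format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Uplink:
    neighbor: str      # device we CAN log into
    local_port: str    # its port facing the locked switch
    core_port: str     # far-end port as reported by LLDP/CDP
    vlans: frozenset   # VLANs allowed on the neighbor's side of the link
    trunk: bool        # whether the neighbor's side is a trunk

def infer_core_ports(uplinks):
    """Guess the locked switch's per-port config by mirroring each neighbor."""
    inferred = {}
    for u in uplinks:
        if u.core_port in inferred:
            raise ValueError(f"two neighbors claim {u.core_port}; recheck cabling")
        inferred[u.core_port] = {
            "description": f"to {u.neighbor} {u.local_port}",
            "mode": "trunk" if u.trunk else "access",
            "vlans": sorted(u.vlans),
        }
    return inferred

def vlans_in_use(inferred):
    """Every VLAN the locked switch must at least know about."""
    return sorted({v for port in inferred.values() for v in port["vlans"]})

# Hypothetical data gathered from two access switches.
UPLINKS = [
    Uplink("access-1", "Te1/1", "Eth1/1", frozenset({10, 20}), True),
    Uplink("access-2", "Te1/1", "Eth1/2", frozenset({20, 30}), True),
]
```

This only recovers what is visible from the far end of each link. Anything purely internal to the locked switch would still have to be derived from the kind of neighbor-side debugging described above.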
Yeah... and we have only the new guy’s word that he was incompetent... if there’s one universal rule in tech it’s a new engineer slagging off his predecessor to management. You will never see lawyers, accountants etc do this. They understand that solidarity to the profession trumps any short term advantage they may gain from doing so.
"You will never see lawyers, accountants etc do this."
I assume that Patisserie Valerie's new accountants are not singing the praises of the previous lot.
"They understand that solidarity to the profession trumps any short term advantage they may gain from doing so."
'Solidarity to the profession' sounds quite a lot like 'close ranks and disavow any fault' to me.
'I assume that Patisserie Valerie's new accountants are not singing the praises of the previous lot.'
I don't see why not, ok, maybe not in public, but at least in private, as, after all, they've got a nicely paying gig out of their predecessors' sterling work, and as a bonus they've no doubt also got lots of doubleplusuncheap forensic work on the go as well just to keep the old expenses meter spinning that wee bit faster..
Lawyers, as officers of the court, are required to report incompetent, fraudulent, crooked or dodgy fellow lawyers. They might have pride in their profession, but I have known them to blow the whistle, sometimes discreetly, but nevertheless. OTOH I have seen IT colleagues closing ranks around a useless and dangerously inept IT mate because they simply went all dog-pack when one of them was threatened.
The ones integrated in the M100 chassis were easy to get into: pull them out, set/unset a jumper and plug them back in. The Juniper switches they resold can be broken into from the serial console.
Also, they most certainly can determine exactly what this switch/router is doing by reverse engineering. You just have to map it all out based on where the connections go, and how those devices/ports are configured.
Been there, done that, still have the t-shirt...
Indeed, buy a new switch and rack mount it, have a good guess (and by guess I mean work out as best you can), what the problematic switch has to be doing, configure the new switch accordingly, swop all the connections over. Then it's a case of seeing who screams about loss of connectivity and adjust config accordingly.
Management will absolutely hate this plan, just calmly point out that it's either this approach (where there is at least the possibility of backing out), or wait until the old switch catastrophically fails, at which point people start losing their jobs, possibly starting with the management!
And afterwards you'll have a nice new (well actually old) switch to be factory reset and used for something else.
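The "see who screams" step can be partly automated, so you hear about some of the breakage before the phone rings. A minimal sketch, assuming you keep a list of host/port pairs that ought to answer across the switch (the list itself is yours to compile, and the addresses in the usage comment are placeholders): survey it before the swap, survey it again afterwards, and look at the difference.

```python
import socket

def reachable(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def survey(targets):
    """Return the set of (host, port) targets that currently answer."""
    return {t for t in targets if reachable(*t)}

def lost_after_swap(before, after):
    """Targets that answered before the swap but not after, sorted for reading."""
    return sorted(before - after)

# Usage (hypothetical targets):
#   TARGETS = [("10.0.10.5", 22), ("10.0.20.8", 443)]
#   before = survey(TARGETS)
#   ... swap the cables over ...
#   print(lost_after_swap(before, survey(TARGETS)))
```

It only catches things that speak TCP and that you thought to list, so it supplements the screaming rather than replacing it.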
... wait until the old switch catastrophically fails, at which point people start losing their jobs, possibly starting with the management!
Your argument, while factual, is not aligned with the usual incentive structure: getting sacked only pops the Golden Parachutes for Management earlier than planned, and people losing their jobs will always boost stock prices, which will boost Management stock options even more, so it is a good thing.
So? Worry about what you control: Write an ass-covering report/action-plan, present it at the management meeting, it will be rejected, and then leave it at that. Nobody cares. If you care, consider a different career.
Dell has nothing to do with this. If I remember correctly there was a sysadmin in a Californian city who chose to go to jail rather than provide access to the Cisco switches* he was managing.
* see "service no password-recovery" command on Cisco switches
This is why one should use RADIUS/TACACS for authentication. You can provide various levels of control to your engineers. There are a variety of commands that engineers doing their daily work in a network do not need access to, and all of these can be controlled, at least using decent equipment.
Even if you use a password manager, you still have to be in control of at least one of the three parts, and this depends on how you configure access. You either need to be an administrator of the password manager, administrator of the directory service, or administrator of your 2fa system.
If you manage the password manager, you can reassign resources to other users.
If you manage the directory service you can modify group membership to grant access to other users.
If you manage the 2fa system, you can reset/re-issue 2fa for accounts.
These are questions for your password manager vendor.
The things I recommend for companies that use password managers are: ensure that all interaction gets logged, and never allow a single individual to be responsible for anything. This means having a second individual with admin permissions, even if they would only be capable of doing anything while on the phone with tech support.
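A quick way to check the "never a single individual" rule against the three admin roles mentioned above is to write down who holds each one and let a script point at the gaps. This is only a sketch: the role names mirror the three parts described above, and the people are invented.

```python
def single_points_of_failure(roles, minimum=2):
    """Role names held by fewer than `minimum` distinct people."""
    return sorted(
        name for name, holders in roles.items() if len(set(holders)) < minimum
    )

def roles_lost_without(roles, person):
    """Roles left with nobody at all if `person` is suddenly unavailable."""
    return sorted(
        name for name, holders in roles.items() if not (set(holders) - {person})
    )

# Hypothetical assignment of the three admin roles.
ROLES = {
    "password manager admin": ["alice"],
    "directory service admin": ["alice", "bob"],
    "2fa system admin": ["bob", "carol"],
}

if __name__ == "__main__":
    print("Under-staffed roles:", single_points_of_failure(ROLES))
    for person in ("alice", "bob", "carol"):
        print(person, "->", roles_lost_without(ROLES, person))
```

With the made-up data, the password manager admin role shows up as the weak spot, and it is the one role that disappears entirely if alice goes under the proverbial bus.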
I use Password Manager Pro from ManageEngine.com. It runs on Windows or Linux and you can install a copy that is the full enterprise version for a trial. I have used the free version and the Enterprise version, and while the Enterprise version has some very nice features, my company's needs are such that the free version works just fine.
I have a friend whose company uses LastPass, and they're happy with it.
That being said, I think you should try several password managers and determine which works best for your situation.
We've been going through this sort of thing with our succession planning for our parish council (2.5 employees!). The God passwords are written down and stored in a safe, the keys to which are held (locked up) by the two Responsible Officers and the (external) IT Auditor. If all three manage to die at the same time - especially something that takes the IT Auditor as well - it will probably be due to an event causing more concerns than parish council continuity. ;)