Repetitive work on multiple servers
I was working on a 16-node Novell Cluster, updating drivers. A process that had been done many times and non invasive and with no loss of service so deemed by management as safe to do in working hours.
The process was simple, take a node out of the cluster - “CLUSTER LEAVE”, copy the new device drivers and then reboot that node - “SERVER DOWN”, wait for it to start up and re-join the cluster, and move onto the next node.
By about the 14th or 15th node, after typing the same commands time after time, instead of typing “CLUSTER LEAVE” to take the node out of the cluster, I typed “CLUSTER DOWN”.
It should be noted that Novell does NOT ask “Are you sure?” when you type such a command, and it does what the command suggests it does – instantly. All users, potentially some 4,500 of them suddenly lost their file shares, email, printing, internet access – the works.
My only saving grace was it was late afternoon on a Friday so there was not that many users actually affected.