Linux admin hated downtime so much he schlepped a live UPS during office move

The working week can be burdensome, so each Friday morning The Register tries to lighten the load by bringing you a new instalment of On Call, the reader-contributed column in which you let go of tech support stories that weigh on your memory. This week, meet a reader we'll Regomize as "Bobby" who told us about an old friend …

  1. simonlb Silver badge
    WTF?

    Smart, But Also Bloody Stupid

    I can understand why he wanted to maintain uptime on the mail server, so he really did think going down this route was a cunning plan. However, if anything had happened to that server or UPS whilst being moved between sites he would almost certainly have lost the server through physical damage and/or data loss, so the plan was totally stupid and he should have just not done it. Powering on a server that has been shut down correctly and then moved is always a tense moment, but putting yourself in the position of potentially having to replace hardware or even the whole server, or rebuilding the server and restoring from backup (you did make one, didn't you?), with all that hassle and impact to the business, just so you can say "Look, the mail server has been up for xxx days now", is completely unwarranted and foolhardy in the extreme. This is not someone I'd like to employ as I couldn't trust them to do things properly.

    1. Korev Silver badge
      Coat

      Re: Smart, But Also Bloody Stupid

      > This is not someone I'd like to employ as I couldn't trust them to do things properly.

      You mean he should be an Exim employee?

      1. Jou (Mxyzptlk) Silver badge
        Coat

        Re: Smart, But Also Bloody Stupid

        I sendmail you my script collection, 'cause only those could do real mail 100% controlled the way I want to. I'm gonna exim the building now.

        1. Korev Silver badge
          Coat

          Re: Smart, But Also Bloody Stupid

          What a funny Exchange

          1. Anonymous Coward
            Anonymous Coward

            Re: Smart, But Also Bloody Stupid

            Peanut was a complete Yahoo

            1. Korev Silver badge
              Coat

              Re: Smart, But Also Bloody Stupid

              He had a good Outlook though

            2. Korev Silver badge
              Coat

              Re: Smart, But Also Bloody Stupid

              Luckily the server stayed up so he didn't need to Postfix it

            3. Anonymous Coward Silver badge
              Coat

              Re: Smart, But Also Bloody Stupid

              Lugging that heavy UPS around, I expect he was a hotmail

              1. Elongated Muskrat Silver badge

                Re: Smart, But Also Bloody Stupid

                He'll be pine-ing for his uptime now it's gone

                1. KittenHuffer Silver badge

                  Re: Smart, But Also Bloody Stupid

                  Was the server a Norwegian Blue by any chance?

          2. Outski

            Re: Smart, But Also Bloody Stupid

            The rest of the estate could've fallen like Dominos

    2. Jou (Mxyzptlk) Silver badge

      Re: Smart, But Also Bloody Stupid

      So you never ever had a dumb idea to reach a goal only you cared about, whereas most others (at best) giggled? Or in other words: You were never young?

      1. Anonymous Coward
        Anonymous Coward

        Re: Smart, But Also Bloody Stupid

        He said he'd never want to employ Peanut. That implies you are replying to management. So your first question is redundant, except for the giggling part. And Peanut acted out his own dumb idea, instead of delegating it, which is incomprehensible to some.

    3. blu3b3rry Silver badge
      FAIL

      Re: Smart, But Also Bloody Stupid

      Admittedly not with a server, but I've come across engineers moving their CAD desktops in a similar manner as "I can't shut it down, I'll lose all my stuff".

      The conveyance of choice for the UPS and full-size desktop tower was of course the humble office chair. Worked fine until someone managed to knock their PC off the chair onto the floor, unluckily one of the few machines still using a hard drive instead of an SSD for storage.

      The drop also snapped the keyboard and mouse USB plugs (he had left everything connected apart from the monitor) off in the motherboard ports and yanked the UPS power cord out of the back.

      Unsurprisingly the PC wouldn't complete POST or boot when plugged back in. As I recall the engineer didn't get given a replacement, but a spare motherboard and HDD were found and stuffed in the now scuffed and dented case as a reminder to be more careful.

      1. Roland6 Silver badge

        Re: Smart, But Also Bloody Stupid

        >CAD Monitor…

        Depending on when this happened, I would not want to be the one delegated to carry the monitor; people who think a 28-inch flat screen is heavy have either forgotten, or never attempted, to lift a 19-inch CRT.

        1. skswales

          Re: Smart, But Also Bloody Stupid

          We bought a 24" Sony CRT for a partially-sighted employee. That weighed a bit.

          1. Roland6 Silver badge

            Re: Smart, But Also Bloody Stupid

            Correct eye height being obtained by placing on top of desktop system unit, which bent under the weight… :)

            1. John Robson Silver badge

              Re: Smart, But Also Bloody Stupid

              Remember when computer cases were made of proper sheet steel, not the doubled-up foil that passes for it nowadays?

              1. GlenP Silver badge

                Re: Smart, But Also Bloody Stupid

                One drawing office I worked with had ACT Sirius workstations* - once retired from CAD work they made useful steps!

                *The ones with the 10MB (that's not mistyped) musical hard drives - they varied the rotational speed as the heads moved to increase the data storage, similar to the slightly later RLL (Run Length Limited) drives.

                1. PRR Silver badge

                  Re: Smart, But Also Bloody Stupid

                  > varied the rotational speed ....similar to ... RLL (Run Length Limited) drives.

                  RLL didn't vary the speed. You may be thinking of zoned bit recording (which varies the time per bit, not the speed).

              2. Guido Esperanto

                Re: Smart, But Also Bloody Stupid

                And had edges like a Gillette razor blade.

                In the circles of pc enthusiasts I was in, it became known colloquially as the "blood sacrifice"

                1. Noram

                  Re: Smart, But Also Bloody Stupid

                  The good old days when your PC repair kit included floppies/CDs of drivers, some dongles, an assortment of different screws, a spare Realtek NIC (10/100 if you were fancy), screwdrivers, and a large packet of sticky plasters and some antiseptic cream.

                  One of the reasons I started recommending more expensive cases to people, and only using Antec etc in my builds back in the early 00s, was because I was fed up with leaving a treat for Khorne or any local vampires.

                  1. ManInThe Bar

                    Re: Smart, But Also Bloody Stupid

                    And two terminators (thick ethernet, not Arnie)

              3. Anonymous Coward
                Anonymous Coward

                Re: Smart, But Also Bloody Stupid

                I remember cutting a nice piece of thick steel out of one to make a tab washer, to fix the washing machine. It lasted another year

          2. Jou (Mxyzptlk) Silver badge

            Re: Smart, But Also Bloody Stupid

            The obligatory link to the Sony KX-45ED1, aka the PVM-4300: with a 43" visible area (45" tube size), the largest CRT TV ever made, > 200 kg. And it shows an advantage these things still have: upper left corner below 1 ms delay, lower right corner only depending on refresh rate. It could digest much higher resolutions and refresh rates than the usual 768x576 (PAL).

        2. Alan Brown Silver badge

          Re: Smart, But Also Bloody Stupid

          There are 19-inch CRTs and there are 19-inch CRTs

          I had a couple of fixed frequency ones on my desk which were a 2 person lift. They easily weighed in excess of 40kg apiece and I can only guess how much steel/lead shielding was inside them.

          1. Guido Esperanto

            Re: Smart, But Also Bloody Stupid

            I had some IIYAMA behemoth that was about 21".

            I'm glad some clever sparks invented LCD screens

            1. Rob Daglish

              Re: Smart, But Also Bloody Stupid

              Yeah. We looked after a series of devices in libraries in the early 2000s that had either a 24 or 27 inch touchscreen Iiyama. They were absolutely immense monitors, in both senses of the word...

              My personal favourite though was a 32" Philips CRT which was very, very nice. It had been bought for a Novell Netware 4 server. I never officially liberated it for my own PC, I just worked in the server room a lot. (I was 17, I didn't know any better!)

              1. Gene Cash Silver badge

                Re: Smart, But Also Bloody Stupid

                > Philips CRT which was very, very nice. It had been bought for a Novell Netware 4 server

                Why the hell would you have a huge monitor on a server? Am I missing something?

                1. I could be a dog really Silver badge

                  Re: Smart, But Also Bloody Stupid

                  Perhaps it was a sly method to get a good monitor for other uses ?

            2. The Travelling Dangleberries

              Re: Smart, But Also Bloody Stupid

              I used to take a Brompton on trains to help me get to and from work.

              The vents on the top of my CRT monitor at work were a great place to dry off gloves and woolly hats on rainy days.

              I gave away the 21" Iiyama monitor I used at home when I left The Netherlands and regretted not taking it with me for some time afterwards.

        3. Noram

          Re: Smart, But Also Bloody Stupid

          I have memories of a Sony courier arriving to swap out my 21" monitor under warranty, and his comment that it was only just under the weight limit for it to have needed a two-man crew. I think he was quite relieved to find that not only was I happy to help him move the transit case, but I had a trolley on hand so carrying was minimal (I used to go to LANs, so was prepared for shifting it).

          I also remember the fun of having to reinforce my desk to take that monitor, having seen how it had started to cause a noticeable bow in the desktop's built-in "monitor shelf".

          I do not miss those old beasts.

        4. NorthIowan

          Re: Smart, But Also Bloody Stupid

          I finally recycled my two 19" monitors at our local Staples a year or two ago. I remember when I brought the first one in, the "kid" working that day asked what kind of computer it was.

      2. Flightmode

        Re: Smart, But Also Bloody Stupid

        I had a colleague like that in a previous job. We had to do a few office moves over the course of a couple of years, and he refused to let the movers handle his (apparently extremely fragile and priceless) bog-standard Dell office PC between sites, so he always stayed late in the afternoon (the dude never showed up before lunch anyway) on the eve of the move, packed his own PC into his yellow Cinquecento, drove it across town to the new office and connected it up himself. Every time, he would get a stern talking-to from management: by doing it himself, neither he nor his car nor his PC would be covered by the company's insurance, and he would have to pay for any damage himself. (He didn't care.)

        I'm sure I've told the story here about the aftermath of one of those moves? He'd gotten his stuff wired up and zip-tied all neat and pretty on his desk (let's just say that his Cinquecento was another story), and then someone discovered that our new desks were electronically adjustable. He too wanted to see how high his desk could get (he was a big guy), and as he pressed the Up button, all his neatly zip-tied cables caused his monitor to end up in pieces on the floor. I believe the company replaced that one, though, as it technically happened in the office.

    4. l8gravely

      Re: Smart, But Also Bloody Stupid

      I just shut down a VM with 1800+ days of uptime. And yes, it was up way too long and should have been patched, etc. But the end user customer doesn't like paying for support, and doesn't like reboots because they tend to fire/let go/reassign the contractors/outside vendors who built the damn app, etc.

      Of course the same idiots don't mind testing UPS to generator failover during the day without any notice....

    5. kmorwath

      "I can understand why he wanted to maintain uptime on the mail server,"

      That's the very issue, and why you should select grown-ups as system administrators. A gray beard is not enough; too many got a gray beard well before growing up.

      As the Fool in Shakespeare says “Thou shouldst not have been old till thou hadst been wise.”

      1. Outski

        Re: "I can understand why he wanted to maintain uptime on the mail server,"

        Grey

        1. Doctor Syntax Silver badge

          Re: "I can understand why he wanted to maintain uptime on the mail server,"

          It's a colour that can be spelled either way

          1. J.G.Harston Silver badge

            Re: "I can understand why he wanted to maintain uptime on the mail server,"

            One is a colour, the other is a name.

          2. ttlanhil

            Re: "I can understand why he wanted to maintain uptime on the mail server,"

            grEy if you're speaking English

            grAy if you're speaking American

            Apart from when it's a name, they're (in-practice) interchangeable, but if you want to spell the colour correctly, that's the hint as to which is which

    6. J.G.Harston Silver badge

      Re: Smart, But Also Bloody Stupid

      I was fully expecting the story to go: and as the trolley hit a pebble in the car park, the entire stack of equipment went flying into the asphalt.

  2. Korev Silver badge
    Coat

    This story's nuts!

  3. Jou (Mxyzptlk) Silver badge

    Those were the innocent days...

    ...when admins thought that Linux does not need to be updated and rebooted. When security by obscurity still worked. I was among them, long long time ago, in a galaxy far away.

    1. Phil O'Sophical Silver badge
      Facepalm

      Re: Those were the innocent days...

      Yes, 400 days uptime probably means 400 days without a kernel patch. Security? Who needs that on a mail server...

      1. Jou (Mxyzptlk) Silver badge

        Re: Those were the innocent days...

        Hold it right there: You don't need to destroy your uptime to update the mail server. You stop the mail daemon/service/pause-reload-crontab/killall -9/whatever, update it, and start it again. Your other stuff, except for the port needed for mail, is not exposed, as every admin who uses a UPS during a move to keep uptime high does it. (In reality: you SHOULDN'T need to reboot, but it does not always work out that way)
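
        A minimal sketch of that in-place update, assuming a Debian-ish box running Postfix as the MTA (package and unit names illustrative, not from the story):

          systemctl stop postfix                    # stop the mail daemon
          apt-get install --only-upgrade postfix    # pull in the patched package
          systemctl start postfix                   # mail is back; system uptime untouched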

        1. Anonymous Coward Silver badge
          Facepalm

          Re: Those were the innocent days...

          The kernel handles the networking side, so if there's a vulnerability in socket handling or similar, it's not the mail daemon that needs updating. Kernel update basically requires a reboot.

          1. cyberdemon Silver badge
            Devil

            Re: Those were the innocent days...

            There's always `kexec`, if you really want to replace a running kernel without interrupting your uptime..
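
            Roughly, with kexec-tools installed (kernel paths illustrative):

              kexec -l /boot/vmlinuz-new --initrd=/boot/initrd.img-new \
                    --append="$(cat /proc/cmdline)"    # stage the replacement kernel
              kexec -e                                 # jump straight into it, skipping firmware/POST

            (Strictly speaking the uptime counter still resets - it's a fast reboot rather than a live patch.)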

          2. Jou (Mxyzptlk) Silver badge

            Re: Those were the innocent days...

            I wrote "as every admin who uses an UPS during a move" - how many more irony signals are needed today? Why did anyone think I meant that post cereal?

            1. PRR Silver badge

              Re: Those were the innocent days...

              > how many more irony signals are needed today?

              Why are you even arguing with Anonymous Coward? There's way too many ACs lately.

          3. theOtherJT Silver badge

            Re: Those were the innocent days...

            ksplice dude.

        2. Aida

          Re: Those were the innocent days...

          You don't need to, but if a reboot is part of your regular patching you know the box will come back up more readily, and if it doesn't you're prepared to fix why it didn't, rather than having to pick up the broken pieces when it falls over one day and you find out that *something* broke the install and you now have to fix it unexpectedly

      2. Joe W Silver badge

        Re: Those were the innocent days...

        Kernel patch? This is not windows ;)

        (yeah, there's a bunch of CVEs for the Linux kernel, not sure how severe they are. Judging by the frequency of new kernels my Linux machines get, it doesn't seem too bad. I agree, 400 days is probably stretching things)

        1. Alan Brown Silver badge

          Re: Those were the innocent days...

          Linux has had hot patching for a while...

        2. Anonymous Coward
          Anonymous Coward

          Re: Those were the innocent days...

          >>> Kernel patch? This is not windows ;)

          >>> yeah, there's a bunch of CVE for the Linux kernel, not sure about how severe they are.

          I wouldn't be so proud of the lack of kernel vulns. Here's 50 pages of Linux kernel vulnerabilities for the current year. Multiple high-severity vulnerabilities.

          https://www.cvedetails.com/vulnerability-list/vendor_id-33/product_id-47/Linux-Linux-Kernel.html?page=1&year=2025&order=3

          High uptime was something I recall Linux enthusiasts always boasting about in the 90s. I can understand 400 days if the server is on a 24/7 production line doing something critical, but keeping your clients or servers unpatched is just madness these days.

          1. Outski

            Re: Those were the innocent days...

            I had a Domino 5 server at 460-ish days on NT4, many years back. It only got rebooted because, after realising it was out of warranty, the client realised some maintenance and patching might be in order. Otherwise the thing would just run and run (think Mo Farah, not Pheidippides)

      3. Roland6 Silver badge

        Re: Those were the innocent days...

        What! Are you suggesting MS were ahead of the curve when they shipped W95/98 with an effective maximum uptime of 49.7 days…
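
        (That 49.7-day ceiling is just a 32-bit millisecond tick counter wrapping around; a quick shell sanity check:)

          echo $(( 2**32 / 1000 / 60 / 60 / 24 ))   # integer maths prints 49, i.e. ~49.7 days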

        1. WolfFan Silver badge

          Re: Those were the innocent days...

          You rebooted every 30 days or less with Win95/98/98SE. Usually less, as something would crash and force a reboot. Often several reboots in a week. Sometimes several reboots in a day.

          In those days I had a Mac at home and my work box was a VAX cluster... but the department had to try to keep a bunch of DOS/Win3.11/Win9x systems running for the hoi polloi. The Urge To Nuke Redmond was... hard to resist.

          Uptime. We loves it, my precious.

    2. DS999 Silver badge

      Not only patches

      But many exploits are not permanent, and go away once you've rebooted - especially if your OS partitions are read-only. Now obviously if it isn't patched, the same hole can be re-exploited after the reboot, but that isn't automatic (because if it were, you'd have a simple way to trigger the exploit, making it all too easy to learn how it works so it can be permanently fixed)

      1. Roland6 Silver badge

        Re: Not only patches

        Especially those on appliances.

        For one client we implemented an out-of-band power switch, which could be externally triggered to power cycle their branch routers and switches (and thus PoE APs). Useful whilst we waited for the OEM to release new firmware to fix units becoming unresponsive and routers corrupting routing tables after prolonged uptime.

    3. chivo243 Silver badge
      Headmaster

      Re: Those were the innocent days...

      But it's offline, does that still count as uptime?

      1. Jou (Mxyzptlk) Silver badge

        Re: Those were the innocent days...

        Yes. Your mail server (exim/postfix/whatever) is still running, and that is within your power, which is then actual uptime. "Reachable" is not within your power, since the internet is, as for most people in the world, not in your power. And if you have strictly separated responsibilities (hello, 1944 OSS Simple Sabotage Field Manual) the LAN is not in your power either.
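
        (One way to read the daemon's own uptime, independent of reachability, assuming a systemd box and an exim4 unit - names illustrative:)

          systemctl show exim4 -p ActiveEnterTimestamp   # when the service last (re)started
          ps -o etimes= -C exim4                         # elapsed seconds for each exim4 process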

  4. Korev Silver badge
    Coat

    > This week, meet a reader we'll Regomize[sic] as "Bobby"

    Isn't Bobby an expert on database tables?

    1. excession

      Nah, that’s Bobby’s mum!

  5. Korev Silver badge
    Pirate

    Well, someone had to do it

    Obligatory

    1. that one in the corner Silver badge

      Re: Well, someone had to do it

      Blast, beaten to it; this is what I get for switching off the 07:00 alarm.

    2. Bebu sa Ware Silver badge
      Devil

      Re: Well, someone had to do it

      Yup. Be afraid… very afraid !

      As for Hell no fury… scorned women aren't in the race.

    3. C R Mudgeon Silver badge

      Re: Well, someone had to do it

      That cartoon reminds me of an incident at university.

      There was a skunkworks Unix system that was adminned mostly by students[1]. On graduation day one year, I saw one of said student admins leaving the machine room, still wearing his ceremonial gown[2], with his proud parents in tow.

      "Showing your folks around?" I observed.

      "Yeah," he said. "Unix was down; I had to reboot it."

      Given the state of Unix at the time [3], I presume he'd kept his folks waiting a while as he typed arcane commands on the system console.

      [1] The PDP 11/45 it ran on was left over from an old research project.

      [2] At least, so I recall, but I admit that the gown might be a false memory.

      [3] This was back in the pre-BSD, pre-fsck days when file-system damage was almost guaranteed after a crash, and fixing it required manual surgery using the dcheck, ncheck, icheck, and clri commands.

  6. Giles C Silver badge

    Seen it before.

    Routers with hundreds of days of uptime, but these days there are so many patches and fixes for machines that most of the network rarely gets more than 6 months before a reboot.

    As part of our SOX compliance we need to be on the vendor's gold image, or gold minus 1 with the intent of going to gold within three months of an image being released.

    However, if you have a dual-supervisor switch this sort of uptime record is possible even with software patching, as the chassis will stay up and just the processor modules will swap over control. Some people may argue this is downtime but if it stays up then I don't.

    1. Jou (Mxyzptlk) Silver badge

      Re: Seen it before.

      That is the old method. Today the cheaper way is to cluster two or more servers, and on top you get: any one box can always be subjected to physical issues without taking the service down.

    2. tip pc Silver badge

      Re: Seen it before.

      logged into a 6500 the other day that'd been up 18 years

      out of interest, how do you deal with vendors that ship & constantly update multiple trains of code?

      like cisco 10.3.x/10.4.x, do you go 10.4 even though you don't need the features?

      1. Giles C Silver badge

        Re: Seen it before.

        Wow, 10.3 code; most of the stuff nowadays is running 17.9 or newer.

        We have a rolling site plan where every 13 weeks we start again; with 30+ sites (I forget the exact number as we are opening and then closing offices as we expand) it is the only way to keep up.

        There is a judgement call on whether we should patch on the same train or move to a new train, but it depends on the features required…

        1. tip pc Silver badge

          Re: Seen it before.

          9Ks on NX-OS look to be 10.6(1) as the most recent train

          https://www.cisco.com/c/en/us/support/switches/nexus-9000-series-switches/products-release-notes-list.html

          But you can see even 9.3(16) got a recent update too.

          Likely something common in both trains got fixed, given the dates.

    3. SteveK

      Re: Seen it before.

      I once had a commodity desktop PC repurposed as a firewall/router which had clocked up a runtime in excess of 3000 days before a prolonged power outage caused the UPS to run out of battery and reboot it (to be honest, I was surprised the UPS still held a charge after that long...)

      [It was a 1998-vintage 350MHz Pentium II-powered Dell Optiplex with 320MB RAM and a 6GB HDD, originally running Windows 95, repurposed to run OpenBSD, with no exposed services and connected to an internal network, just running internal DHCP and packet filtering for two door controllers and an environmental sensor on a legacy bit of network - it had originally done a lot more, but the rest of the building had been demolished.

      I kept it running purely to see how long I could, until I finally decommissioned it in 2022 after the power outage...]

      1. GlenP Silver badge

        Re: Seen it before.

        We had a PC running in a machine control cabinet continuously for about 7 years, nothing fancy, just a bog-standard mini-tower, until one day someone hit the shut-down option on the menu (I never did find out why). Unfortunately the HDD decided not to restart*, and although not strictly an IT department problem (I hadn't even known about it) I tried all my tricks, but nothing.

        Me: "Backups?"

        Users: "What are they?"

        Me: "Software installation disks?"

        Users: "Errm, well we had them somewhere!"

        When the software disk finally arrived from the US (single floppy but with some odd DRM so it had to be a physical disk not a download) and I found a spare HDD we got it going again for a while until I could be bothered to replace it with a newer PC.

        *Not that unusual with machines that hadn't been shut down in a long time.

        1. pirxhh

          Re: Seen it before.

          Yeah, most often it was (is?) stiction.

          Had a case like this when a friend's server did not come up after a blackout.

          I let the shower run for a few minutes, took the HDD to the bathroom, opened it holding my breath, turned the platter a bit and put the lid back on.

          It worked long enough to recover the data.

          I put my foot down when said friend wanted to reuse the disk, though.

          1. C R Mudgeon Silver badge

            Re: Seen it before.

            What was the point of running the shower? That's the only part I don't get.

            1. Dave@Home

              Re: Seen it before.

              It gets the dust out of the air - a steamy bathroom is a poor man's cleanroom.

    4. Excused Boots Silver badge

      Re: Seen it before.

      "Some people may argue this is downtime but if it stays up then I don’t.”

      Quite right, if the ‘device’ stays up and is usable by the staff then it’s up!

      Similar to a firewall cluster in a failover setup, update one, reboot, allow to stabilise, update the other, reboot. As far as the users are concerned the connection to the internet has stayed up.

  7. Maximus Decimus Meridius
    FAIL

    Insane

    Presumably, this machine would have spinning rust disks. They moved a machine, a server no less, while the disks were still spinning? How far? To an adjacent building or using a car/van?

    Insanity.

    1. JT_3K

      Re: Insane

      I was all set to spark off about how much spinning disks love to be jolted, especially with the kind of forces involved in rack mounting or the weird angles whilst being carried. Then I remembered that it's Linux, and right at the back of my head something glowed - hdparm allows you to park the heads of a disk. *Arguably* you could keep the uptime counter running with the OS in memory by parking the disks and pausing/stopping a load of services. Still not sure I'd do it, mind.
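
      For anyone tempted, a rough sketch of the idea (device name illustrative, and see the caveat in the reply below):

        sync                    # flush pending writes first
        hdparm -y /dev/sda      # drop the drive into standby: heads parked, spindle stopped
        # ... move the box, very carefully ...
        hdparm -C /dev/sda      # query the power state; the first real I/O spins it back up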

      1. Dwarf Silver badge

        Re: Insane

        The problem with that approach is that if any disk-based operation fires off, the first thing it will do is un-park the heads so that it can perform the operation.

        So this would only work if the discs were completely inactive, and that is virtually impossible for a running OS.

        You would also assume that if it's a proper server, it would have an array of discs in it.

    2. Excused Boots Silver badge
      Mushroom

      Re: Insane

      Usually it's an office chair, over cobblestones.

    3. tip pc Silver badge

      Re: Insane

      The hard drive in my 2011 car’s navigation system is quite happy on the uk’s bumpy roads.

      The spinny HDD in my iPod never had an issue either.

      Yes, HDDs were renowned for being fragile in the early days, but g-sensors etc made them far more tolerant to physical external bumps & movement

  8. wolfetone Silver badge

    I'd feel safe if Bobby and Peanut were next to me if I was dying and could only have my life sustained by a life support machine connected to a UPS.

    I would draw the line at Peanut interacting with the machine though.

  9. Lazlo Woodbine Silver badge

    Back in the mid-00s I was doing pre-sales tech support at a security equipment distributor, my realm being the nascent world of IP CCTV.

    One of my customers was working on a system with a hefty non-disclosure agreement attached, so I couldn't know the end user, or the details of how the system would be installed or used; I only had the basic details of the number of cameras, recording rates etc. From this I had to spec the type of cameras, recording servers, storage, managed PoE switches and a UPS to keep the system alive for 6 hours.

    I queried the UPS, suggesting a smaller UPS and generator, but the end user had insisted on using a bank of UPSes.

    The system was duly specced, installed and commissioned by my customer, and I heard nothing else for 6 months until I got a support call one Monday morning. The customer had experienced a long power cut over the weekend; all was good, the recording servers had run for the full 5 hours the power was down, but they hadn't recorded anything.

    My first thought was that the cameras, which were all indoors and thus not fitted with infra-red illuminators, had recorded, but you couldn't see anything because the lights were out.

    No, the guy says, the site had emergency lighting, so there should be footage.

    As his customer was important to him, my customer paid to fly me to site; which turned out to be a large casino, gentleman's club and a slightly more exclusive gentleman's establishment in Berlin.

    I checked the servers, and indeed they'd functioned all weekend, losing connection to the cameras the moment the power to the site dropped, so I checked the core switch, which had also stayed up, but had lost connection to all the PoE switches the moment the power cut.

    It seems the installer had only connected equipment in the security suite to the UPS, so all the edge switches dropped with the power, and all the cameras dropped, as they were drawing power from the switches.

    I pointed to a paragraph in my response to the tender, where I emphasised the importance of all switches being connected to the UPS, grabbed my coat and hopped on the next flight home...

    1. John Brown (no body) Silver badge

      "I pointed to a paragraph in my response to the tender, were I emphasised the importance of all switches being connected to the UPS, grabbed my coat and hopped on the next flight home..."

      What? No free overnight "accommodation" for your trouble? :-)

      1. Lazlo Woodbine Silver badge

        It really wasn't my kind of place - think deep red walls and furniture with attached restraints, and no cameras on those floors, at least none on that particular CCTV system...

  10. Bebu sa Ware Silver badge
    Coat

    Mea culpa !

    I have put a UPS on a lab trolley and moved a mail server with redundant power supplies - replace one mains lead with the UPS lead, then the other; then trundle the server to its new location; replug the mains leads. The network was down for the duration of the transfer (a few minutes), but the server stayed on the same vlan.

    One half of the building had power from one substation and the other half from another. Also two large network rooms. When one substation transformer had to be upgraded/replaced at short notice, both the power and networking would be lost for a day. So relocating the mail server to the other side of the building made sense, as the amount of email piling up on the backup MXers would seriously thrash the server. The startup was also pretty slow, so not shutting the server down was also desirable.

    The part of this circus I found funniest was riding with the "mail server" down two floors in the lift (elevator).

    If the server had a wifi interface I could have kept the network up too. Next time. ;)

    1. Alan Brown Silver badge

      Re: Mea culpa !

      Alternatively, you could create and use a hibernation partition.

      They're not just for laptops, although it's usually faster to cold start a server than to resume it.

    2. cyberdemon Silver badge
      Devil

      Re: Mea culpa !

      I feel bad for the spinning disks in your (and Peanut's) mail servers

      Desktop and server drives were not designed to be lugged about when running!

      Also, pretty sure the UPS is not meant to run without an Earth connection. It's (probably) still safe from an electrocution point of view, but it is unable to suppress any RFI from its MOSFET bridge

    3. I could be a dog really Silver badge
      Facepalm

      Re: Mea culpa !

      No "very long extension lead" ? Leave the server in place, take the power to it.

      In the distant past, I've had an extension lead down the corridor to a portable genny outside - kept the servers going so the remote sites could still work during a power cut. Worked fine, until I heard the fans running down in the UPS, and a clack as it went back onto battery, then another clack and the fans run up, rinse and repeat. Went to investigate - the landlord (we were in a converted cow shed, farmer did better from us than from farming) was also plugged into the genny using an angle grinder !

  11. An_Old_Dog Silver badge

    Checkpoints

    "This is my current simulation. It's been running for almost three weeks now; I expect it should complete and give me my results in a few days. Thank God I've got an uninterruptable power supply for this thing."

    <neeeeeeeeeeeeeeeeeeeeeeeeee> (building fire alarm)

    "$@#&+/%%¢£¥∆§π!! Quick, help me pack this thing up!"

    "The elevators are automatically locked out during a fire alarm. You want me to help you take this thing down the stairs? ..."

    I worked on a mainframe OS which had a thing called "checkpoints". You invoked them from within your program. If something bad happened, you could resume your program from the latest checkpoint (or an earlier one, if you preferred) and not lose all of the computer's previous work on your program.

    1. Paul Kinsler

      Re: Checkpoints

      I've manually coded in checkpointing at various times over my career for long-running simulations -- "long-running" for me being a week or more; but I've done some that took months to run :-/
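
      The shape of it is always the same: persist progress at intervals, and on startup resume from the last checkpoint if one exists. A toy shell version (file name illustrative):

        STATE=checkpoint.dat
        i=$( [ -f "$STATE" ] && cat "$STATE" || echo 0 )     # resume from the last checkpoint, else start at 0
        while [ "$i" -lt 1000000 ]; do
            # ... one unit of simulation work goes here ...
            i=$((i + 1))
            [ $((i % 1000)) -eq 0 ] && echo "$i" > "$STATE"  # checkpoint every 1000 steps
        done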

  12. Tim99 Silver badge
    Linux

    Ok

    I have a headless Raspberry Pi 5 Lite TV recorder, thinned out so it only has what is needed. It is on a UPS and regularly goes for >200 days uptime. I do admit to a slight feeling of disappointment when I reboot it.

    1. Jou (Mxyzptlk) Silver badge

      Re: Ok

      Sounds like the UPS itself would draw more current than the equipment. Both in terms of "generally" and when it actually needs to kick in.

      1. Tim99 Silver badge

        Re: Ok

        Possibly: the TV head receiver, WiFi modem router, and a Pi backup server are also on the UPS. Running them all, the UPS runs for >150 minutes, more than long enough to record a program.

  13. Dwarf Silver badge

    Monkeys

    Well, they say if you pay Peanuts, you get Monkeys.

    I guess that would be helpful when moving a heavy UPS though.

    1. Outski

      Re: Monkeys

      Most monkeys, not that big. Apes on the other hand, different proposition

      1. John Brown (no body) Silver badge

        Re: Monkeys

        You just need more monkeys. Preferably flying ones in case there's any rough ground to cover.

        1. Outski

          Re: Monkeys

          Those wings hurt when they were forced out of the monkeys' backs.

      2. midgepad Bronze badge

        Cladistically ...

        (primates(monkeys((apes(Us))))

        (There are a couple more divisions, but that's enough brackets. I hope.)

  14. Prst. V.Jeltz Silver badge
    Windows

    Uptime shmuptime

    I do not have Peanut's "fanatical appetite for unbroken uptime."

    I reboot a bunch of SQL servers that make up a Data Warehouse weekly, because they enjoy it. It's like a weekly treat for them.

    I know I shouldn't anthropomorphise the I.T. hardware though ... they hate that!

  15. Calum Morrison

    Clever but stupid

    This entire stunt - unnecessarily endangering the company email server - was pulled by someone desperate to show Linux was better than Windows in terms of uptime, wasn't it?

  16. phuzz Silver badge
    Thumb Up

    Uptime is a measure of how long it's been since you last successfully booted.

    Although this story really reminded me of these folks who transported a live server, and its UPS, across Hamburg, together with a mobile 3G link to keep it online. On public transport, in the rain, just to make it more fun.

    Oh, and the server only had one power connection, so they soldered a second power connection to the board while it was powered on.

    1. John Brown (no body) Silver badge

      Wow. Did they pre-run the journey to make sure they had a mobile data connection the entire route, were just very lucky, or is it expected that all subway stations and trains have phone signal repeaters in Germany?

      1. Jou (Mxyzptlk) Silver badge

        3G has better reception below ground than everything before and after it. Before, because 3G had more frequencies to use; after, because when speed became more important, some of the lower frequency bands, which work better on the S-Bahn, were dropped. But there is no log in the video showing whether the 3G connection was actually always online and the server always reachable throughout. Try asking them!

  17. K555 Bronze badge
    Alert

    Ducks for cover

    Going by the tone of the comments here, I'm gonna be judged.

    I work for a customer that has no dedicated space for their IT kit and also loves to have ideas about rearranging their office. Depending on who's doing the rearranging, they'll always elect to move the rack to the room they don't personally care about. So it gets relocated on a semi-regular basis.

    I've worked out that if I take the doors off their hinges, you can wheel the rack through them so it doesn't need to be stripped down and rebuilt every time.

    I've also worked out I can wheel it across the entire ground floor within the runtime of the UPS....

    1. Roland6 Silver badge

      Re: Ducks for cover

      Solved that issue (server on a desktop and thus moveable) for one not-for-profit customer: they asked for a wiring update and new fibre Internet access. Sold them on the need for a larger wiring cabinet than the 6U cabling cabinets they were already using. They chose a location for the new cabinet… Asked around, and a kind local business donated a pair of full-height server cabinets… the server only moved out of the cabinet when they moved to 365…

  18. jake Silver badge

    There is uptime ...

    ... and then there is uptime.

    My mail system, Usenet and attendant shell accounts have been up nonstop since Flag Day (Jan 1st, 1983). It would have been a couple of years longer, but I decided to reboot the entire kludge when we switched the 'net over to TCP/IP.

    The longest any of the individual computers have currently been up is about 6 months.

    Note I said "system" ... it's multi-homed, multi-OS, multi-hardware, multi MTA (and etc). ... redundancy is fitted in everywhere I can fit it. It started as a thesis platform when I was at Uni (three locations: at SAIL, under Bryant Street in Palo Alto, and at the proto MAEWest), and now is spread out on six continents.

    Over-kill for a home system? Absolutely. But as a research platform, she's mostly tax deductible. It scales well, and parts of the concept are in place at several Fortune 500s. They should see similar uptimes for the bits they use, barring the almost inevitable catastrophic human maliciousness ... and even then, systems are in place to minimize that kind of damage. Maintenance at this stage of the game is on the order of minutes per month, and that's mostly just scanning the logs for anomalies.

    With that said, anybody who maintains personal uptimes[0] just for the sake of bragging probably deserves what they get. If your system's security/performance/whatever would benefit from a reboot, then reboot the fucking thing already! That particular DSW was over a couple decades ago, and BSD clearly won by a nose, with a properly setup Linux machine coming in a close second, followed by Apple trailing by a couple lengths. Redmond, sadly, DNFed and needed to be put down ... but for some reason was spared by the fanbois. Perhaps it'll be put out to pasture soon, it certainly won't be offered up for stud ...

    [0] Corporate uptimes are a whole 'nuther kettle of worms.

    1. Paul Herber Silver badge

      Re: There is uptime ...

      "6 months" ?

      I've heard about a computer somewhere with a really dull name that had a planned up-time of seven and a half million years!

      1. Roland6 Silver badge

        Re: There is uptime ...

        My understanding is that it somehow managed to recover and keep going a couple of times after events that should have resulted in a blue screen.

        1. that one in the corner Silver badge

          Re: There is uptime ...

          There was that one event, but it was more of a blue-green...

          Turquoise Of Dooooom!

        2. K.o.R

          Re: There is uptime ...

          Final stat was 9,999,999 years, 364 days, 23 hours, 55 minutes.

          Bloody Vogons.

  19. Richard Tobin

    Another approach

    I had to move a Mac which was being used as a server so I put it into hibernate. It woke up at the new location and carried on as if nothing had happened.

  20. The Oncoming Scorn Silver badge
    Facepalm

    Still Annoyed With Myself

    I left my main PC on at home, to remote access into during a two week trip to the UK.

    First full day in London I inadvertently shut it down.

    New project for when I get home: fit an Alexa-controlled power switch and set the BIOS to reboot after a power outage, or fit a wifi-activated switch to the motherboard.
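
    If the NIC and BIOS cooperate, plain old Wake-on-LAN can cover the remote power-on part without extra hardware; a rough sketch (interface and MAC address illustrative):

      ethtool -s eth0 wol g            # on the target, enable magic-packet wake on the NIC
      wakeonlan 00:11:22:33:44:55      # from another box on the LAN, wake the machine back up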

    1. Maximus Decimus Meridius

      Re: Still Annoyed With Myself

      Get a KVM switch instead? The GL.iNet Comet can be hooked up to a PCI card that can control the power, or a fingerbot that has a small 'finger' to press the power button.

      Currently 20% off at Amazon.

      I don't have one, but reviews look good.

  21. xyz123 Silver badge

    Like Viagra warnings: if your Microsoft box stays "up" for more than 4hours, something is wrong and you should seek technical advice :)

  22. ColinPa Silver badge

    High availability

    I worked on IBM's mainframe systems, and a customer came to present about their systems.

    He said "we have provided an uninterpreted service to our customers, front end and back end, for over 3 years"

    We gave a smattering of applause.

    He said "Wait ... I haven't given you the good story. We've done this, upgrading all of the software twice, and moved machine room twice"

    That got a good round of applause.

    He explained they moved some machines from one machine room to another, brought them up. The front end routers then directed work to these systems. The database could be morphed over in a similar way.

    We were well impressed

  23. JimmyPage Silver badge
    FAIL

    So what was his plan had something gone wrong ?

    I have a nagging feeling there wasn't one.

  24. Doctor Syntax Silver badge

    A client had a Unix system with uptime running into years. Not for bragging rights, but because they'd heard rumours of the disk drives not reliably restarting, and it had a lot of drives (single-digit GB sized drives). There was a lot of worrying when it had to be moved - just to the other end of the building. Given that every drive in the database was double duplicated it shouldn't really have been so much of a worry, but it caused considerable angst in the lead-up to the move - which went without a hitch.

  25. Port207

    Frogger

    I'm reminded of the Seinfeld episode where George is trying to move a Frogger arcade machine without unplugging it to preserve his high score.

    1. spuck

      Re: Frogger

      https://www.youtube.com/watch?v=5etwHVarNgI

  26. chivo243 Silver badge
    Go

    Peanut?

    I knew a guy with that nickname, and he could also lift a small vehicle. I wouldn't trust him to sharpen a pencil...

  27. HxBro
    Linux

    It was a sad day to see this one disappear:

    20:48:04 up 1527 days

  28. Nate Amsden Silver badge

    i did this too

    About 24 years ago. Personal server that had high uptime, I don't recall how high. I either drove it from home to the office or vice versa. The office was maybe 5 min from home. Had it on a small CyberPower UPS. I don't remember much other than at one point in the office parking lot I drove over a curb and messed up my alignment a bit (old crappy 1989 car). But it probably wasn't more than 497 days, as back then I think Linux still had the kernel rollover bug for uptime.
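
    (For reference, that 497-day ceiling was the 32-bit jiffies counter wrapping at the old HZ=100 tick rate; a quick shell sanity check:)

      echo $(( 2**32 / 100 / 60 / 60 / 24 ))   # prints 497 (days)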

    These days I have personal servers that have been up over 4 years, a few servers at work a few weeks away from hitting 6 years, and at least a few switches at over 7 years of uptime. In all cases software updates either aren't being released any more or there's nothing worthwhile to update to.

    1. Anonymous Coward
      Anonymous Coward

      Re: i did this too

      You had a CyberPower that didn't go bang the minute there was a power cut? Literally: there were scorch marks inside. Happened twice.

  29. dmesg Bronze badge

    Hmm. A friend, when in the Navy, used to work on electronics and computer systems on board submarines. One of the ones that was even more, ahh, technical, than most of the sub fleet. He told me of swapping out /main memory/ on one of the systems while it was running.

  30. billdehaan

    Tempting, but dangerous

    In 1992, I was migrating a server room to a different floor in the same building. The company was expanding, so they'd rented the full 10th floor, and were abandoning the half of the 3rd floor they had occupied for about six years.

    The new server room was built, the power, A/C, HVAC, and halon systems were all approved, tested, and good to go, so now it was just a matter of moving 150+ machines. They were mostly Sun IPCs and IPX machines, but there were a few HP machines, and even an SGI box or two. All the networking and power connections were set up, so all that should be required for each machine was doing a proper shutdown, unplug all the wiring, throw it in the elevator, go up 7 floors to the new server room, find the properly labelled bench, plug everything in, turn it on, and everything would be hunky dory.

    It was booked for a long weekend, all users were informed, so all user workstations were shut down, there were no open files or pending requests on the servers, etc. Everyone left on Friday night, and Tuesday morning, they'd go to their new office, where everything would be the same.

    We had a fireman's line with one group disconnecting machines and loading them on a cart, a second team moving the cart to the new floor and unloading the machines, and the third team putting the machines in their new location and setting them up. In theory, it should only take a few hours to move all machines.

    We all know the difference between theory and practice.

    As it turned out, nearly 40% of the machines failed to turn on. We tried bringing them back down to the old floor, but no luck. So, we powered down some of the machines, and tried to power them back up again on the same floor, and the 40% number held. Almost half the machines wouldn't power back up after a shutdown.

    Fortunately, all of the machines that were DOA were Suns, so we only had to deal with one vendor. Even more fortunate, it was a big customer with lots of money, so it had enough clout to get a Sun tech person on site within an hour (3am on a Saturday morning).

    The culprits were the power supplies. Having never been power cycled for half a decade, many had faults occur that were never detected. Well, they would be detected on the next power cycle, but because the admins were competing to see whose subnet of machines could have the highest total uptime, most of which were over 2,000 days at this point, they went to extraordinary lengths to never reboot. Patches were hotswapped in, and other voodoo was performed to keep the uptime going.

    Fortunately, it wasn't the hard disks, and the machines were an off-the-rack configuration, so they could pop the disks out of a unit with a bad power supply into a new machine and get it up and running quickly. All we had to do was get about 80 new Sun workstations at 4am, have them delivered, swap out hard drives, and that was it. Easy peasy.

    The Sun rep saw a multi-million dollar account being lost on his watch, so he moved heaven and earth, and actually got us the machines in under four hours. I think he aged a decade in those four hours. We were amazed, and speculation was that he had pictures of Scott McNealy with a sheep, or something. But he managed to pull it off.

    Some security configurations that used MAC addresses had to be modified, but other than that, it was surprisingly uneventful.

    Our "should only take a few hours" migration that was supposed to be complete by Saturday noon, leaving Sunday and Monday for testing ended up taking until 9pm Sunday. We got everything working, but we were glad that we're migrated the small server room of 150 machines, not the major data center with over 2,000 machines.

    Before doing that undertaking, a new rule was instituted that all servers would, on rotation, be cold booted every 60 days, specifically to avoid a repeat. And that data center with 2,000+ machines? Its admins didn't have uptime competitions, so failure rates were nowhere near 40%, but they were still around 5%.

    Multi-year uptimes are great, but make sure you've got redundancy, and factor in the possibility that you may have to take everything off line at the same time at some point.

  31. Anonymous Coward
    Anonymous Coward

    All it takes is an industrial-grade handcart with a UPS on the bottom shelf and a server with dual power supplies. If you're only moving the machine across the room, an ethernet switch and a 100m cable will let you move the machine without it going offline for more than a second. If you're moving it from one end of the building to the other (and those rooms are 600m apart), don't suggest running that 100m cable down the aisle to the next wiring closet from the cart just to keep the machine online. Your co-workers will never look at you the same way again.

  32. jockmcthingiemibobb

    I'm surprised his CLI login to the mail server was maintained whilst its network interface was down during the migration to the other building.

    1. Jou (Mxyzptlk) Silver badge

      Maybe he tried a keyboard, mouse and monitor locally?

      1. TheWeetabix

        I interpret that to mean that he was later logged into several machines, probably checking that they had survived the move, and rebooted the wrong one.

  33. Benegesserict Cumbersomberbatch Silver badge
    Coat

    For his next trick

    He tried to keep the England Cricket team's uptime greater than two days.

  34. IGotOut Silver badge

    Pffffttt...

    400 days? Bloody amateur computer users. Us (ex) telecoms guys laugh in your face.

    Try a Nortel Meridian Option 81 with over 5 1/2 years, and that includes removal of melted cards after a lightning strike on the lines. Or an Option 61 that was shut down after four years, only because we moved offices. Heck, even the baby 11s only got a reboot when the power died for so long that the UPSes ran out.

    As for Baystack routers and switches, you NEVER rebooted them unless you fancied a complete rebuild and restore.

  35. Lee D Silver badge

    I once did something similar.

    We were required to move the servers from one electrical circuit to another while some work was performed to bring in a new electric line into the server room (ironically, to prevent downtime).

    As this would mean significant downtime while the work was completed, and as I didn't want to have to deal with the fallout from users and even the bosses that had ordered this new electrical line, I came up with a better solution.

    I got a long extension lead, and a long patch cable.

    Due to the magic that is LACP and redundant power supplies, I was able to plug in the server from quite a distance into a distant room, then remove the old cables one by one, then move the server, then plug the extra cables back in in the new location. Rinse and repeat.

    I was aided in this by the fact that the cabinet was a very nice IBM-branded wheeled cabinet (the servers weighed an absolute ton), the server was a very nice and highly redundant IBM blade server (4 PSUs, and it could run on just one), the rest of the power was up (and I understood cross-phasing, but was very careful, and with redundant power supplies and a UPS in between one of them it wasn't an issue - the servers basically only see the resulting DC current), and all switches were LACP-capable and the server had been configured to LACP all ports when first purchased (foresight for the win).

    Once the electrical work was complete, we reversed the procedure and put it back where it was supposed to be, zero downtime.

  36. PM.

    My personal record is 587 days uptime for the Proxmox server in my house's basement

    1. Lee D Silver badge

      Mate, I have seen laptops with a higher uptime.

      My personal (not professional) record is 1500 days (server obviously not public-facing at all).

  37. PM.

    Yeah, I know, I know. Then someone will chime in saying they have a VAX running in their closet since 1997 non-stop ;-)
