That time a JPL engineer almost killed a Mars Rover before it left Earth

Engineer and astro-futurist Chris Lewicki wrote an essay this week - that would not be out of place in our Who, Me? archives - about a testing mistake that nearly transformed half a billion dollars worth of Mars Rover into spacecraft scrap. Lewicki posted the tale to his personal site, describing work on the Spirit Mars Rover …

  1. Jason Bloomberg Silver badge
    Flame

    Measure twice, cut once.

    Throwing the switch is always a stressful moment even when I have checked, double checked, triple checked, that it's not all going to end in a puff of smoke, have had others run their own checks.

    He was lucky it didn't become another reminder of why it's never a good idea to have people working long hours, doing double shifts, being put under intense pressure to meet deadlines.

    1. Ian Johnston Silver badge

      Re: Measure twice, cut once.

      Throwing the switch is always a stressful moment even when I have checked, double checked, triple checked, that it's not all going to end in a puff of smoke, have had others run their own checks.

      Which is why it is incumbent on others to stand quietly behind the switch thrower and make a loud "Bzzzzzzrt" noise at the psychological moment, then hand them some toilet paper.

      1. HMcG

        Re: Measure twice, cut once.

        A single hand-clap behind their head at the precise moment they throw the switch generally results in much fun as well.

        1. Dafyd Colquhoun

          Re: Measure twice, cut once.

          Rolling a trolley over bubble wrap sounds like significant arcing too. Was accidental the first time, but after seeing the reaction it wasn't the last :-)

        2. Doctor Syntax Silver badge

          Re: Measure twice, cut once.

          An ex-colleague's preferred weapon was a steel ruler slapped down on the bench.

      2. jake Silver badge

        Re: Measure twice, cut once.

        Some of us expect it ... since time immemorial, the command I use to turn on new equipment has been "smoke it". For repaired equipment it is always "smoke test".

        Fortunately the magic smoke usually stays put ...

    2. Joe Gurman

      Re: Measure twice, cut once.

      Or even just before lunch, when some people's minds stray....elsewhere. For an example, see: https://space.stackexchange.com/questions/1783/was-the-noaa-n-prime-satellite-really-dropped-on-the-floor .

    3. jake Silver badge

      Re: Measure twice, cut once.

      "Measure twice, cut once."

      Unless you're doing 3-D work, like timber framing.

      Measure lots and then cut minimally.

      1. Doctor Syntax Silver badge

        Re: Measure twice, cut once.

        And have some spares, just in case.

        1. Benegesserict Cumbersomberbatch Silver badge

          Re: Measure twice, cut once.

          Isn't that the first rule of government contracting? Why build one when you can have two at twice the price?

          1. khjohansen
            Coat

            Re: Measure twice, cut once.

            - two at *Triple* the price - TFTFY

            1. Not Yb Bronze badge

              Re: Measure twice, cut once.

              And the ability to buy a new one twenty years later if necessary.

            2. Snowy Silver badge
              Joke

              Re: Measure twice, cut once.

              Only triple the price! You're being rather generous, after all this is government work! Why not just add another zero to the end? The maths is so much simpler ;)

  2. Jou (Mxyzptlk) Silver badge

    Indeed a good "Who Me?"

    The complexity of those missions is so beyond what most, including me, can even imagine. I know it is science, but there is a point where it gets magic, no matter how much you know...

    1. Pascal Monett Silver badge

      Re: Indeed a good "Who Me?"

      Agreed.

      Having to check that 10,000 pins are correctly connected, down to the last one ?

      Lord preserve me from that.

      1. I ain't Spartacus Gold badge

        Re: Indeed a good "Who Me?"

        "9,952."

        "9,953."

        "Hi John, you fancy a coffee?"

        "No thanks. Now where was I? Um was it 9,593 or 9953? Oh bugger!"

        "1"

        "2"

        "3"

        "Hi John, you coming for dinner this evening?"

        "AAAAAAARRRRRGGGGGHHHHH!!!!!!!"

        ---------------------

        "And that, ladies and gentlemen of the jury, is why you should find my client not guilty."

        1. jake Silver badge

          Re: Indeed a good "Who Me?"

          For anyone who doesn't know, one doesn't test the wires consecutively, one tests them connector by connector.

          It is exacting, but not difficult.

      2. JWLong Silver badge

        Re: Indeed a good "Who Me?"

        Having to check that 10,000 pins are correctly connected, down to the last one ?

        I learn'd how to wire wrap in the 70's. We ohm'd out every trace. The largest assembly I did was 3K connections.

  3. Alan J. Wylie

    Main B Bus undervolt

    Oh no, not again!

    1. usbac

      Re: Main B Bus undervolt

      Apollo 13 is one of my all-time favorite movies. It prompted me to read Jim Lovell's book "Lost Moon".

      However, every time I hear the misquoted line in the movie, it just makes me grit my teeth!!

      1. Doctor Syntax Silver badge

        Re: Main B Bus undervolt

        A manager from years back said one of his former managers had a set of phrases to describe different grades of oopsies. The top grade was "Houston, we have a problem".

        1. Not Yb Bronze badge

          Re: Main B Bus undervolt

          I prefer "Obviously a major malfunction"

          1. Someone Else Silver badge

            Re: Main B Bus undervolt

            I like, "Malfunction, Stephanie!"

      2. ChrisC Silver badge

        Re: Main B Bus undervolt

        And comparing the film dialogue to the actual mission control recordings can be a jarring exercise, given how the dialogue switches from being a verbatim copy of what was actually said to going off on a flight of the scriptwriters' fancy, before returning right on cue to the real world again. Mind you, what's even more jarring in the recordings is realising just how much of a combined team effort it was across all of the mission control teams, whilst anyone familiar only with the film adaptation of the story would be entirely forgiven for thinking the entire mission was handled by Gene Kranz and his team.

  4. This post has been deleted by its author

  5. Boris the Cockroach Silver badge
    Facepalm

    Now that's

    what you call an 'oh no' second.

    Think I'd have quit and hid in a different state

  6. Joey Potato
    Coat

    The loss of telemetry was not a random event ...

    ... it was no Fluke.

    1. jake Silver badge

      Re: The loss of telemetry was not a random event ...

      Hard to say as his results weren't very Klein.

      1. Yet Another Anonymous coward Silver badge

        Re: The loss of telemetry was not a random event ...

        I'll AVO go as well

  7. Anonymous Coward
    Anonymous Coward

    >"The monitoring multimeter I disconnected was actually completing the circuit that powered the spacecraft's ground test telemetry...."

    The multimeter was almost certainly measuring current, not voltage (as alluded to higher up in the article), if disconnecting it broke continuity.

    Stuff like that is why I prefer bespoke test harnesses over breakout cables. Oh, and labels, lots of labels. I've worked with too many people who count on remembering "this red wire is +12V, *that* red wire is 3.3V".

    1. Anonymous Coward
      Anonymous Coward

      The red wire is 12v, the scarlet one is 5v and the crimson one is 3.3v.

      Clear?

      1. usbac

        Or, when you get a piece of equipment where the entire color coding scheme was "Red"

        1. Yet Another Anonymous coward Silver badge

          I think there is one safety critical standard that all the wires must be the same colour. Idea is that it requires you to check each connection individually rather than assuming somebody else wired it correctly and that RED=12V

          1. JWLong Silver badge

            I think there is one safety critical standard that all the wires must be the same colour.

            Generally on commercial or industrial equipment if all wires are the same color then each wire is individually numbered with either a tag(s) or a printed number in the insulation.

      2. Anonymous Coward Silver badge
        Boffin

        The clear one is the earth lead.

        Not applicable to the Mars rover ;-)

      3. mirachu Bronze badge

        Any maroons?

  8. Julz

    Just

    Why. Why was the test rig bespoke and not part of the normal processes and built with the rest of the trundle bot?

    1. jake Silver badge

      Re: Just

      One word: Weight.

      Seriously, why ship a human-oriented test rig all the way to Mars when there will be no humans around to interact with it for (realistically) several decades or more?

      Besides, you then have the problem of testing the test rig ...

    2. My other car WAS an IAV Stryker

      Re: Just

      It was designed specifically for that one-off custom-designed rover and likely built at the same time, but still "bespoke" since test equipment from a different rover certainly wouldn't work!

    3. Richard 12 Silver badge

      Re: Just

      Everything about a space exploration mission is bespoke.

      Until recently that included the massive rocket.

      1. Missing Semicolon Silver badge

        Re: Everything about a space exploration mission is bespoke.

        ... built by the lowest bidder.

    4. Not Yb Bronze badge

      Re: Just

      Pretty much every part of that "trundle bot" was bespoke. Might have some OTS components here and there, but NASA space-rated stuff is deliberately over-engineered. Partly for durability and uniqueness reasons, partly because NASA is a research institution, and partly because "we have a budget we must spend this year or we get less next year" like almost every other bureaucracy everywhere.

  9. Antron Argaiv Silver badge

    Who among us has not done something similar in our careers? I know I have, and at least once have broken something expensive.

    From that experience I have developed some rules:

    - Work with a buddy...they may spot something you've missed, or ask a good question.

    - DO NOT work when tired. Take a break or go home and sleep.

    - Obsessive labelling is not a sin

    - Do respect others' test setups and give no quarter if they do not respect yours.

    - Never assume, always check. Always consult the documentation before trusting your memory

    - Take copious notes. Include diagrams.

    - Take pictures before taking it apart (my iPhone photo stream has equal amounts of work and personal images)

    1. Anonymous Coward
      Anonymous Coward

      Pulling an all nighter doing a migration.

      It's 3am and I have an hour before the step finishes so I might as well go and upgrade that raid box I've been meaning to fix.

      Type command, see red lights flash, then realize it counts slots from 1 not zero and so drive 1 was the source not the destination.

      1. 42656e4d203239 Silver badge

        Been there, got the T-Shirt.... replaced all drives in one disk pack (of 2). Rubbish UI on the controller persuaded both self and oppo, who was checking every move because I knew there was ample opportunity for cockup, that we had selected the right array. Hit initialize, yes we were sure, really sure, really really sure.... until we weren't. Looked at the drive activity lights and realized the inevitable.

        Went home, forgot all about it till Tuesday, came in early, initialized the correct drives and restored from backup - everyone was happy (aside from the day's down time for files that were sent to the great bit bucket in the sky).

      2. phuzz Silver badge

        Better than HP. In their gen8 (or maybe 9) servers, the disk caddies had a large red light on them, which if you looked closely had a big exclamation mark on it. This light indicates "Do Not Remove", however, if you don't know this, and are sent to swap the failed disk in a server with a mirror of two drives, you might well assume that the big red light is on the failed drive.

        I think in the end I only caused about half an hour of downtime, but I still blame HP.

        1. Not Yb Bronze badge

          Printer ink subscriptions are still worse.

        2. Yet Another Anonymous coward Silver badge

          The big flashing red light and siren tell you that this room isn't on fire. It's the room where we turned out all the lights so you can see the fire more easily

      3. Spazturtle Silver badge

        As Jesus said "Let he who has never fucked up a dd command cast the first stone" or something like that.

        1. Anonymous Coward
          Anonymous Coward

          Yeets stone

  10. Snowy Silver badge
    Coat

    Test, test and test again

    Just do not test too hard or you may break it.

  11. Anonymous Coward
    Anonymous Coward

    Professionalism

    It's posts like this that make me question the education and experience of NASA "engineers". 10,000 wires point to point? I guess there are no standard methods of designing things... With that budget, can't they hire any competent, experienced electrical and mechanical engineers? It's like watching an episode of The Office...

    1. jake Silver badge

      Re: Professionalism

      The 10,000 was hyperbole.

      However, it wouldn't surprise me if a toy like this had somewhere in the range of a few thousand individual wires for the care and feeding of its multiple disparate parts. Note that it's not really "point to point", the wires are built as a series of harnesses, which are individually tested before installing on the machine.

      Yes, I know, that's a pic of Curiosity, not the much less complicated Spirit.

  12. John Brown (no body) Silver badge

    You can't kill a Spirit...

    ...you can only exorcise them :-)

  13. cageordie

    I hate breakout boxes.

    Make a proper cable. Test it before you use it. Always wonder "what is the worst that could happen" and assume that some day someone will do that.

    We have a $10,000 piece of hardware in each test setup that will last three minutes without cooling. When we originally made the rig we used diodes to supply power to a cooling fan from the pre and post launch PSUs so that we would always know it was cooled if it was powered. Two years with no incidents.

    Then there was a glitch in the program, mechanical had a redesign before other teams could continue. All the other engineers were assigned to new programs. After a year my new program was done and they needed me back on the original program. To my surprise there were now 3 PSUs. The new folks thought it was more important to be able to check the voltage and current on the cooling fan before powering on. We complained. We were overruled by the new chief engineer, because she trusted the people she knew and they had told her our way was wrong.

    A year later we have toasted $30,000 in hard to replace hardware, because people assume the cooling is always on, and they don't check. If we have a mains failure, common where thunderstorms and snow happen, the PSUs reset to 0 Volts and Amps. So one time someone checked the PSUs and found the cooling was not on, so they enabled it. Then they powered on, but nothing happened. They recalled the power settings and tried again. Three minutes later they stopped getting data from the subsystem. Yes, the cooling was powered, with 0V and 0A.
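
The failure in that last story comes down to treating "output enabled" as if it meant "fan is actually running". A rough sketch of the check that was skipped might look like the following. Everything here is an assumption for illustration: the PsuState fields, the function name and the minimum thresholds are hypothetical and do not describe the real rig or any real instrument API.

```python
from dataclasses import dataclass


@dataclass
class PsuState:
    """Snapshot of one bench supply (hypothetical fields, not a real instrument API)."""
    output_enabled: bool
    voltage_setpoint: float  # volts
    current_limit: float     # amps


def cooling_is_live(fan_psu: PsuState,
                    min_volts: float = 1.0,
                    min_amps: float = 0.05) -> bool:
    """Treat cooling as live only if the output is on AND the setpoints are non-trivial.

    The thresholds are placeholder assumptions; a real rig would use the
    fan's rated values.
    """
    return (fan_psu.output_enabled
            and fan_psu.voltage_setpoint >= min_volts
            and fan_psu.current_limit >= min_amps)


# After a mains failure the supply comes back at 0 V / 0 A; enabling the
# output at that point still leaves the fan unpowered.
after_mains_reset = PsuState(output_enabled=True, voltage_setpoint=0.0, current_limit=0.0)
normal = PsuState(output_enabled=True, voltage_setpoint=12.0, current_limit=0.5)

print(cooling_is_live(after_mains_reset))
print(cooling_is_live(normal))
```

A software check like this is only a backstop. The original diode arrangement made the unsafe state impossible in hardware, which is the stronger design.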
