Mars probe crippled by buggy SSD successfully jury-rigged

A space probe in orbit above Mars, crippled by a fault in its solid-state memory, has been brought back on line and is now once again handling scientific data. The Mars Express spacecraft, which has been orbiting the Red Planet since 2003, has been suffering problems since August in which it has repeatedly gone into "safe mode …


This topic is closed for new posts.
  1. Z-Eden

    Sir, we reconfigured the memory arrays

    Excellent stuff

    1. ~mico

      Now, bring the sensors back online.

      Good job, chief.

      1. James Hughes 1

        But remember

        "You canna change the laws of Physics."

        1. Marcus Aurelius

          Not forgetting

          Make it so, Number One.

        2. Darryl

          Dammit Jim

          I'm a doctor, not a BOFH

  2. Andy Farley

    Fixing IT problems IS boring.

    But fixing spacecraft rocks.

    You go Interplanetary BOFHs.

  3. amanfromMars 1 Silver badge

    At last, ESA universal leadership .... Bravo. And now, for something completely different* ...

    How very convenient and perfectly timed for out of this world communications traffic in these oddest of future days.

    * Pinging AIVD/MIVD re CyberIntelAIgent Security and Intelligent Information Delivery Systems fax/copy/missive from Schiphol Base 1st September 2011

    If you would just humour this post, El Reg, and give it some breathing space on this thread, as it is beta testing some extremely sophisticated IP which has been gifted and is for gifting to any and all who are into SMART Future Virtual Reality Systems for Alternate Reality Gaming Systems of Operation, AI Command and Cyber Control.

    You surely know you are right at the front of the queue for insider scoop information on rapidly unfolding and virtually classified developments, which put the likes of a Big Brother into the shade.

    1. Jaymax

      It's file Jim

      ... but not as we know it.

  4. Paul RND*1000

    Makes me feel a little less smug about being able to remote login to my home PC while at work, a whopping 6 miles away, I'll tell you that much.

  5. Lee Dowling Silver badge

    So their solution is to swap it out to file storage? Pssh. Amateurs. Have they not heard of the BadRAM patches?

  6. Will Godfrey Silver badge

    I confess to being somewhat surprised that there wasn't already provision to page out bad memory.

    1. Anonymous Coward

      Page it out to what?

      The SSMM is effectively the "hard disk" for the spacecraft. File storage is the problem, not bad RAM per se.

  7. Anonymous Coward

    Running a Sandforce controller?

    Maybe they should just upgrade their firmware...


  8. Dick Emery

    Safe Mode?

    Is this thing running on Windows? That explains a lot.

    Had to be said.

    1. BristolBachelor Gold badge

      Your joke made me smile, but it's not quite Windows.

      Windows would display a blue screen and stop talking to anyone or doing anything until someone went over there and "turned it off and on again".

      No, this safe mode is more like "Oh fuck what the hell just happened!? Turn off everything that isn't essential (like Sky TV transmissions), and make sure that the antenna for receiving commands is pointing in the right direction so we can receive commands to do stuff, and turn the solar panels to get the maximum power in case things go pear-shaped."

      1. MacGyver

        I always wondered what would happen if an unlucky energized particle happened to hit an atom inside one of their storage devices. I'm just happy to have our wonderful magnetic field shield in place.

        Thank you again Earth's spinning iron core, for making our IT jobs a lot easier (and keeping us alive too).

        1. Eddie Edwards

          Happens here too

          "Studies by IBM in the 1990s suggest that computers typically experience about one cosmic-ray-induced error per 256 megabytes of RAM per month." - from Wikipedia's Cosmic Rays entry.

          Hence ECC RAM. The Cell has ECC on its internal 256K-per-SPU memory, which surprised me at the time but I guess that's the way things have to go. No doubt other modern CPUs have the same thing on their caches.

          On a probe, there would be many more cosmic ray events, so I suspect they've designed for it too. Probably using a combination of shielding and hardware ECC.
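To put the quoted IBM figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the rate scales linearly with the amount of RAM and that the 1990s figure still applies, neither of which the quote claims; the function names and the 8 GB example machine are made up for illustration.

```python
# Quoted figure: about one cosmic-ray-induced error per 256 MB of RAM per month.
MB_PER_ERROR_PER_MONTH = 256


def expected_errors_per_month(ram_mb: float) -> float:
    """Expected soft errors per month, ASSUMING linear scaling with RAM size."""
    return ram_mb / MB_PER_ERROR_PER_MONTH


def days_between_errors(ram_mb: float, days_per_month: float = 30.0) -> float:
    """Average days between errors under the same linear-scaling assumption."""
    return days_per_month / expected_errors_per_month(ram_mb)


if __name__ == "__main__":
    ram_mb = 8 * 1024  # hypothetical 8 GB machine
    print(expected_errors_per_month(ram_mb))
    print(days_between_errors(ram_mb))
```

On those assumptions a machine without ECC would see a flipped bit roughly daily, which is one way to read the "hence ECC RAM" remark.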

        2. BristolBachelor Gold badge

          In my project, every bit is stored 3 times, and then the 3 stored values are compared. An energetic particle will only be able to change the state of one of the 3, so majority voting sorts it out.

          In more critical systems, the 3 bits also have a delay after them that is different for each of the 3, so a transient in the power that affects all of them actually happens at a different time when the outputs are compared, so again it is cancelled out.

          And in one other system, the value of the bits is stored on a capacitance sooo big that even multiple strikes on any part of the circuit can't change the value stored (although that means re-programming values does not happen in ns!)

          In terms of here on Earth, the atmosphere makes the biggest difference to the type of particle that may cause problems, and hence high-flying aircraft suffer more than RAM on the ground.
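The store-three-times-and-vote scheme described above can be sketched in software. This is only an illustration of majority voting, not the poster's actual hardware design: the `TripleStore` class, its method names, and the rewrite-on-read "scrubbing" step are all assumptions added for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit takes the value held by at least two copies."""
    return (a & b) | (a & c) | (b & c)


class TripleStore:
    """Hypothetical store that keeps three copies of a value and votes on read."""

    def __init__(self, value: int) -> None:
        self.copies = [value, value, value]

    def flip_bit(self, copy_index: int, bit: int) -> None:
        """Simulate a particle strike flipping one bit in one copy."""
        self.copies[copy_index] ^= 1 << bit

    def read(self) -> int:
        """Return the voted value and rewrite all copies with it (assumed scrubbing)."""
        voted = majority_vote(*self.copies)
        self.copies = [voted, voted, voted]
        return voted


if __name__ == "__main__":
    store = TripleStore(0b10110010)
    store.flip_bit(1, 3)      # upset a single bit in the second copy
    print(bin(store.read()))  # the other two copies outvote the corrupted one
```

Voting only masks an upset that lands in one copy; if the same bit flips in two copies before the next read, the wrong value wins, which is why rewriting the copies periodically is a common companion to this approach.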

  9. Anonymous Coward

    Awesome.. and more info ...

    "Fred Jansen, the Mars Express mission manager, said the spacecraft has recovered from its last safe mode event and successfully completed initial testing of the workaround, which involves a new way of storing commands aboard the probe before they are executed.

    Instead of using a special file in the solid-state mass memory unit, the commands would be housed in a hardware-based timeline store outside the memory system, bypassing the issue believed to be the cause of the safe modes.

    Jansen said the Mars Express radar sounding instrument, named MARSIS, conducted test observations Monday with no problems."

  10. Annihilator

    Safe mode charging

    So when it's in safe-mode, it needs to align the mirrors to keep the solar charging. And when it's fully up and running it powers itself... how?

    Genuine question!

    1. MacroRodent

      Re: Safe mode charging

      "So when it's in safe-mode, it needs to align the mirrors to keep the solar charging. And when it's fully up and running it powers itself... how?"

      With a charged battery? I guess the normal operation orients the spacecraft so that its instruments point towards Mars, and the solar panels receive less light occasionally. That is fine, it will run on batteries during those periods. In safe mode they don't want the risk of the spacecraft doing something silly that would eliminate the charging periods (like starting to point the panels in the opposite direction to the Sun) and then dying for lack of power. Probably they also want to ensure there is maximum power available for recovery attempts.

      1. jubtastic1

        This is an educated guess

        I would imagine in normal operations the craft is orientated with sensors towards the planet, like the moon or a comms sat for instance, so the solar array would still receive sunlight for periods of the day but would be at acute angles for most of the sunside orbit. Safe mode I would guess sets the satellite in a tumble* so that the panels are always facing the sun while the instruments peer into space, planet, space, planet and so on.

        *Technically, in normal operations the satellite revolves around its axes, to keep the planet in view as it free falls round it, while in safe mode it stops revolving, which to an observer would look like it's tumbling.

  11. AndrueC Silver badge

    But at least SSDs are faster than hard disks, right?


    Seriously though - nice work guys.

  12. alwarming

    Mars Pathfinder priority inversion issue...

    Does this remind anyone of the Mars Pathfinder issue, when they had some debug code in the Pathfinder that sent back stack traces of the panic or something like that, and a fix was uploaded?

  13. Cloudscout


    How did you resist the urge to call them BOFH-ins?

  14. Wombling_Free


    Or whatever your favourite refreshment / stimulant is.

    Have some.

    I love hearing about epic spacecraft-recovery wins!

    Yay for boffins!

  15. Winkypop Silver badge

    Science, even at a distance...

    Still beats bronze-age beliefs in the hand....

  16. Michael H.F. Wilkinson Silver badge


    I think I will use this as an example of the awesome things computer science brings us when we have our next open day for potential students this coming Friday (apart from a host of other, more mundane examples).

  17. Mips

    Who said fixing IT problems is boring?

    It is boring, but not easy.

  18. Andus McCoatover

    More on this, please, El. Reg!

  19. Kurgan

    Fixing IT problems is boring?

    If you have a severe failure, a day or so of data that could be lost, and 150 workers that cannot work, well, this is NOT so boring.

    I believe that being a BOFH is like being a passenger airline pilot. You get months of boring work and then some really terrifying minutes (or hours) now and then.

  20. Anonymous Coward

    I wish my Windoze...

    ...would be able to do stuff like that in safe mode.

    You know, things like detecting that something is borked and enabling itself to INPUT COMMANDS and such, and not just stopping there BSOD'ing.

    Without hitting ctrl-alt-del first, or whatever.

  21. D.R.S.

    Howard's mother must be very proud

    ... even if he doesn't have a PhD.

  22. Cardinal

    Not again

    Maybe Howard's been allowing a "future Mrs Wolowitz" to drive the damn thing again!
