NASA’s radiation tolerant computer lives up to its name after surviving Van Allen belts

NASA has revealed its experimental Radiation Tolerant Computer has made it through the famously and furiously radiating Van Allen belts in one piece. The computer, known as the “RadPC”, went into space on January 15th atop a SpaceX Falcon launcher. NASA bought the machine a ticket on a mission run by Firefly Aerospace, which …

  1. Neil Barnes Silver badge
    Coat

    Lighter than air computer?

    Odd that it has to be mounted to the ceiling...

    1. Anonymous Coward

      Re: Lighter than air computer?

      This must be true as the mounting nuts aren't tight and it's still pressed firmly to the ceiling. And is that a tray of helium behind it?

    2. MrBanana Silver badge

      Re: Lighter than air computer?

      Looks like the photo has been flipped. Someone in the graphics department thought it better to have the logo and wording the right way up.

    3. werdsmith Silver badge

      Re: Lighter than air computer?

      Perhaps in some kind of lab where charged particles can be aimed at it.

      A picture of the thing unmounted here: https://science.nasa.gov/lunar-science/clps-deliveries/to19d-firefly/

      TIM in the Tomorrow People was mounted to the ceiling.

      1. Anonymous Coward

        Re: Lighter than air computer?

        TIM's flashing lights were - but we don't know where the rest of him was. He also had a "terminal" directly underneath.

        (I used to love that show...)

        1. werdsmith Silver badge

          Re: Lighter than air computer?

          Admirably inclusive in the days before it became a thing. True to its name.

    4. that one in the corner Silver badge

      Re: Lighter than air computer?

      You just have to look at the reflection of the bloke's head in the shiny film behind the box and ask yourself: is that someone looking down or craning their neck to look up?

      Answers on a postcard to the usual address.

  2. Pascal Monett Silver badge
    Trollface

    Two kilobytes ?

    Whatever happened to 640KB is enough for everyone ?

    1. Lazlo Woodbine Silver badge

      Re: Two kilobytes ?

      Guessing Windows 11 will be a little sluggish...

    2. GNU Enjoyer
      Unhappy

      Re: Two kilobytes ?

      Bill Gates didn't say that.

      He probably once stated that 640KiB was deemed to be more than enough for the company's malicious proprietary OS at the time and for the near future (with the relevant computer and OS becoming obsolete before such memory limitation would become an issue).

  3. Ian Johnston Silver badge

    How much lead (or other) shielding would be needed to keep an ordinary laptop safe? I'm guessing rather a lot, because otherwise people wouldn't go to all this trouble.

    1. Anonymous Coward

      I can assure you that screened Intel processors and FPGAs have made it out of orbit. You do need to go back a few generations to when the geometries were much larger.

      You have 2 ways of looking at this in hardware:

      Rad hardened processors, memory, etc.

      or multiple redundant systems (preferably in lock-step) and voting

      In software you have multiple instances of variables, a High Reliability File System, continuous test checking the multiple instances of code (in RAM and on storage), swapping of memory blocks to avoid hard errors, and many other techniques (that you have to pay lots of money for, if you really need it).
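The "multiple instances of variables" idea can be sketched as below. This is a minimal illustration only, assuming three in-memory copies and a simple majority vote; the class name `TripleVar` and its methods are hypothetical, and real flight software would pair this with hardware checks rather than rely on it alone.

```python
from collections import Counter


class TripleVar:
    """Hypothetical helper: holds three copies of a value and votes on read."""

    def __init__(self, value):
        self.copies = [value, value, value]

    def write(self, value):
        # Every write refreshes all three copies.
        self.copies = [value, value, value]

    def read(self):
        # Majority vote: at least two of the three copies must agree.
        value, count = Counter(self.copies).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no majority: more than one copy is corrupted")
        # Scrub: rewrite all copies so a single upset does not linger.
        self.copies = [value, value, value]
        return value


if __name__ == "__main__":
    v = TripleVar(42)
    v.copies[1] ^= 0x10  # simulate a single bit flip in one copy
    print(v.read(), v.copies)
```

Reading also repairs the bad copy, so a second upset later on is still survivable; two upsets between reads are not, which is why periodic scrubbing matters.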

      Relying on shielding is not good enough. FFS neutrinos travel through the Earth unimpeded and some other particles are nearly as interesting. However, if some of these particles hit your charged memory gate or transistor in the processor, then we are talking permanent damage.

      Well within the Van Allen Belts, a typical trans-Atlantic flight will see 7 corruptions of bits in a 1Gbit memory device (again we are going back a few generations). Nearly all will be recoverable using ECC, but permanent damage is still quite common, and both will invariably get worse as geometries reduce further.
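For readers wondering how ECC recovers a flipped bit, here is a toy sketch using the classic Hamming(7,4) code, which corrects any single-bit error in a 7-bit codeword. It is an assumption-laden illustration: real memory ECC uses much wider codes, and the function names are made up for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome); syndrome is the 1-based flipped position, 0 if clean."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the offending bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single event upset
    print(hamming74_decode(word))
```

A code like this only handles one flipped bit per codeword; two flips in the same word are miscorrected, which is one reason practical schemes add an extra parity bit for double-error detection.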

      Anon and purposely vague, as this sort of knowledge does not come cheap!

      1. aks

        AFAICR the Space Shuttle used the same voting concept plus a differently designed and programmed box that would take over if all else failed.

        1. MacroRodent

          voting computers

          The idea is old, I too recall hearing about it in connection with the shuttle. Always wondered what happens if the voting circuitry itself is hit. I guess it just has to be made super robust.

        2. John Smith 19 Gold badge
          Boffin

          "a differently designed and programmed box that would take over if all else failed."

          Not quite.

          The backup was another IBM 4Pi. IIRC the main systems used event driven interrupts but the backup ran a series of tasks in a round-robin scheduler. Restricted functionality but adequate to get you home.

      2. Mishak Silver badge

        multiple instances of variables

        Aren't always helpful, depending on the compiler/language that's used - it's much more reliable to use hardware to detect errors than to hope that the code can do it for you.

        For example, this code* does not call its error handler if the two copies of a value do not match.

        * yes, it's a very contrived example, but it shows what can happen.

      3. Justthefacts Silver badge

        Broadly correct, but…..[caveat, disclosing and boostering my IP in this field]

        Most discussion of space radiation effects tends to focus on Single Event Upset as the major problem to be solved, with attendant error correction, triple majority voting etc. And while this is true, the techniques have by now been known for decades and are widely implemented. It’s a solved problem.

        The harder issue is Single Event Latchup: a stray proton causes both sides of a transistor to switch on, and they *stay switched on indefinitely*, sinking a large amount of current to ground, and ultimately burning out the transistor. The only recovery technique is to detect and power-cycle, which needs to happen within a few milliseconds to tens of milliseconds, and you are looking for “unexpected” current spikes at the microamp level on top of dynamic currents of amps.

        Conventionally the only solution to latchup is believed to be re-implementing the CPU (and RAM chips!) on custom-built silicon technology, which has to be Silicon On Insulator.

        Working on this, me and my team came up with a scheme to reliably detect and recover latchup on off-the-shelf CPUs. We’ve demonstrated it on four different CPU manufacturers, requiring no modifications to either silicon or packaging, and have test results both from proton beamline and in-space operating data. Intel, AMD, Qualcomm Snapdragon ARM and Infineon ARM were all proven. This was all done over 15 years ago.

        Unfortunately, neither European Space Agency nor Airbus Space could be interested in the project, because it contradicted their preconceptions and they “knew it was impossible”. We showed them hard data and they just refused to even trial it.

        The project and data are still in my drawer, it was a few months of my life, and several hundred k of my own money I’m never getting back.

        For what it’s worth, SpaceX Starlink came up with a similar solution for their satellites a few years after we did. I have no reason to think they stole our work, it’s convergent evolution, and at this stage of owning my own company in a somewhat different area I have no interest in getting into court with SpaceX to enforce our IP anyway. But European Space Agency and Airbus still believe “it can’t work” so their CPU boards still cost them over a million dollars each rather than our solution costing $30k for a CPU with 10x performance.

        What can you do. Some organisations just can’t be helped.

        1. John Smith 19 Gold badge
          Thumb Up

          " Single Event Latchup: "

          Excellent point.

          Effectively a dead short across the power supply, destroying the gate and likely a cascade of other failures.

          You did this with OTS chips? Not requiring additional parts baked into the wafer?

          1. Justthefacts Silver badge

            Re: " Single Event Latchup: "

            Yes, no hardware modification, pure fast-detect-and-power-cycle in a separate dedicated chip. We have a patent so I can disclose.

            Roughly speaking, once you have triplicated for SEU, and running three CPUs in tight lockstep, you now have three current traces that decouple out the instruction-flow noise. It becomes much easier to spot the odd-one-out transistor-sized short current. It’s still a fairly taxing algorithm, but hardened in a pure state-machine in 90nm (!!) rad-hard technology it’s a small ASIC.
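The odd-one-out idea can be illustrated with a rough sketch. To be clear, this is not the patented algorithm described above, whose details are not given here; it only assumes three supply-current traces sampled in lockstep, takes the per-sample median as the reference, and flags a CPU that sits above it for several consecutive samples. The function name, threshold and sample counts are all hypothetical.

```python
from statistics import median


def find_latchup(traces, threshold_amps, min_consecutive):
    """Return the index of the CPU drawing persistent excess current, or None.

    traces: three equal-length lists of current samples (amps), one per CPU.
    """
    run_length = [0] * len(traces)
    for samples in zip(*traces):
        # With identical instruction flow, the median tracks the shared dynamic load.
        reference = median(samples)
        for cpu, amps in enumerate(samples):
            if amps - reference > threshold_amps:
                run_length[cpu] += 1
                if run_length[cpu] >= min_consecutive:
                    return cpu  # candidate for an immediate power-cycle
            else:
                run_length[cpu] = 0
    return None


if __name__ == "__main__":
    load = [1.0, 2.5, 1.8, 3.0, 2.2, 1.1, 2.9, 2.0]  # amps, shared by all three CPUs
    faulty = [a + (0.0005 if i >= 3 else 0.0) for i, a in enumerate(load)]
    print(find_latchup([load, load, faulty], threshold_amps=0.0002, min_consecutive=3))
```

Subtracting the median is what removes the amp-level instruction-flow swings, leaving a sub-milliamp offset that a plain threshold on a single trace would never see.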

            Three triplicated COTS CPUs implemented in modern 3nm technology are just insanely cheaper, higher performance and more power efficient than one space grade CPU implemented in rad-hard 45nm or even 28nm space leading-edge.

            And even more importantly than that: “the space industry way” is to implement all their algorithms in a space-grade chip, paying $10M+ for a total market size of a couple hundred chips. And then 5-10 years later spend all the same ESA money for the next silicon generation, because chip design is hard, just to double performance of same algorithms.

            Instead of just writing it all *once* in software, and porting it to a newer Intel, AMD or whatever chip, for a “free” performance doubling every chip release, which they don’t have to pay for. The performance penalty of putting heavy signal-processing in software rather than hardware is certainly significant…..but it’s just totally obliterated by space silicon tech being at least 8 generations behind leading-edge terrestrial silicon. Last I heard, 28nm STMicro was the aspiration for their next generation, which might still be some years away, that’s how broken this is.

            1. John Smith 19 Gold badge
              Unhappy

              " and running three CPUs in tight lockstep,"

              This I think is where it gets tricky.

              When NASA upgraded their SSME engine controller due to timing constraints they paid Motorola to create a special dual-M68000 chip to satisfy their cross checking requirements.

              I'm hoping that was a special case.

              In theory 3 processors running the same algorithm should all run in exact lock step.

              But if you're running 3 strings for redundancy you've almost certainly got 3 sensor strings producing raw data as well.

              Natural variation in those readings could (will?) cause a certain amount of misalignment over time. However I'd also expect under normal circumstances those would also drift back into alignment.

              IDK if they could be big enough to trip one of the strings out as faulty.

              That's where it gets tricky. *

              * Like the hardware state machine. Very robust. Eight generations behind current fabs? I thought it was no more than 1-2?

  4. Crypto Monad

    Glad to see it has a DB25 serial port, as all proper computers should have

    1. werdsmith Silver badge

      Which very usefully offers some scale to the photo.

      1. seven of five Silver badge

        Certainly better suited to judge the size than "… about the size of a slice of bread.”

        What does anyone in the US know about bread anyway?

        1. ArguablyShrugs

          Bread |bred| ('polystyrene' in British English)

        2. el_oscuro
          Boffin

          Those of us who have access to El Reg's handy standards converter know that a slice of bread is approximately 0.0005 NanoWales.

          https://www.theregister.com/Design/page/reg-standards-converter.html

    2. Caver_Dave Silver badge

      Very strange, as they are not very reliable in a high vibration environment - such as a rocket during launch.

      1. werdsmith Silver badge
        1. Gene Cash Silver badge

          OK, not even NASA knows how to properly use Loctite. I guess I really am the only person to actually read the damned instructions then.

        2. John Smith 19 Gold badge
          Thumb Up

          "https://llis.nasa.gov/"

          Ahhh, NASA's lessons learned database.

          Possibly one of NASA's least known but (potentially) most valuable resources. *

          Because not only does s**t happen but the odds-on bet is it has happened repeatedly before.

          *Along with its "Space Mechanisms" conference proceedings. These provide a staggering array of geeky gadgetry to solve all manner of issues with making mechanical stuff work in space.

  5. John Smith 19 Gold badge
    Coat

    “4 inches square and 0.5 inches thick … about the size of a slice of bread.”

    Hmm.

    That's quite a slice.

    The joker in the pack with all rad-hard stuff is what's the voting hardware made out of?

    Because of course if the voter hardware is stuffed a possibly good processor is dumped for a bad one.

    And 4 processors? I'm guessing one of them is a cold backup to replace the one that's outvoted.

    1. Justthefacts Silver badge

      Re: “4 inches square and 0.5 inches thick … about the size of a slice of bread.”

      Two valid options for the voter, depending on circumstance.

      a) A rad-tolerant space-grade pre-qualified FPGA (or even rad-hard, but Atmel are unbelievably shit). Expensive unit cost, but you don’t have lots of upfront development NRE and timeline risk.

      b) Small rad-hard ASIC, can be cheap unit cost, if your program might need even hundreds or thousands….but then there’s high NRE and timeline/qualification risk to develop.

      1. Justthefacts Silver badge

        Re: “4 inches square and 0.5 inches thick … about the size of a slice of bread.”

        Curious to know who the super-dedicated phantom downvoter is, and their reasoning.

        Presumably Lars, I guess. You must spend literally an hour a day checking and re-checking this board, to ensure you downvote my post within an hour, for the past two years. It’s your life, but it’s a sad waste of a life. And I don’t really take it personally, sorry to tell you. Anyhoo.

      2. John Smith 19 Gold badge

        "Two valid options for the voter"

        Most of my knowledge comes from the Shuttle GPCs, which obviously are at the low end of exposure barring solar flares.

        I remember the GPCs each wrote a code but I've got a hazy recollection that there was no voting hardware. The GPCs read each other's codes and drove a set of lights. It was then up to the commander to decide if they were going to power down one of them, or (worst case) switch to the backup running the totally different software.

        IIRC SX uses dual redundant ARMs as the processors and watchdog timers. I suspect there are quite a lot of options if you don't have to operate to govt rules and can accumulate statistics showing it remains reliable in LEO.

        However things get much more serious once you're into the inner belt and beyond.

        It's why I've always liked asteroid capture.

        Sure, riding around the solar system in a hollowed out rock looks a bit clumsy, but nothing says "Radiation protection" like metres of solid rock.

  6. Annihilator Silver badge

    Slice of bread standards

    One of the high-priority items on my political agenda, if I ever venture that way, would be to force bread companies and toaster companies to finally get their act together and agree on the size of a slice of bread...

    On a more serious note, doesn't every extra-terran mission pass through the Van Allen belts broadly successfully? There are plenty of landers, rovers and satellites exploring every planet in the solar system that have presumably made it through unscathed and with the right levels of protection, what's significant about this one?

    1. cray74

      Re: Slice of bread standards

      Depending on the destination and launch window, passage through the Van Allen belts can be minimized or avoided. Crewed Apollo flights skimmed the low energy fringes while insisting the crews stay in the relatively thick-skinned command module for the belt transit. Interplanetary probes could use the same technique. It seems like Blue Ghost was more deliberate in targeting the high-energy core of the belts.

      1. Annihilator Silver badge

        Re: Slice of bread standards

        Yeah fair enough, just hadn't seemed like the Van Allen belts were a significant blocker to any destination we'd decided on so far, but maybe it allows for more optimal trajectories.

  7. osxtra

    Up To Spec

    You can tell this is a Government machine by the DB-25 port. I'll bet it's used to connect the Space Printer.

  8. John Smith 19 Gold badge
    Unhappy

    *every* geo comms sat has been through the Van Allen belt

    Geo orbit is about 5.6 Re (earth radii). The belt is roughly 2.8-5.0 Re with doses of 0.01-0.05rads/second.

    IOW 1 hour = up to 180rads. Current standards reckon 450rads will kill 50% of those exposed, 250rads is reckoned to kill 1%.
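The arithmetic behind those figures is easy to check. The sketch below simply multiplies the dose rates quoted in this comment by exposure time; the constant and function names are made up for illustration, and the numbers are the commenter's, not independently sourced.

```python
SECONDS_PER_HOUR = 3600

# Dose-rate range quoted in the comment above, in rads per second.
DOSE_RATE_LOW = 0.01
DOSE_RATE_HIGH = 0.05


def dose_rads(hours, rate_rads_per_s):
    """Accumulated dose after a given time at a constant dose rate."""
    return hours * SECONDS_PER_HOUR * rate_rads_per_s


def hours_to_reach(dose, rate_rads_per_s):
    """Time needed to accumulate a given dose at a constant dose rate."""
    return dose / (rate_rads_per_s * SECONDS_PER_HOUR)


if __name__ == "__main__":
    print("1 hour, low rate :", dose_rads(1, DOSE_RATE_LOW), "rads")
    print("1 hour, high rate:", dose_rads(1, DOSE_RATE_HIGH), "rads")
    print("hours to 450 rads at high rate:", hours_to_reach(450, DOSE_RATE_HIGH))
```

So the 180rads figure corresponds to the top of the quoted range; at the bottom of the range an hour gives about a fifth of that.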

    This makes orbit raising, where you use low power thrusters like ion drives to progressively expand the orbit to the correct altitude, a poor plan for human carrying vehicles.

    Incidentally there is also an inner belt which starts around (IIRC) 1000miles out. It's just above the orbits of the GPS constellation, which are placed where they are to deliver maximum coverage with the fewest satellites without precision altitude control while giving long life.
