Airbus A350 software bug forces airlines to turn planes off and on every 149 hours

Some models of Airbus A350 airliners still need to be hard rebooted after exactly 149 hours, despite warnings from the EU Aviation Safety Agency (EASA) first issued two years ago. In a mandatory airworthiness directive (AD) reissued earlier this week, EASA urged operators to turn their A350s off and on again to prevent " …

  1. Tromos

    "...need to be hard rebooted after exactly 149 hours"

    Not good if the 149 hours is up while on final approach. I'm pretty sure you wouldn't have to wait for the full 149 hours otherwise this would cause MAX sized problems with scheduling.

    1. big_D

      Re: "...need to be hard rebooted after exactly 149 hours"

      You look at the maintenance schedule, calculate in the next flight (plus delay/headwind/circling time) and if it will be near or exceed 149 hours, shut everything down.
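The check described above could be sketched roughly as follows. This is only an illustration: the function name `needs_power_cycle`, its parameters and the idea of an extra safety margin are assumptions, not anything taken from the AD or from Airbus procedures.

```python
LIMIT_HOURS = 149.0  # continuous power-on limit quoted in the article


def needs_power_cycle(powered_on_hours: float,
                      planned_flight_hours: float,
                      contingency_hours: float = 0.0,
                      margin_hours: float = 0.0) -> bool:
    """Return True if the next flight could reach the power-on limit.

    contingency_hours covers delays, headwinds and holding;
    margin_hours is an optional extra buffer below the limit.
    """
    projected = powered_on_hours + planned_flight_hours + contingency_hours
    return projected >= LIMIT_HOURS - margin_hours


if __name__ == "__main__":
    # 140 h already on the clock, 8 h flight, 2 h allowance for delays
    print(needs_power_cycle(140, 8, 2))
```

Setting `margin_hours` to something like 24 would approximate the "reboot well before the limit" approach suggested elsewhere in the thread.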

      1. Yet Another Anonymous coward Silver badge

        Re: "...need to be hard rebooted after exactly 149 hours"

        It should reboot before 'Exactly' 149 hours.

        I'm not sure I would fix the bug. This is a known feature, the planes all go through a maintenance schedule more often than this. It may be safer to make a reboot part of the maintenance rather than introducing a software change

        1. ps2os2

          Re: "...need to be hard rebooted after exactly 149 hours"

          So, let me get this straight. How many software engineers are going to be needed to accomplish this? 10? 500? This sounds like a great job: place a software engineer in every first-class cabin and let them soak up the alcohol till they are snookered. Then they can reboot while under the control of alcohol.

          1. a handle

            Re: "...need to be hard rebooted after exactly 149 hours"

            I'm in :-)

        2. christooo

          Re: "...need to be hard rebooted after exactly 149 hours"

          Good thinking Berger!

    2. 's water music

      Re: "...need to be hard rebooted after exactly 149 hours"

      Not good if the 149 hours is up while on final approach

      yeah, I wouldn't start the reboot below FL300 to be on the safe side

      1. Mark 85

        Re: "...need to be hard rebooted after exactly 149 hours"

        yeah, I wouldn't start the reboot below FL300 to be on the safe side

        Since the plane needs a complete reboot/power down, I don't think I'd want to do that while flying at any altitude.

        1. Anonymous Coward
          Trollface

          Re: "...need to be hard rebooted after exactly 149 hours"

          I'm sure the passengers won't mind when everything goes dark and silent, and they hear the engines restarting one by one. They could do it when people are sleeping.

          1. UncleZoot

            Re: "...need to be hard rebooted after exactly 149 hours"

            That would be a big IF the engines do a restart.

            I can imagine making an Atlantic crossing in a storm, run out of time, everything goes black and the flight crew fighting to reboot the computers and get engines restarted in the rain while traveling @ 425kts.

            Once the engines shutdown, all heat for the control surfaces no longer exists so the leading edge of the wing, horizontal and vertical stabilizers begin to freeze. This would not be a good day.

        2. spold Silver badge

          Re: "...need to be hard rebooted after exactly 149 hours"

          Particularly since it involves going outside and using a paperclip to push the little button in the small hole in the nose.

    3. John Smith 19 Gold badge
      Unhappy

      Like the Multiplexer DeMultiplexer boxes on the space shuttle.

      They collect multiple data sources (and probably provide any driving signals for things like strain gauges) and format the data.

      However, the MDMs didn't run programs (although I think the later models running on the ISS do).

      As embedded flight avionics, that suggests they should be under the full DO-178B style development process. Nothing less.

      Incidentally, the first of these timer overflow bugs I heard of involved Patriot missile batteries left running for extended periods of time during the 1990-91 Gulf War (I heard about it much later).

      So it's not exactly an unknown failure mode.

    4. tip pc Silver badge

      Re: "...need to be hard rebooted after exactly 149 hours"

      Just reload it every 120 hours, leaves plenty of margin for delays etc to be well inside the 149 hours.

      1. Anonymous Coward
        Anonymous Coward

        Re: "...need to be hard rebooted after exactly 149 hours"

        "*exactly* 149 hours" !

  2. Anonymous Coward
    Anonymous Coward

    Cue the usual Scotty and the USS excelsior comparison about over complexity.

    1. TRT
    2. Anonymous Coward
      Anonymous Coward

      "Cue the usual Scotty and the USS excelsior comparison about over complexity."

      Sometimes warranted IMO. Does the steering control for the nose gear *really* need computer control? What's wrong with analogue wiring to a steering motor and a couple of max lock cut out sensors? Seriously.

      1. Alan Brown Silver badge

        "What's wrong with analogue wiring to a steering motor and a couple of max lock cut out sensors?"

        That depends if it can detect and warn that the nosewheel isn't straight before it touches down and if the sensitivity can be adjusted with speed. You do NOT want a sneeze on roll (landing or takeoff) putting you in the bushes or snapping the tyres off.

        More importantly a separate control path to the nosewheel would mean yet _another_ set of controls in the cockpit and pilots on the ground suffer from operational overload half the time anyway. That's why those checklists are so critical.

  3. Neil Barnes Silver badge

    Turn it off and turn it on again...

    Do they have to close all the windows as well?

    1. simonlb Silver badge

      Re: Turn it off and turn it on again...

      I hope not, that would be a real pane.

      1. BebopWeBop
        Trollface

        Re: Turn it off and turn it on again...

        And might be tricky in most real planes?

  4. Test Man

    So do all the pilots have to read the counter as part of their pre-flight checks in order to make sure that the plane's switched-on time doesn't exceed 149 hours? And what happens if the plane is delayed literally when it's on the runway: do the pilots have to recalculate while waiting, and if it's likely to exceed the limit while they are in operational command, do they have to turn the aircraft around, park again and then go through the switch-it-off-and-on-again procedure?

    1. b0llchit Silver badge
      Pint

      Automatic switch procedure

      The system will switch itself off if they forget to check or run into a too long delay. The only problem may be the fireball that causes the switching off. There are no known sparks involved in the switching off procedure and the switching on procedure expires when switching off is done by the system itself. Obviously, YMMV is appropriate. Please ensure to cash-in those miles before the switching off occurs automatically.

    2. Dan 55 Silver badge

      Presumably they include extra time for how long the next flight could last in their calculations, just as they do with fuel.

      1. big_D

        One would hope so...

      2. Mark 85

        Presumably they include extra time for how long the next flight could last in their calculations, just as they do with fuel.

        Presumably they would. But then there's Murphy who doesn't play by the rules.

        1. Bob Magoo

          I'm reliably informed that Murphy failed his plane driving test.

    3. Anonymous Coward
      Anonymous Coward

      Is the counter

      Even accessible to pilots, or is it just some hidden value no one can see, so the maintenance guys will need to put a sticker on the windshield, like the ones Jiffy Lube puts in the upper left of your car's, that tells you what mileage you should change the oil at next?

    4. katrinab Silver badge

      Well it mustn't exceed 149 hours before it lands at its destination, so if delays are going to make a difference you should reboot anyway. You don't want it to fall out of the sky while circling round London waiting for a landing slot at Heathrow.

      1. Tim Bates

        Can't exceed 149 hours before it is *parked*. They still need some of those systems to taxi, and airports handling larger aircraft are usually too busy to just leave it sitting on a runway exit for 30 minutes to get started again.

  5. Cuddles

    Why is there a choice?

    "The remedy for the A350-941 problem is straightforward according to the AD: install Airbus software updates for a permanent cure, or switch the aeroplane off and on again."

    As far as I can tell from this, the issue has been fixed - a patch is available and all airlines need to do is install it. So why is the "turn it off and on again" thing even being mentioned? Surely with a potentially safety critical problem like this, it should be a simple case of grounding all aircraft until the patch has been applied.

    1. Jens Goerke
      Devil

      Re: Why is there a choice?

      No time for maintenance, we're losing money while the bird is on the ground.

      1. Steve Davies 3 Silver badge
        Childcatcher

        Re: Why is there a choice?

        Memo... Insurer to Airline

        Install the patch or you are NOT insured.

        Yours

        Moneybags Insurance GMBH

        Seriously, aircraft need regular and, importantly, periodic inspections. That should be more than enough time to install the update.

    2. JimBob01

      Re: Why is there a choice?

      hmm… imagine you are waiting to board your plane when an announcement is made

      “There will be a short delay to boarding as the technician carries out some maintenance. We appreciate your understanding and hope to continue boarding as quickly as possible”

      Wait 20 minutes…

      “I have an update on the delay to boarding. It seems the technician has bricked the plane!”

      I would guess that the patching only becomes mandatory at the next planned service of the plane so that the process can be properly planned. Up until that point, cold rebooting every 100 hours should be sufficient.

      1. Sgt_Oddball
        Coffee/keyboard

        Re: Why is there a choice?

        Having been on a Fokker suffering software issues that required a call being passed to 2nd line phone support to figure it out (it was turned off and back on again but not everything came back up...) I can tell you it wasn't 20 minutes later that they'd bricked it and an hour later they finally got it to start up right.

        Escape because they wouldn't let us get off..

        1. Doctor Syntax Silver badge

          Re: Why is there a choice?

          I've been stuck waiting for a plane where they found the fuel connection wouldn't cut off so they couldn't remove the hose. Very difficult to take off with the tanker still attached.

          1. Pascal Monett Silver badge
            Trollface

            Very difficult ? Nah. Very eventful, though.

        2. PC Paul

          Re: Why is there a choice?

          What makes you think the doors would open?

        3. Anonymous Coward
          Anonymous Coward

          Re: Why is there a choice?

          Bloody Fokkers

        4. bpfh
          Joke

          Re: Why is there a choice?

          So they bricked the little Fokker?

      2. Stoneshop
        Facepalm

        Re: Why is there a choice?

        “There will be a short delay to boarding as the technician carries out some maintenance. We appreciate your understanding and hope to continue boarding as quickly as possible”

        Wait 20 minutes…

        "The technician has found he doesn't have the right adapter cable. One will be sent from Toulouse at the earliest opportunity. We're sorry for the further delay."

        1. I ain't Spartacus Gold badge
          Happy

          Re: Why is there a choice?

          "The technician has found he doesn't have the right adapter cable. One will be sent from Toulouse at the earliest opportunity. We're sorry for the further delay."

          The technician has now begun the process of installating from floppies, 432 of them, but is unable to find the spacebar...

          1. OssianScotland
            Coat

            Lemon Scented Paper Napkins

            No further comment needed

            Icon - there's a frood who knows where his towel is

          2. keith_w

            Re: Why is there a choice?

            "The technician has now begun the process of installating from floppies, 432 of them, but is unable to find the spacebar..."

            I suppose that is better than not being able to find the "any" key.

          3. Anonymous Coward
            Anonymous Coward

            Re: Why is there a choice?

            > The technician has now begun the process of installating from floppies, 432 of them, but is unable to find the spacebar...

            The spacebar was removed at the same time they removed the cattle-class legroom...

            1. taxythingy

              Re: Why is there a choice?

              It was the heaviest key, so made sense to the bean counters.

          4. Flywheel

            Re: Why is there a choice?

            "Bad Sector Error Disk 65 - Abort/Cancel?"

          5. elkster88
            Boffin

            Re: Why is there a choice?

            Believe it or not, one of the ARINC 429 dataloader boxes in my lab loads flight software for some of the avionics that my company supplies to Airbus and Boeing using... wait for it...

            3-1/2" 1.44 MB floppies.

            At least it uses "modern" floppies, not the 5-1/4" or 8" floppies used by some of the older (but still operational) equipment that's sitting on the next shelf.

            1. katrinab Silver badge

              Re: Why is there a choice?

              Does anyone still make them?

              I searched on Viking Direct [UK branch of Office Depot] and it says it has never heard of such a thing. They used to sell them back in the day, but presumably all the people around then have died off or retired.

          6. RRJ
            Boffin

            Re: Why is there a choice?

            Thought that was the "ANY KEY"

            1. Anonymous Coward
              Anonymous Coward

              Re: Why is there a choice?

              Hey! Where's the ANY key ?

              1. Adrian Harvey
                Coffee/keyboard

                Re: Why is there a choice?

                > Hey! Where's the ANY key ?

                It's the big blank one with no letters on it.

          7. Remy Redert

            Re: Why is there a choice?

            The technician has just found that floppy 412 is unreadable. Unfortunately that means the software will have to be completely removed and reinstalled from the back up. We've been assured that this shouldn't take more than 3 working days.

        2. werdsmith Silver badge

          Re: Why is there a choice?

          My experience was a 787 Dreamliner at Heathrow (AA) where the captain announced that some systems would not come on line and so a complete power off and on was required. We had to all disembark back to departure lounge because the captain wasn’t happy for us to be on board an aircraft with no power.

          1. AndrueC Silver badge
            Meh

            Re: Why is there a choice?

            I was on a flight out of Birmingham a couple of years ago that got held on the apron. We sat there for over an hour while engineers came and went. Eventually they closed the door and restarted the taxi. The captain announced that although they couldn't fix the fault the plane was cleared to take off.

            So..worth waiting an hour to try and fix but not bad enough to stop us flying.

            Hmmm.

          2. Anonymous Coward
            Anonymous Coward

            Re: Why is there a choice?

            >the captain wasn’t happy for us to be on board an aircraft with no power

            I wouldn't be happy to be on an aircraft with no power either. They tend to fall out of the sky that way.

            1. Spanners
              Pint

              Re: Why is there a choice?

              I wouldn't be happy to be on an aircraft with no power either. They tend to fall out of the sky that way.

              They should look back in history. I specifically remember that Chipmunks don't!

            2. Stoneshop
              Holmes

              Re: Why is there a choice?

              I wouldn't be happy to be on an aircraft with no power either. They tend to fall out of the sky that way.

              a) For them to fall out of the sky they have to be in it. Sitting on the apron or on a taxiway there's not really a big opportunity for that.

              b) Sully, the Gimli Glider crew and the BA flight 9 crew, among others, may feel the urge to disagree.

      3. Down not across

        Re: Why is there a choice?

        hmm… imagine you are waiting to board your plane when an announcement is made

        “There will be a short delay to boarding as the technician carries out some maintenance. We appreciate your understanding and hope to continue boarding as quickly as possible”

        Wait 20 minutes…

        Display shows:

        Update is 100% complete. Please don't turn off your airplane

        [some spinner on the display rotating]

        2 hours later

        The display is still showing same message.

        1. Ripper38
          Coffee/keyboard

          Re: Why is there a choice?

          Be funny if the reboot sequence showed up on say 253 multimedia screens... Be even funnier if there was a technical support engineer onboard...

      4. A.P. Veening Silver badge

        Re: Why is there a choice?

        Up until that point, cold rebooting every 100 hours should be sufficient.

        I would make that something like 120 or even 144 hours (five or six whole days), four days and four hours seems inconvenient.

    3. SkippyBing

      Re: Why is there a choice?

      Generally depending on the classification of the problem the authorities grant some leeway in applying the AD. E.g. within next 28 days which gives the airlines some flex to incorporate it into the next scheduled down time. In this case they probably believe that the interim measure is safe enough that it doesn't require the AD to be applied immediately.

    4. Electronics'R'Us
      Holmes

      Re: Why is there a choice?

      The update (patch) would not be applied by either the airline or Airbus; the kit itself will have been designed and certified by a 3rd party avionics house who will also have done the low level software (which would implement the actual communication links) which is what seems to be the issue here.

      To update, the equipment would need to be replaced and the older units sent back to the manufacturer for software load and testing.

      It takes time to replace units so this would be done at the next scheduled maintenance point.

    5. veti Silver badge

      Re: Why is there a choice?

      You know how much disruption it causes, to hundreds of thousands of people, when a whole fleet of planes is grounded? Even briefly (and we don't know how brief it would be)?

      That's an order that only goes out when they find something really dangerous. This hazard is easy to manage, once you know about it. Indeed, if it's been in service for two years without anyone noticing, that suggests it's pretty easy to manage even if you don't know about it

    6. greatpix

      Re: Why is there a choice?

      If the hard reset works that should be preferable to a software fix. Why? Because, as anyone involved in the development cycle knows, fix one thing and that patch is liable to break something else that only shows up once the user has hands on it. It is well understood that no matter how many QA tests are run, in the field the user will ALWAYS execute some action that the testers, in their wildest imagination, would never consider happening.

  6. mj.jam

    What is overflowing?

    Any ideas about what is overflowing? 149 hours of seconds doesn't seem to be that obvious a limit, but I guess they probably have rounded down a little to stop planes falling out of the sky.

    I've seen issues similar to the Boeing one turn up in less critical places. Found by customers, since in internal testing no system was left up for long enough.

    1. Dave Pickles

      Re: What is overflowing?

      149 hours is near-enough 2^29 milliseconds.
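As a quick check of that figure, here is a small Python sketch that converts the range of an n-bit counter into hours. The function name `rollover_hours` is made up for illustration, and the millisecond tick is the thread's guess rather than a confirmed detail of the A350 software.

```python
def rollover_hours(bits: int, tick_seconds: float = 0.001) -> float:
    """Hours until an unsigned counter of `bits` bits wraps, given the tick length."""
    return (2 ** bits) * tick_seconds / 3600.0


if __name__ == "__main__":
    # Compare neighbouring widths to see why 29 bits stands out
    for bits in (28, 29, 30):
        print(f"{bits} bits of milliseconds -> {rollover_hours(bits):.2f} hours")
```

With a 1 ms tick, 29 bits comes out at a little over 149 hours, while 28 and 30 bits land at roughly half and double that.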

      1. mj.jam

        Re: What is overflowing?

        That makes it sound like they were trying to allocate individual header bits to different fields. So 28 bits only would give them 74 hours, but that wasn't enough, and 30 bits gave 300 which would never be needed, so they chose 29 bits.

        I guess they then didn't write any test cases for overflow. I can imagine the problem is that they haven't wrapped the comparison operation correctly. So the newest data ends up looking very old.
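The comparison problem guessed at above can be illustrated with a hypothetical 29-bit millisecond counter. In this sketch `naive_is_newer` compares raw values and goes wrong across a wrap, while `wrapped_is_newer` uses modular (serial-number style) arithmetic. The 29-bit width and both function names are assumptions for illustration, not details from the AD.

```python
BITS = 29                 # assumed counter width, for illustration only
MODULUS = 1 << BITS
HALF = MODULUS // 2


def naive_is_newer(a: int, b: int) -> bool:
    """Raw comparison: breaks as soon as the counter wraps to zero."""
    return a > b


def wrapped_is_newer(a: int, b: int) -> bool:
    """True if timestamp `a` is newer than `b`, allowing for one wrap.

    Only valid while the two timestamps are less than half the
    counter range apart.
    """
    diff = (a - b) % MODULUS
    return 0 < diff < HALF


if __name__ == "__main__":
    before_wrap = MODULUS - 5   # 5 ms before rollover
    after_wrap = 10             # 10 ms after rollover
    print(naive_is_newer(after_wrap, before_wrap))    # newest data looks old
    print(wrapped_is_newer(after_wrap, before_wrap))  # wrap handled
```

The modular version only works if messages are never more than half the counter range apart, which is a design assumption a real system would have to justify.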

        1. Anonymous Coward
          Anonymous Coward

          Re: What is overflowing?

          Test case ? We're doing agile software development here.

          1. Baldrickk

            Re: What is overflowing?

            Agile =\= no tests

            1. eldakka

              Re: What is overflowing?

              In theory, correct.

              But in practice, which is the only thing that matters?

              1. Baldrickk

                Re: What is overflowing?

                That your code works, which you prove with the tests.

                Still trying to get TDD to become the norm here, but at least we do write the tests. It helps that we need the test evidence to pass our milestones and actually release anything.

          2. Stoneshop
            Devil

            We're doing agile software development here.

            Which is not that different from "It compiles, ship it", only with shorter release intervals

        2. hellwig

          Re: What is overflowing?

          ARINC 429 messages are 32 bits: 8 bits for the label, 2 bits for the SSM (sign/status matrix), 2 bits for the SDI (source/destination identifier) and 1 bit for parity. That leaves 19 bits for data, which is 536,288 (2^19).

          There are 536,400 seconds in 149 hours. So they're sending time since power-up as a 19-bit value in seconds, and it overflows just before 149 hours has hit.

          1. mj.jam

            Re: What is overflowing?

            Sounds plausible, but 2^19 is 524288, so is under 146 hours.

            The 2^29 works, so AFDX must have changed that.

          2. John Smith 19 Gold badge
            Coat

            2^19 is 524288

            Which is 146 hours

            So something's a bit screwy here.
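The arithmetic in this sub-thread can be rechecked in a few lines of Python. The field widths are simply the ones listed in the comment above and are not verified against the ARINC 429 specification here; the variable names are illustrative.

```python
# Field widths as listed in the comment above (not checked against the spec here).
WORD_BITS = 32
FIELD_BITS = {"label": 8, "ssm": 2, "sdi": 2, "parity": 1}

data_bits = WORD_BITS - sum(FIELD_BITS.values())   # bits left for data
max_seconds = 2 ** data_bits                       # range of a seconds counter
hours_to_wrap = max_seconds / 3600
seconds_in_149_hours = 149 * 3600

if __name__ == "__main__":
    print(f"data bits: {data_bits}")
    print(f"2^{data_bits} = {max_seconds} seconds = {hours_to_wrap:.1f} hours")
    print(f"149 hours = {seconds_in_149_hours} seconds")
```

On these numbers a 19-bit count of seconds wraps well before 149 hours, which matches the two corrections above.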

        3. Boy Quiet

          Re: What is overflowing?

          Many years ago I was a software programmer and the ICL 1904 I was using at a client site stopped working. We were much closer to the hardware in those days.

          The engineers ran a test and I poked my nose in asking about the results. I deduced addition was faulty, and persuaded them to do an addition (on the switches !! ) they did and told me I was wrong.

          8 hours later they replaced the addition unit and all worked. When I asked what they had added, it was 1 & 1. Had I specified FFFF and 1 it would have shown the problem: the carry in bit 8 was faulty!

          Still reminds me to specify exactly what I want when testing.

      2. Robert Carnegie Silver badge

        Re: What is overflowing?

        The story says "exactly" 149 hours. If I'm following, it also says that if you've installed the software patch to fix the issue... well done but you still have to do the reboots every 149 hours.

      3. caffeine addict

        Re: What is overflowing?

        What am I missing? Why do the systems need to know the milliseconds since it was started, rather than the milliseconds past an arbitrary time? Something that could reset every time the umbilical was unplugged, 00:00 UTC went by, or the pilot got up to shag a steward(ess)?

        The black box needs precise timings. The internal indicators just need to reply in a timely fashion. No?

      4. DugEBug

        Re: What is overflowing?

        It's what happens when you flush an overloaded toilet, but that's not important right now.

        RIP Leslie Nielsen

    2. Lazlo Woodbine Silver badge

      Re: What is overflowing?

      Windows 98 used to crash after 49.7 days due to a timing chip error

      1. simonlb Silver badge

        Re: What is overflowing?

        I never trusted Windows 98 enough to leave my PC on overnight, let alone 49 and a bit days!

      2. fishman

        Re: What is overflowing?

        So Airbus is almost 3X better than Windows 98.

        1. Anonymous Coward
          Anonymous Coward

          Re: What is overflowing?

          You are mixing days and hours there. Please don't write any software for aircraft. Thank you.

          1. eldakka

            Re: What is overflowing?

            ...or interplanetary probes.

      3. Pascal Monett Silver badge

        Re: Windows 98 ?

        I have never seen a Windows 98 that managed to last a week, let alone 49 days.

        XP SP3 was much better behaved, but it still had trouble getting through a month in a single stretch.

      4. Chemist

        Re: What is overflowing?

        "Windows 98 used to crash after 49.7 days due to a timing chip error"

        Time must have run very differently for you

    3. KarMann Silver badge

      Re: What is overflowing?

      It comes to almost exactly 2^29 milliseconds, give or take 8 minutes, so I'd think that's what the limit comes to. Why not 2^31 or 2^32, I've no idea. Not enough extra bits for a time zone or anything like that. Could be used for something else mysterious to me.

      1. Sandtitz Silver badge

        Re: What is overflowing?

        2^31 would be ~600 days. Is it 100% certain that no plane ever is continuously on for almost 2 years?

        I sure hope the fix wasn't raising the time to 2^30 msec... :-D
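For scale, Python's `datetime.timedelta` can show how long a millisecond counter of a given width lasts. By this arithmetic 2^31 ms comes to roughly 25 days (about 600 hours, rather than 600 days) and 2^32 ms to about 49.7 days. The helper name `wrap_period` is hypothetical, and the millisecond tick is the thread's guess, not a confirmed detail.

```python
from datetime import timedelta


def wrap_period(bits: int) -> timedelta:
    """Time until an unsigned millisecond counter of `bits` bits rolls over."""
    return timedelta(milliseconds=2 ** bits)


if __name__ == "__main__":
    for bits in (29, 30, 31, 32):
        print(f"2^{bits} ms = {wrap_period(bits)}")
```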

    4. Anonymous Coward
      Anonymous Coward

      I wouldn't assume it is something neat like milliseconds

      It could be a counter that counts cycles on some CPU or bus somewhere to generate a unique 'event' timestamp, and if it happens to be clocked at 333.625 MHz then it would overflow a 32 bit value in exactly 149 hours (though that "exactly" is probably rounded down from 149.something)

      1. Simon Harris

        Re: I wouldn't assume it is something neat like milliseconds

        Not sure where you get 333MHz from, I get 8kHz (+ a few Hz) to roll over 32 bits in 149 hours.
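A rough way to check either figure is to ask what tick rate makes a counter of a given width wrap in 149 hours, and how long a 32-bit counter lasts at a given clock. Both function names below are illustrative, and nothing here says which clock, if any, the real system uses.

```python
def wrap_frequency_hz(bits: int, hours: float) -> float:
    """Tick rate at which an unsigned `bits`-bit counter wraps after `hours`."""
    return (2 ** bits) / (hours * 3600.0)


def seconds_to_wrap(bits: int, frequency_hz: float) -> float:
    """Seconds until an unsigned `bits`-bit counter wraps at `frequency_hz`."""
    return (2 ** bits) / frequency_hz


if __name__ == "__main__":
    print(f"32 bits over 149 h needs about {wrap_frequency_hz(32, 149):.1f} Hz")
    print(f"32 bits at 333.625 MHz wraps after {seconds_to_wrap(32, 333.625e6):.2f} s")
```

This agrees with the reply: a 32-bit counter needs a tick of about 8 kHz to last 149 hours, and at hundreds of megahertz it would wrap within seconds.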

  7. The Man Who Fell To Earth Silver badge
    WTF?

    Are the software updates free or does Airbus screw you to fix their screw up?

    "The remedy for the A350-941 problem is straightforward according to the AD: install Airbus software updates for a permanent cure, or switch the aeroplane off and on again."

    The remedy for the A350-941 problem is straightforward according to anyone with morals & a brain (& a healthy fear of litigation): install free Airbus software updates for a permanent cure.

    FIFY

    Let's hope the patch doesn't just automatically turn the plane off & on at 148 hours.

  8. Boris the Cockroach Silver badge
    Mushroom

    lucky the patch

    isn't by Microsoft

    "Caution: update needed, if you choose to delay the update it will automatically install and reboot your flight controls in 7 days , no further warning will be given"

  9. SkippyBing

    Missile Bug

    Reminds me of one of my favourite buffer over run stories. A missile was being developed, possibly AMRAAM I can't remember off-hand, and they had a problem with over runs. So in a move of genius they installed twice as much memory as would be needed in the longest possible flight, solving the problem.

    Some years later they produced an improved range variant of the missile, predictably they forgot why they'd installed so much memory in the first place...

  10. Anonymous Coward
    Anonymous Coward

    6 days ! WTF ?

    If I'm reading this correctly, 6 days with no power down = a severely crippled A350.

    It went to service in 2015, and still in 2019, there are planes with this flaw ??

    It shouldn't even have passed QA, in this state ! I remember EMC was not shipping Symmetrix (significantly cheaper than an airliner) without 3 weeks running flawlessly, one in a cold room (0 degree C), one in a 20 C room and one in a 40 C room ! Do I understand correctly that airplanes, these days, don't even come close to the level of QA from EMC, 2 decades ago ?

    I'm not sure I want to board any airliners anymore ...

    1. Phil O'Sophical Silver badge

      Re: 6 days ! WTF ?

      remember EMC was not shipping Symmetrix (significantly cheaper than an airliner) without 3 weeks running flawlessly,

      I'd expect disk drives to run flawlessly for years without a reboot, so three weeks testing isn't much. It wouldn't have found this EMC problem https://www.theregister.co.uk/2014/01/15/vnx2_reboot_issue/ for which the solution was, yes you guessed:

      1. Reboot SPA

      2. Wait 30 min

      3. Reboot SPB

      before 80 days had passed.

      How many airliners never actually get powered down completely sometime in every 6 days?

      1. Anonymous Coward
        Anonymous Coward

        Re: 6 days ! WTF ?

        The VNX is not Symmetrix, it is their midrange product. The VNX is to the Symmetrix what an ATR regional jet is to the A350.

        1. Phil O'Sophical Silver badge

          Re: 6 days ! WTF ?

          The VNX is not Symmetrix, it is their midrange product. The VNX is to the Symmetrix what an ATR regional jet is to the A350.

          I know, I have a lot of experience with Symmetrix boxes and they are great kit, but it doesn't change the principle. Three weeks untroubled testing for something that is expected to run continuously for years, whether VNX or DMX, doesn't really have any relation to the kind of testing an ATR or an A350 would have.

  11. Jason Hindle Silver badge

    Presumably, the better established airlines

    Are applying the work-around while some other airline tests the patch for them?

  12. Red Ted
    WTF?

    "CPIOM is effectively a mini computer"

    I think it's probably a fully fledged computer, unless you are implying it's a 1970's style mini-computer with core-store and Winchester disks?

    1. PTW
      Windows

      Re: "CPIOM is effectively a mini computer"

      I read it as a 70s style, albeit a fraction smaller. Does this mean I'm officially old?

      1. Doctor Syntax Silver badge

        Re: "CPIOM is effectively a mini computer"

        "Does this mean I'm officially old?"

        Welcome to the club.

    2. Anonymous Coward
      Devil

      Re: "CPIOM is effectively a mini computer"

      The huge Winchester disks serve a dual purpose as a gyro to help stabilize flight.

    3. Simon Harris

      Re: "CPIOM is effectively a mini computer"

      Maybe the Airbus hardware division is still working its way through a pile of Intersil 6100s (PDP8 on a chip) they bought in the 1970s.

      1. Anonymous Coward
        Anonymous Coward

        Re: "CPIOM is effectively a mini computer"

        "a pile of Intersil 6100s (PDP8 on a chip) they bought in the 1970s."

        Back in the day, sensible system builders used to like to be able to source their chips from more than one chip shop.

        For the 6100 family, there was (as Simon mentioned already), the Intersil version.

        And here we are, some years later, and Simon's post doesn't mention that the 2nd source for this particular chip was a company called Harris..

        https://www.hb9aik.ch/computer/6120history.htm (and elsewhere).

        Small world, innit :)

  13. Alistair
    Windows

    Good Morning, Welcome aboard, may I see your Boarding Pass?

    Not till you tell me when this plane was last rebooted you can't!

  14. chairman_of_the_bored
    Facepalm

    Helpful Technical Support

    "Have you installed the latest patches?"

  15. elgarak1

    Why is this a 'news' story?

    1) As mentioned, 149 hours is more than 6 days. Since there are no possible flights that long (even with all the possible delays factored in), it should be easy to work in a turn off/on cycle, albeit with some more ground time and financial loss.

    2) As mentioned, the problem has been identified and fixed, but we are currently in the period where operators get some leeway to work the update into their maintenance schedules.

    In short, there's nothing to see here, move along, folks. The only reason it gets newsworthy is to say "Boeing's baaaaaad, but look here, Airbus isn't that much better, either!", even though the criticalness of Boeing's fault massively overshadows the mentioned fault of Airbus.

    1. defiler

      Since there are no possible flights that long (even with all the possible delays factored in)

      Tell that to Zarniwoop, waiting for lemon-soaked paper napkins...

    2. Pascal Monett Silver badge

      Re: "albeit with some more ground time and financial loss"

      Financial loss ? Well you're obviously not in charge of an airline, that's for sure.

      Airlines are already running close to red, they really can't afford to just go around losing more money.

      Honestly, given how difficult it apparently is to operate an airline, I'm surprised they don't just give up and quit. There must be more money in it than I think.

      1. Mark 85

        Re: "albeit with some more ground time and financial loss"

        There must be more money in it than I think.

        And the board looks at their bonuses and nods approvingly.

    3. Anonymous Coward
      Anonymous Coward

      The fact Boeing's fault is much worse doesn't make this fault a non issue as you claim. Whataboutism isn't a defense when it comes to aircraft.

    4. eldakka
      Holmes

      Why is this a 'news' story?

      Because of the Boeing incidents, issues with aircraft are high up in current public perception, and stories that 12 months ago wouldn't have rated even a footnote are of interest to the general public.

      It's like when there's an explosion at a factory. Generally people don't give 2 figs about how their toothbrush is made. But when some disaster, either just spectacular or that results in major tragedy, strikes, people become interested - curious - about how their toothbrush is made, what trials and tribulations surrounded the invention of it, and so on.

      Therefore what happens around aircraft manufacturing, science, maintenance, piloting stories, and so on, become of interest.

      Any moron knows this.

      1. diodesign (Written by Reg staff) Silver badge

        Re: Why is this a 'news' story?

        Also we've written about aviation software faults for years (see Register passim) because readers love hearing about engineering problems - and we're OK with this. Bugs and weird shit fascinate us.

        C.

    5. Annihilator

      Yep, to be fair, known bugs with workarounds are sometimes better than introducing a fix and all the risk that comes with it.

      Besides, not knowing enough about the usage of aircraft, but I'm pretty sure they don't leave them on when not in use, and aren't flying non-stop - even the ones RyanAir hammer the hell out of.

  16. Proud Father

    Common Remote Data Concentrator (CRDC)

    Sounds similar to the CANBUS system used in modern cars ;)

    1. TRT

      Re: Common Remote Data Concentrator (CRDC)

      One wonders if the plane has a buzzer that sounds when you open the pilot's door without removing the ignition key?

      1. Ryan 7

        Re: Common Remote Data Concentrator (CRDC)

        Not sure, let me cheeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

  17. Anonymous Coward
    Anonymous Coward

    Ah the joys of agile development

    I see it's now taking a hold in the aviation industry as well.

    1. Anonymous Coward
      Anonymous Coward

      Re: Ah the joys of agile development

      Add to this the perspective of a new generation of coders raised with the belief that pushing a princess on the screen and programming CPIOM of a modern airplane are pretty much the same stuff. After all coding is so easy anyone can and should do it.

  18. SVV

    Could be worse.....

    Imagine you're halfway across the Atlantic and the in flight movie cuts out to be replaced by "Windows 10 update installing......".

    1. Doctor Syntax Silver badge

      Re: Could be worse.....

      Could be worse. Windows 10 updating cuts out to show an in flight movie.

    2. Fading
      WTF?

      Re: Could be worse.....

      Reminds me of a flight I took back from Baltimore to LHR. The cabin plunged into darkness then the low level lights came back on. A brief moment later the Captain informed us that there was a problem with the cabin electrics so there would be no hot food or inflight entertainment. The options were turn around and land in New York or "just go for it"......

      TBH I was glad the wine was plentiful on that flight across the Atlantic...

      1. TRT

        Re: Could be worse.....

        Could be worse. BSOD.

        1. Anonymous Coward
          Anonymous Coward

          Re: Could be worse.....

          Even worse:

          Red box with writing saying Guru Meditation.

    3. Anonymous Coward
      Anonymous Coward

      Re: Could be worse.....

      Not the inflight movie. Just imagine pilots see this on their displays instead of altitude, speed, heading and other. Now that's the real fun.

    4. veti Silver badge

      Re: Could be worse.....

      Happens all the time. The entertainment system is not considered critical, so frankly I'm surprised when it lasts a whole flight.

      1. Sgt_Oddball
        Linux

        Re: Could be worse.....

        The number of times I got to see Tux on various KLM flights (and one Delta flight) boggles the mind.

        That said, I've been known to have the gift of crashing computers of any kind.

        Probably best I stay out of the cockpit.

  19. Anonymous Coward
    Anonymous Coward

    Echoes of the Patriot missiles in Gulf War 1.0 ..

    Didn't they stop working after 100 hours ...

    1. jigr1969

      Re: Echoes of the Patriot missiles in Gulf War 1.0 ..

      Nope, the Patriot missile system had a software error which meant that for every hour it was left running, it would become less and less accurate. Hence why the SCUD hit the American base despite the sending up of Patriot missiles. It hadn't been rebooted for far too long and the error meant that the missiles completely missed the incoming SCUD (cannot remember if they detonated far too short or too long).

      1. Anonymous Coward
        Anonymous Coward

        Re: Echoes of the Patriot missiles in Gulf War 1.0 ..

        Hard to see how such an error occurs:

        aim_vector = aim_vector + hours_since_last_reboot

        Sounds like code written by a defense contractor wanting to ensure they could require 24x7 onsite staff to help maintain it. Oh wait...

        1. Richard 12 Silver badge
          Flame

          Re: Echoes of the Patriot missiles in Gulf War 1.0 ..

          Inertial measurement is hard.

          You tell it exactly where it is to start with, and then it tries to keep track.

          Over time, noise adds up, rounding errors compound and/or the Earth rotates and moves around the Sun.

          Mishandle any of those, and you get significant drift over time. Enough to trigger an emergency escape system at a launch hold...

        2. rszasz

          Re: Echoes of the Patriot missiles in Gulf War 1.0 ..

          To keep tracking an object with radar, you set a "gate" and try and acquire the target again in the next period within the updated gate. The underlying problem was that different bits of the software calculated the time slightly differently leading to the gate being about .3 of a second out of sync after 100 hours. (counting upward with fixed point or floating point numbers is not as simple as people expect)
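The drift figure quoted above can be reproduced with a little arithmetic. This is a rough sketch, not the actual Patriot code: it assumes the commonly cited reconstruction in which the system clock counted tenths of a second and multiplied by a value of 0.1 chopped to 23 binary places. The names `FRACTION_BITS`, `chopped_tenth` and `clock_drift_seconds` are made up for illustration.

```python
from fractions import Fraction

# Assumption: 0.1 was stored chopped to 23 bits after the binary point.
FRACTION_BITS = 23
# The clock is assumed to tick in tenths of a second.
TICKS_PER_HOUR = 36_000


def chopped_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 0.1 truncated (not rounded) to `bits` binary places."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours: int, bits: int = FRACTION_BITS) -> float:
    """Accumulated clock error after `hours` of continuous uptime."""
    per_tick_error = Fraction(1, 10) - chopped_tenth(bits)
    return float(per_tick_error * TICKS_PER_HOUR * hours)


if __name__ == "__main__":
    for hours in (8, 24, 100):
        print(f"{hours:>3} h uptime -> {clock_drift_seconds(hours):.4f} s drift")
```

Under those assumptions the error per tick is tiny (under a ten-millionth of a second), but it only ever accumulates, which is why a reboot "fixes" it.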

  20. Paul Hovnanian Silver badge

    Boeing 787

    ... was once every 248 days.

    1. Anonymous Coward
      Anonymous Coward

      Re: Boeing 787

      So it was 40x better!
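For what it's worth, the "40x" checks out, and the 248 days is suspiciously close to a familiar number. A quick sketch, assuming (as is commonly suggested, not stated in this thread) that the 787 figure comes from a signed 32-bit counter of hundredths of a second wrapping around. The thread gives no cause for the A350's 149 hours, so the sketch does not try to explain that one. Function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def days_until_overflow(bits: int = 32, ticks_per_second: int = 100) -> float:
    """Days before a signed counter of the given width wraps.

    Assumption: the counter starts at zero and counts up at a fixed rate.
    """
    max_ticks = 2 ** (bits - 1)
    return max_ticks / ticks_per_second / SECONDS_PER_DAY


def interval_ratio(long_days: float, short_hours: float) -> float:
    """How many times longer the first reboot interval is than the second."""
    return long_days * 24 / short_hours


if __name__ == "__main__":
    print(f"signed 32-bit centisecond counter wraps after {days_until_overflow():.2f} days")
    print(f"248 days vs 149 hours: {interval_ratio(248, 149):.1f}x")
```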

  21. kain preacher

    So you are telling me this airplane has less uptime than a Windows 95 machine?

    1. Anonymous Coward
      Anonymous Coward

      You had a Windows 95 PC that lasted for 149 hours? They could hardly last that long if you let them sit, let alone if you used them for anything!

      1. kain preacher

        OK, Windows 98

        1. Anonymous Coward
          Anonymous Coward

          OK Windows 98 could get 149 hours of uptime, if you ran it for 48 hours before it crashed and then left it on the blue screen the rest of the time.

          1. kain preacher

            I've gotten 5 days of uptime on Windows 98 SE before

  22. Blackjack Silver badge

    Guys, cost cutting is killing people!

    Why haven't they patched? Because it costs money and time.

    And the regulators are wusses and wimps. Why aren't they raining fines on these people, who care more about saving money than about whether their saving money kills people?

    Maybe they are cousins of a certain guy in the US?

    1. This post has been deleted by its author

  23. Anonymous South African Coward Silver badge

    Just bloody fantastic.

    What's next? A kick in the crotch every 5 min to ensure reliable operation?

    Furrfu.

  24. dmacleo

    actually easy to implement, although they should not have to be put in that position.

    a/c can time out on certain items while in flight (meaning next flight puts it over hours, good example is batteries timing out) so upon landing a/c is grounded and no tdmi/mel issued until addressed.

    so flight and mtx planning would stop a/c at a mtx base before that time came close and reboot it.

    meanwhile engineers should be scrambling to fix this or be getting fired.
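The planning side of that can be sketched in a few lines. This is only an illustration of the idea, assuming each planned leg is given as powered-on hours (flight plus turnaround) and that the planner wants some reserve before the 149-hour limit. The function name `legs_before_reboot` and the 6-hour reserve are invented for the example and have nothing to do with real maintenance planning tools.

```python
from typing import Sequence


def legs_before_reboot(
    hours_since_reboot: float,
    planned_leg_hours: Sequence[float],
    limit_hours: float = 149.0,   # interval from the directive in the article
    reserve_hours: float = 6.0,   # hypothetical margin for delays and holding
) -> int:
    """Return how many planned legs can be flown before a reboot is due."""
    elapsed = hours_since_reboot
    flown = 0
    for leg in planned_leg_hours:
        # Stop before the first leg that could end too close to the limit.
        if elapsed + leg + reserve_hours > limit_hours:
            break
        elapsed += leg
        flown += 1
    return flown


if __name__ == "__main__":
    legs = [12.0, 12.0, 12.0, 12.0]
    print(legs_before_reboot(100.0, legs), "legs can be flown before the power cycle")
```

Whatever number comes back, the aircraft would need to be at a base where the power cycle can be done once those legs are flown.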

  25. bilston
    Joke

    Apple Air

    If this was an Apple aircraft it would be called an optional reboot feature reminder, and require a trip to the Apple store.

  26. Anonymous Coward
    Anonymous Coward

    Apple Air

    If this was an Apple aircraft it would be called an optional reboot feature reminder,

    1. Halfmad

      Re: Apple Air

      If it was an Apple aircraft it'd be taxiing around the walled garden.

  27. This post has been deleted by its author

    1. TRT

      Re: Restart Before You Depart

      Strategy?

  28. 89724102172714182892114I7551670349743096734346773478647892349863592355648544996312855148587659264921

    ...they should divulge the make, model, software version and inhume issues of any plane before we book flights

  29. guyr

    Turn off and back on - patented process

    I think Microsoft patented the process of turning the machine off and back on again to fix any random problem. Some serious fees are going to be paid for this.

  30. Criggie
    IT Angle

    Serious question - how often would an airplane get completely powered down before this issue was detected?

    Are plane engineers like sysadmins who have uptime wanks about the longest uptime they've had, and then explain why it ended ?

  31. Anonymous Coward
    Anonymous Coward

    Fly-not-by-wire

    I prefer to fly long haul in a 747 precisely because the pilot's yoke is connected to the control surfaces by cables not of the electrical or fibre varieties.

    That and the fact that you can roll a 747 over at 40,000', go into a high speed dive for 30,000', pull nearly 5g in the recovery and land without further incident, bent wings, missing tail surfaces and permanently lowered main undercarriage notwithstanding. (https://en.wikipedia.org/wiki/China_Airlines_Flight_006)

    1. Nathan 13

      Re: Fly-not-by-wire

      The people on that plane were 1 in 100 lucky. It really should have ended up in the sea, a miracle actually!!!

  32. Securitymoose

    I wonder if the software development was outsourced...

    ...to Musoketeba perhaps. (see 'Into The Fourth Universe' - https://www.smashwords.com/books/view/929784)

  33. sYncRo

    I believe they are changing model numbers so customers are unaware that they are on a known affected plane.

  34. Spanners
    Boffin

    Just shut it down when you park it.

    No aircraft journey is going to last 6 days and 5 hours. That is between 3 and 4 times round the world.

    If the time starts counting when the crew are starting up for their next flight, they will have plenty of time even with the best of British delays - bureaucracy management anti-union actions, or just the wrong type of rain. Then they can wait for a couple of days at the terminal, another one at the threshold before taking off. I don't know the maximum length of time one of these things can stay up but they can't be in-flight refueled so they can reboot when they get back down again.

  35. toffer99

    I'm going on a walking holiday.

    So many faults on planes. Boeing has another problem: "Pilots reveal safety fears over Boeing’s fleet of Dreamliners. Company admits that fire extinguisher switch has failed a ‘small number’ of times" https://www.theguardian.com/business/2019/jun/15/boeing-dreamliner-b787-safety-fears.

  36. 2Fat2Bald

    149 hours is around 6 days. So I don't know how serious this actually is, as it seems pretty unlikely an aircraft would remain sat there for 6 days, powered on and not being used. Unless the aircraft is put in some kind of "sleep" mode, or something like that, and the time is still counting up.

    Even so. Doesn't sound a hard one to fix with a routine that refuses to start the engines if the aircraft is on the ground and has over 120 hours since the last reboot.
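The routine described above is roughly this. It is a toy sketch only: `AircraftState` and `may_start_engines` are hypothetical names, and real avionics would not be written this way. The 120-hour threshold is expressed as a margin below the 149-hour limit.

```python
from dataclasses import dataclass

REBOOT_LIMIT_HOURS = 149.0    # interval from the directive in the article
DEFAULT_MARGIN_HOURS = 29.0   # hypothetical: yields the 120-hour threshold


@dataclass
class AircraftState:
    on_ground: bool
    hours_since_reboot: float


def may_start_engines(
    state: AircraftState, margin_hours: float = DEFAULT_MARGIN_HOURS
) -> bool:
    """Refuse engine start on the ground once uptime exceeds the threshold."""
    if not state.on_ground:
        # Never block anything in flight; the check only applies on the ground.
        return True
    return state.hours_since_reboot <= REBOOT_LIMIT_HOURS - margin_hours


if __name__ == "__main__":
    print(may_start_engines(AircraftState(on_ground=True, hours_since_reboot=121.0)))
```

The important design choice is the `on_ground` guard: the interlock must never be able to interfere once the aircraft is airborne.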

  37. Sameer

    First question of any support call, Did you try turning it off and back on again?

    Ha, Does that work on final approach?

  38. Anonymous Coward
    Anonymous Coward

    Similar to the Ariane failure?

    Math error....result $500 million loss. Don't you just love how SIMPLE software failures can be....

    *

    http://www-users.math.umn.edu/~arnold/disasters/ariane.html
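The "math error" there was a conversion of a 64-bit floating point value into a 16-bit signed integer that could not hold it. A minimal sketch of the difference between an unchecked assumption and an explicit range check, in Python rather than the Ada the flight software was written in. Both helper names are invented for illustration.

```python
INT16_MIN, INT16_MAX = -32_768, 32_767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if it does not fit."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a signed 16-bit integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    print(to_int16_checked(1234.9))
    print(to_int16_saturating(40_000.0))
    try:
        to_int16_checked(40_000.0)
    except OverflowError as exc:
        print("rejected:", exc)
```

Neither option is automatically "right"; the point is that someone has to decide what happens when the value is out of range, rather than assuming it never will be.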

  39. Bob.

    We hardware engineers are always forgotten. Less money, less kudos and stature but we don't feck around.

    Working on Dealing Room Systems (with our custom designed PCBs), many years ago, we had one board that would intermittently and infrequently crash.

    Share prices/Currency/Commodity info would freeze on one of the Dealer's screens.

    This peeved them somewhat ($ millions trades at stake).

    Generally a dealer would have 4 screens and some hundreds of dealers per room. This board was used on each screen.

    So 1200 boards per Room, for a 300 Dealer Room.

    There was a Hard Reset switch on the board, but it required the sysadmin or on-site engineer to wander into the Machine Room (after an irate call from the Dealer) and find Cabinet x, Rack y and Board slot z. Off/On, Fixed. But that took too long.

    Our software engineers spent weeks trying to track down the problem and gave up.

    When they came to us, we found quickly and easily that we could put a simple hardware Watchdog Timer on the board.

    If it wasn't reset every 5 seconds, the board was rebooted.

    It worked well and no further complaints.

    Obviously for planes, the logic might be a bit more complicated.

    If not reset for 100'something hours and stationary on the ground then reboot.
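For the software folk, the watchdog idea described above looks something like this when modelled in code. It is a sketch of the principle only, since the original was a hardware timer on the board. The `Watchdog` class, its method names and the injectable `clock` are all assumptions made for the example.

```python
import time
from typing import Callable


class Watchdog:
    """Software model of a watchdog: reboot unless kicked within the timeout."""

    def __init__(
        self,
        timeout_s: float,
        on_expire: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._on_expire = on_expire
        self._clock = clock
        self._last_kick = clock()

    def kick(self) -> None:
        """Called by healthy software to say 'still alive'."""
        self._last_kick = self._clock()

    def poll(self) -> bool:
        """Fire on_expire if the timeout has elapsed; return True if it fired."""
        if self._clock() - self._last_kick >= self._timeout_s:
            self._on_expire()
            # Restart the countdown once the reboot has been triggered.
            self._last_kick = self._clock()
            return True
        return False


if __name__ == "__main__":
    wd = Watchdog(5.0, lambda: print("rebooting board"))
    wd.kick()
    print("expired:", wd.poll())
```

The appeal of the hardware version is that it does not care why the software stopped kicking it, which is exactly why it worked when weeks of debugging had not.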

  40. John Smith 19 Gold badge
    Unhappy

    The choice is a bit more complex than it looks.

    Option 1. Well understood (but annoying) procedure that must be run on a regular basis

    Option 2. Single uploading of software patch.

    But.

    Does the hardware architecture support uploading and verification (packet corruption being sent through network to end box)?

    If not it's a box removal exercise or a direct connection to a box deep in the bowels of the aircraft

    How well has the patch been tested?

    Has it added some new failure mode?

    IOW from the airline's PoV the risk assessment is not quite as straightforward as it seems.

    Of course if we assume that all software patches are perfect and have no unintended side effects then the course of action is obvious.

    Anyone here who's written software believe that assumption?

  41. kraftdinner

    Built by M$

    Now that's a 3 finger salute!
