Boeing are not having a lot of luck with software these days.
This isn't Boeing very well... Faulty timer knackers Starliner cargo capsule on its way to International Space Station
Boeing’s first attempt to get its Starliner capsule to dock with the International Space Station has failed due to a software blunder. The unit was carrying cargo to the orbiting science lab though Boeing hopes to use it to send people into the obsidian void at some point. This particular Starliner mission took off on Friday …
COMMENTS
-
Saturday 21st December 2019 00:49 GMT DCFusor
Both MCAS and this were depending on a single input...bad design with no redundancy and no effective override.
What I hear is that the clock - which was a dumb idea vs something that actually sensed status anyway - simply wasn't set right while on the ground. Versus anything actually malfunctioning post-launch, other than maybe the ground guys failing to keep up with the status of things in time to prevent a problem - or even noticing at the time that fuel was being wasted.
But oh boy, that really poor Boeing / NASA YouTube stream cut audio and kept it muted for quite a while when they noticed they had an issue.
SpaceX has spoiled us with realtime video and a little telemetry post launch, and willingness to acknowledge failures and even make a Pythonesque clip of RUDs. Makes the old skool spin from the old military late and over budget contractors really stand out in sharp relief - more like the Vogons than Starfleet.
-
Saturday 21st December 2019 22:13 GMT Terry 6
Re: In this particular case those venn diagrams you mention do not overlap
Oh dear.
Here's the definition for "irony".
noun, plural i·ro·nies.
the use of words to convey a meaning that is the opposite of its literal meaning:
Here's another (Cambridge)
a situation in which something which was intended to have a particular result has the opposite or a very different result:
I've already quoted one definition of sarcastic.
Here's another;
Merriam-Webster
Definition of sarcasm
1 : a sharp and often satirical or ironic utterance designed to cut or give pain
2a : a mode of satirical wit depending for its effect on bitter, caustic, and often ironic language that is usually directed against an individual
b : the use or language of sarcasm
-
Sunday 22nd December 2019 10:19 GMT Justthefacts
Re: Elon Musk was on hand to offer advice.
In this case Pedo Musk was correctly being *ironic* in the British sense. Because:
#1: SpaceX has launch failed quite often, *which is SpaceX’s way of doing engineering*
#2: For Boeing, fixing and trying again is going to cost hundreds of millions, and take months if not a year. And specifically that cost is going to be paid by NASA, not Boeing, because they are contracted on cost-plus
#3: Whereas SpaceX do this repeatedly, and learn from their experience. It would cost SpaceX less than a million, zero to their customer, and be ready to launch again in under a week. Remember, nothing exploded, just a bit of wasted fuel, it would basically be a go-round for them. The ISS folks would still have got their Christmas presents.
It’s *ironic*, because watching Boeing fix the failure slowly and expensively highlights the contrast in the whole way they do engineering.
It’s also ironic, because he assumes Bridenstine sees exactly this, every time Boeing burns ten billion dollars to do a hundred million dollar job, as they have been doing for sixty years.
Whereas actually Bridenstine just sees “business as usual....Boeing will fix this, because it costs billions to do rocket science, and the USA has billions to spend, go-USA-go”.
And it’s finally ironic, because Pedo Musk is unable to see that every time he opens his mouth on Twitter, thousands of comments will reference him as Pedo Musk, just because he’s a creepy older guy unable to relate to other humans.
-
Monday 23rd December 2019 13:45 GMT asphytxtc
Re: Elon Musk was on hand to offer advice.
A few comments on this:
> #1: SpaceX has launch failed quite often
In terms of launch failures, I believe I can think of three. CRS-1 where a Merlin exploded during flight (mission still a success however). CRS-7 where the second stage ruptured due to a helium bottle strut and AMOS-6 again due to a helium bottle failure. I wouldn't personally classify that as "often" although I will agree it's above say the Atlas V. I do agree SpaceX has had quite an extensive amount of LANDING failures during their development of the Falcon 9. I wouldn't class these as "launch failures" though, the Falcon 9 is one of the world's more reliable launchers statistically at the moment.
I do agree though, SpaceX have a fail fast, fail often approach to development. And I think that's great!
> #2: For Boeing, fixing and trying again is going to cost hundreds of millions, and take months if not a year. And specifically that cost is going to be paid by NASA, not Boeing, because they are contracted on cost-plus
Except this is not a cost-plus contract; they were (both Boeing and SpaceX) fixed-price contracts. I will agree however, Boeing has certainly tried to extract more money out of NASA recently, an extra $300m I believe.
I can't find fault with #3
I really do despise Boeing's way of working though....
-
Monday 23rd December 2019 20:01 GMT Justthefacts
Re: Elon Musk was on hand to offer advice.
#2 - “Boeing bid a fixed price contract”....
We are defining “fixed price” very differently here.
In my view, Boeing has not yet met its side of the contract, because what they built didn’t actually work. If this were a fixed price, Boeing would get paid zero now, and still have to rebuild and relaunch on their own dime, in order to get paid.
As in “if I buy a car, and the engine barfs all the petrol on the floor as soon as it turns on, I don’t pay for the car until the manufacturer fixes it, even if that requires a full engine rebuild”
But that’s not the NASA reality. Boeing will get paid 90%? 100%? of the full price, as payment for failing to deliver the payload. If another launch is required by NASA to prove out the vehicle, Boeing will get paid *again*. That’s cost-plus by another name.
-
Monday 23rd December 2019 16:02 GMT JSIM
Re: Elon Musk was on hand to offer advice.
I can't criticize his personality based solely on the media reports and SM posts that do not lavish praise on his tremendous ambition, drive and achievements.
I can easily look past relatively unimportant things when it seems to me that Musk has rather quickly and successfully built a few now globally leading industries that are answering some of the most urgent needs of the future - extending our domain beyond the earth's atmosphere, and finding practical fossil-alternative ways to store and use electrical power. This path also hints at some real concern for mankind's future. Some would call that a noble quality. We could use a few dozen more like him.
With continued work and good fortune, in a few decades, I wouldn't be surprised to find Elon (or someone like him) commandeering asteroids and building artificial planets.
-
Saturday 21st December 2019 10:10 GMT Anonymous Coward
Re: designed to bring back astronauts
"It is designed to bring back astronauts, so it isn't exactly heavily loaded beyond specs."
There was a *design*? Are we sure about that?
There presumably was a contract and something resembling a set of system-level requirements, but if there was a design, then it wasn't properly verified and signed off. But there's a lot of that about these days.
These days it's not usually the "responsible" people that pick up the costs of failure (but they usually reward themselves when things go OK), and until that's fixed, there's no real motivation to improve matters (not just at Boeing).
-
Saturday 21st December 2019 05:55 GMT man_iii
Santa Express
Most definitely not using Boeing and their TOY delivery failure in future. MCAS and this "software" Timer based goto loops... Is this how engineers design rockets and spacey things?? I can understand the need for simplicity but a lack of feedback synchronization seems so strange...
-
Saturday 21st December 2019 17:07 GMT Mark 85
Re: Santa Express
Boeing and redundancy should probably never be used in the same sentence of late. Well.. maybe. MCAS as I recall was 2 systems and should have been 3. But 2 clocks could have been better than one, with the result being that if they didn't agree then a human needs to get involved.
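A rough sketch of that two-clock idea, purely illustrative: the function name, the one-second tolerance and the example figures are all assumptions, not anything known about Starliner's software. With only two sources you cannot tell which one is wrong, so a disagreement yields no usable time and gets escalated to a human.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClockCheck:
    agreed: bool
    elapsed_s: Optional[float]  # None when the two clocks disagree


def cross_check(clock_a_s: float, clock_b_s: float,
                tolerance_s: float = 1.0) -> ClockCheck:
    """Compare two independent elapsed-time sources (hypothetical helper)."""
    if abs(clock_a_s - clock_b_s) <= tolerance_s:
        # Close enough: use the mean of the two readings.
        return ClockCheck(True, (clock_a_s + clock_b_s) / 2)
    # Two sources cannot out-vote each other, so do not pick one silently.
    return ClockCheck(False, None)


result = cross_check(36 * 60, 11 * 3600 + 36 * 60)
if not result.agreed:
    print("Clock mismatch: hold automatic sequence, ask a human")
```

A third clock would allow a majority vote instead of a stop, which is the usual argument for three rather than two.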
-
Saturday 21st December 2019 07:36 GMT JassMan
Can't help thinking...
that they are going to find it hard to find volunteers for the first manned flight. They may say that everything would have gone perfectly but if all the systems are timer based rather than position based they could find the capsule leaving without them for the return journey. Or worse, that the main engines burn for re-entry while still attached because a bit of ice has held them in place.
A system isn't fully tested until it works exactly as planned with all the bug fixes already in place. How many times has fixing one bug created another? And Boeing already have a history of that.
-
Saturday 21st December 2019 10:54 GMT macjules
Re: Can't help thinking...
Yes. The trouble with Boeing is the phrase "Acceptable Losses". With an aircraft you can (unfortunately) get away with "It was the only one we have lost out of a production build of 1,000, so that is a 99.9% success rate", but with a spacecraft you just cannot have, "it failed to start the landing thrusters, but you should consider that the mission was 99.9% successful because it almost made it back".
-
Saturday 21st December 2019 22:07 GMT Version 1.0
Re: Someone skimped on testing
But this is just the way software is written these days, write the app and wait for the users to find the bugs.
It's not a problem, the app will update itself in a couple of hours. OK, so there will be a few more bugs but we'll get another update out tomorrow.
-
Saturday 21st December 2019 09:58 GMT Boris the Cockroach
Not a total
cockup, as if the thing had meatbags on board, they would have turned the switch marked "Auto/Man" to "Man" and flown the thing themselves
As for the clock cockup, that's how every spacecraft actually flies, because you cannot rely on ground control to send the commands at the right time..
Oh well, I suspect some Boeing employees are going to be handed copies of Kerbal Space Program and told that until they can get to orbit in that, they are not going to be allowed anywhere near a real spaceship again.....
-
Saturday 21st December 2019 14:52 GMT Roq D. Kasba
Re: Grown up.
Maybe whilst that case is under appeal (and it has extremely good reason to be appealed TBH, the jury potentially made judgement under the wrong variant of whatever the local defamation standard was - calling someone a child rapist is defamatory per se - in and of itself) he's learnt his lesson and is keeping schtum?
-
Saturday 21st December 2019 11:41 GMT Anonymous Coward
Hubris
The reference to irony above reminds me of the rather relevant Greek concept of hubris - pride and boastfulness that gets the attention of the gods who then engineer a fall.
Calling something that only goes to orbit "Starliner" would seem to be a classical example. Naming something "Apollo" might actually keep at least one god onside. Soyuz wouldn't attract a lot of attention.
I think "Virgin Galactic" is taking a bit of a risk too.
-
Saturday 21st December 2019 16:05 GMT John Brown (no body)
Re: Hubris
"Calling something that only goes to orbit "Starliner" would seem to be a classical example."
Yeah, you'd think they'd at least add "Capsule" to the end. It's a bit like the Wright brothers calling the Wright Flyer the AirLiner. Liner is already taken in transport terms as a mass passenger transport. Seven passengers does not a liner make. StarDinghy would be more appropriate!
-
Saturday 21st December 2019 19:55 GMT Stoneshop
Re: Hubris
Calling something that only goes to orbit "Starliner" would seem to be a classical example.
And it's nothing new for them. In 1938 they built the Stratoliner, which could neither reach the stratosphere nor carry a large number of passengers. Well, 38, which was quite a lot for that era, but still.
And the first one built crashed during a test flight, with the wings failing during a manoeuvre to recover from a spin. And in some way there's a pattern here.
-
Saturday 21st December 2019 11:48 GMT Mike 137
Robustness & resilience
"It needed that fuel to dock with the station when close by, so by burning it all so early, the mission was a failure."
Clearly there's no latitude for error. So much for the promise of civilian space flight and planetary colonisation unless we're going to accept a high probability of fatalities. Life insurance premiums will rocket (no pun intended).
-
Saturday 21st December 2019 14:10 GMT Richard 12
Re: Robustness & resilience
It seems to have burned ~25% of its fuel.
I don't know how that compares to the expected figure.
It also seems that the spacecraft mission control didn't know/pay attention to the expected zones where no comms is possible, as they realized the issue and sent commands while the craft was in LoS.
- How come *anything* was scheduled around LoS? It's not like there's a moon in the way.
That really doesn't bode well for onboard astronauts. Yes, any vaguely competent pilot would quickly realise that the craft was burning attitude fuel way faster than it should, but without ground contact would they be able to save the mission, or only themselves?
-
Saturday 21st December 2019 12:02 GMT Sandy Scott
Not actually in a stable orbit
A little technical correction - the spacecraft isn't in a stable orbit - by design. The low point of the "orbit" is 71 km, which is low enough that atmosphere will slow you down and bring you back to Earth without needing another burn. Which sounds like a very sensible safety decision.
-
Saturday 21st December 2019 19:31 GMT Anonymous South African Coward
Maybe the guy who set the clock thought the clock was set for 12hr time when it used 24hr time.
So instead of 15:00 the person responsible entered 03:00? (Or whatever the time was).
That sort of thing happened to me. I have two cars. One car uses a 12hr digital clock, and the other uses a 24hr digital clock. I had removed the battery from the car with the 24hr clock (for maintenance), and when I put the battery back, I set the time to 4:45 instead of 16:45. The next day SWAMBO asked me why the car was in a different timezone. Had a good laugh about that one :)
-
Sunday 22nd December 2019 20:39 GMT Long John Brass
All time in UTC
The number of times I've seen borkage in various global distributed systems because some dumbfuck didn't set the time zone on the bloody server to UTC like they should.
When you tell them, you can actually hear the gears in their heads shredding as they attempt to contemplate "A machine in Europe should NOT be set to US CST"
Listen up people... All default timezones on a global network should be set to UTC. If *YOU* want times in *YOUR* timezone, set your local user's TZ variable.
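In code terms the same rule is "store and compare in UTC, convert only for display". A minimal sketch, assuming a fixed-offset stand-in for US CST (no daylight saving handling) and hypothetical helper names; the timestamp is just an illustrative value.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for US CST; a real system would use a tz database.
US_CST = timezone(timedelta(hours=-6), "CST")


def record_event() -> datetime:
    """Timestamps are always created timezone-aware and in UTC."""
    return datetime.now(timezone.utc)


def for_display(stamp_utc: datetime, user_tz: timezone) -> str:
    """Convert to the user's zone at the last moment, for humans only."""
    if stamp_utc.tzinfo is None:
        # A naive timestamp carries no zone, so refuse to guess one.
        raise ValueError("naive timestamp: refusing to guess its zone")
    return stamp_utc.astimezone(user_tz).strftime("%Y-%m-%d %H:%M %Z")


stamp = datetime(2019, 12, 20, 11, 36, tzinfo=timezone.utc)
print(for_display(stamp, US_CST))  # 2019-12-20 05:36 CST
```

Rejecting naive timestamps outright is a design choice here: it turns "somebody forgot the zone" into a loud error instead of a quiet offset of several hours.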
-
Monday 23rd December 2019 04:35 GMT Ben 56
You nailed it
Officially it's been stated the clock was 11 hours out; this discounts the UTC - EST difference theory but not yours.
Perhaps it went like this:
After take off, the clock would have counted 36 minutes, i.e. 00:36 if written as a timer, but this was actually read as "12:36" when written as an AM/PM clock or a date-time type that held a timezone (which was subsequently ignored, or somebody stupidly used a toString parser). The value was taken from an onboard RTC, and the API that retrieved it used this value type (as opposed to the engineer using the epoch milliseconds calculation).
The onboard clock correction/precision/sync software was likely expecting 00:36 but, when it attempted to sync, was told "hey, my booster clock says it's been a little over 12 hours from launch, not 36 minutes", i.e. an exact time correction could not be applied because the difference was too great and the validity range was meant to be within 1 hour, or within the first hour of takeoff; thus any value 11 hours previous to the one given would have worked.
The 1 hour validity range is (IIRC) exactly how Microsoft's Windows NTP updates used to work when correcting time from a manual/default dead CMOS battery value.
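To make the speculation concrete, here is a sketch of a sync step with a validity window. Everything in it is assumed for illustration: the one-hour limit comes from the guess above, and the function name and behaviour are hypothetical rather than a description of any real flight or Windows code.

```python
from datetime import timedelta

MAX_CORRECTION = timedelta(hours=1)  # assumed validity window


def apply_sync(local_elapsed: timedelta,
               reference_elapsed: timedelta) -> timedelta:
    """Adopt the reference clock only if it is plausibly close to ours."""
    drift = abs(reference_elapsed - local_elapsed)
    if drift > MAX_CORRECTION:
        # Too far apart to be ordinary drift: reject instead of adopting.
        raise ValueError(f"refusing sync: clocks differ by {drift}")
    return reference_elapsed


expected = timedelta(minutes=36)
try:
    apply_sync(expected, timedelta(hours=11, minutes=36))
except ValueError as err:
    print(err)
```

The interesting question is what happens after the rejection: keeping the old value silently, or raising a fault that someone has to acknowledge.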
-
Thursday 2nd January 2020 09:30 GMT Brangdon
It's more likely it read the wrong clock field. Apparently the 11 hours roughly matches the difference between when the computer was switched on and when the rocket launched, so it may have taken the time elapsed from boot instead of time elapsed from launch. With two such similar elapsed-time fields it would be an easy mistake to make, and hard to spot during code review.
(Which doesn't excuse not finding it during testing.)
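One way to make that kind of mix-up harder, sketched under the assumption that the two elapsed times are kept as separate types (all names and the burn time here are hypothetical): a value measured from boot simply is not accepted where a value measured from launch is required.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinceBoot:
    seconds: float


@dataclass(frozen=True)
class SinceLaunch:
    seconds: float


def to_mission_time(now: SinceBoot, launch_mark: SinceBoot) -> SinceLaunch:
    """Derive time since launch from two readings of the boot clock."""
    if launch_mark.seconds > now.seconds:
        raise ValueError("launch mark lies in the future")
    return SinceLaunch(now.seconds - launch_mark.seconds)


def insertion_burn_due(met: SinceLaunch, burn_at_s: float) -> bool:
    """Only accepts mission elapsed time, never raw time since boot."""
    if not isinstance(met, SinceLaunch):
        raise TypeError("expected SinceLaunch, got " + type(met).__name__)
    return met.seconds >= burn_at_s


now = SinceBoot(11 * 3600 + 36 * 60)  # computer on for 11 h 36 min
launch = SinceBoot(11 * 3600)         # launch happened 11 h after boot
print(to_mission_time(now, launch))   # SinceLaunch(seconds=2160)
```

A static type checker would flag the wrong field before the code ever ran; the runtime check is a backstop for when it does not.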
-
Sunday 22nd December 2019 08:12 GMT Cynicalmark
FUBAR
Just can’t stop giggling at this. Boeing claimed to be the best. We all mostly predicted failure due to overthinking the problem - a NASA habit now - and lo, they deliver in a damp-squib-like manner.
It goes to show that Musk et al are on the right track by simply not reinventing the wheel. If it is on the shelf then use it, if not then build it to last.
Unfortunately this means NASA may just become a vessel ground for multiple private businesses to rent rather than a force of innovation.
-
Monday 23rd December 2019 16:02 GMT Milton
Relying on a single timer? Really??
"... a malfunction in the Starliner’s Mission Event Timer clock caused the control software to think the main rocket firing was already underway. The engine wasn't firing, though ..."
For literally anyone who's ever designed and written software that interacts with mechanical equipment: you always ensure that consequential actions are programmed very carefully, to guarantee they happen when they're supposed to, in the right circumstances, and in the right way. You always try to create well-guarded gates and conditions in your code, written in a way that considers what could be wrong or missing, factoring in the consequences of doing/not doing the right thing at the right moment. Even when it isn't a matter of life and death (it might be as trivial as, say, relays wearing excessively because a dumb instruction activates them too often) you try to think ahead about what could go wrong, or even go right but the wrong way. Every competent coder does this.
Now I freely admit it's easy to be the Monday morning quarterback, to be smart with hindsight, but even so ... if there's an action which should occur only if the main engine is firing, then for pity's sake wouldn't your conditional logic include querying the realtime mainRocketIsFiring parameter?
Why, I ask with tears in my eyes, wouldn't you??
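Something like this, as a sketch only: the field names, the burn window and the idea that the firing flag comes from a sensor are all assumptions for illustration, not a claim about how the real flight software is structured.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    mission_elapsed_s: float
    main_rocket_is_firing: bool  # sensed state, not inferred from the clock


def should_hold_burn_attitude(t: Telemetry,
                              burn_start_s: float,
                              burn_end_s: float) -> bool:
    """Burn-only actions need BOTH the schedule and the sensed engine state."""
    in_window = burn_start_s <= t.mission_elapsed_s <= burn_end_s
    return in_window and t.main_rocket_is_firing


# The clock claims we are mid-burn, but the engine is not actually firing.
wrong_clock = Telemetry(mission_elapsed_s=1850.0, main_rocket_is_firing=False)
print(should_hold_burn_attitude(wrong_clock, 1800.0, 1900.0))  # False
```

The timer still decides when the action is allowed; the sensed state decides whether it makes any sense.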
-
Monday 23rd December 2019 22:08 GMT Anonymous Coward
Re: Relying on a single timer? Really??
Back in the day I once wrote the code to control an electroplating plant, on a PLC. It was an interesting challenge compared to the day job and involved exactly the considerations you describe - including things like dualling the sensors that detect a frame in a plating tank, and not trying to put anything down there unless both of them are clear and you are not expecting to find anything there. Then there's making sure that a timer failure stops the current and sounds the alarm.
Before I started I had seen the result of someone working on another plant accidentally coding it to take the frame straight out of the nickel tank and into the chrome tank (hint: expensive). I was suitably concerned not to be the source of multi-digit bills from chemical, waste and recycling companies.
And that's with a human operator continually present.
Perhaps Boeing needs to employ some process plant programmers, people who have in the past had irate project managers scream at them when things go wrong.
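For anyone who hasn't done plant work, the interlocks described above boil down to something like the following. It is written in Python rather than ladder logic, and every name in it is made up for illustration.

```python
from typing import Tuple


def tank_is_clear(sensor_a_sees_frame: bool,
                  sensor_b_sees_frame: bool,
                  frame_expected: bool) -> bool:
    """Lower a frame only if both sensors AND the bookkeeping say empty."""
    return not (sensor_a_sees_frame or sensor_b_sees_frame or frame_expected)


def plating_outputs(timer_ok: bool,
                    current_requested: bool) -> Tuple[bool, bool]:
    """Return (current_on, alarm_on); a timer fault cuts current and alarms."""
    if not timer_ok:
        return (False, True)
    return (current_requested, False)


# One sensor says empty, the other sees a frame: do not lower anything.
print(tank_is_clear(False, True, False))  # False
print(plating_outputs(False, True))       # (False, True)
```

Note that disagreement between the two sensors is treated the same as "occupied": the safe answer wins.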
-