Time lords in Paris
She's bigger on the inside than the outside?
There’s a reason space missions don’t launch on the day a leap second is added to international clocks. Scientists don’t want to run the risk that the computer systems running things might hiccup on the new time and then malfunction, sending their multi-million dollar lifetime’s investment into a fatal nose dive. The rest of …
"The second the clocks ticked past midnight GMT, the bug awoke and took down Altea. 135 airlines had implemented the Altea reservation system by 2012 but the Australians were the first to go down once the clocks hit midnight."
Err, shurely everyone hits Midnight GMT at the same time?
Airline systems worldwide use GMT (aka UTC or Zulu) time as a standard reference time. Local adjustments are then applied. Not everyone does it, but almost all do; all type B messaging has a GMT timestamp and IIRC EDIFACT messages do too. My guess is that the problem arose when GMT and local were found to be out of synch, so Australia would be the first to experience this, but I don't work in *nix so I can't say for sure.
Can people please learn the difference:
GMT - Time zone, same as EST or CET
UTC - Time Standard. Yes it is the "same" as GMT, but they are NOT the same thing.
If you are going to be pedantic at least get it remotely correct. They are two distinct measurements: GMT is the "natural" time, i.e. as determined by the rotation of the Earth. UTC is governed by atomic clocks and needs periodic insertion of leap seconds to keep it roughly in sync with GMT which itself neither has nor needs leap seconds. GMT and UTC both approximate the time at Greenwich but for any given instant the time is slightly different when expressed in each system.
>No region of Australia follows GMT.
And no computer should be doing anything else. Yes, that includes you, you brain-dead MS clients.
The computer runs on GMT, your shell translates that to your local time, which is why you don't get similar failures when you switch between summer time and normal time (if you do).
And we'll have none of that "UTC" malarkey round here please! Go wash your mouth out!
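For anyone who wants the "run on GMT, translate for display" idea spelled out, here is a minimal sketch. The two fixed offsets are assumptions picked for illustration; a real system would look the rules up in the tz database instead of hard-coding them.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Hypothetical fixed offsets, for illustration only (no daylight-saving rules).
SYDNEY_WINTER = timezone(timedelta(hours=10), "AEST")
LONDON_SUMMER = timezone(timedelta(hours=1), "BST")

def render(epoch_seconds, zone):
    """Format one stored UTC-based instant for display in the given zone."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=zone)
    return moment.strftime("%Y-%m-%d %H:%M:%S %Z")

# The machine stores a single number; only the display differs per user.
stored = int(datetime(2015, 7, 1, tzinfo=UTC).timestamp())
sydney_view = render(stored, SYDNEY_WINTER)
london_view = render(stored, LONDON_SUMMER)
print(sydney_view)
print(london_view)
```

The stored value never changes when a zone switches to or from summer time, which is the point being made above.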
What happened to using NTP to slew the time rather than making sudden jumps?
I mean next to no system accounts for leap seconds. In fact it's even impossible to predict when the next leap second will be introduced.
Perhaps the best way would be to leave computers running on a continuous timescale. Something based on atomic time, using something like timezone files to correct to the "proper" wall clock time. This way you would have to update your timezone files annually, but if you don't, the time your computer displays will be off by a second while the internal time will still be consistent with what the other computers think.
This could probably be retrofitted to most computers, as they already use pre-made C library functions to convert from "Unix epoch" to local time. Few people do this with their own code, since it's quite a bit of work.
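A rough sketch of that scheme, assuming the machine keeps a continuous atomic count and a small updatable table of leap seconds. The names `TAI_ZERO`, `LEAP_TABLE` and `tai_to_utc` are made up for this example, and the table holds only the three most recent entries as of mid-2015. It does not try to label the inserted second itself as 23:59:60.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Zero point of the continuous count (an assumption for this sketch).
TAI_ZERO = datetime(1970, 1, 1, tzinfo=UTC)

# (UTC instant the offset takes effect, TAI minus UTC in seconds).
LEAP_TABLE = [
    (datetime(2009, 1, 1, tzinfo=UTC), 34),
    (datetime(2012, 7, 1, tzinfo=UTC), 35),
    (datetime(2015, 7, 1, tzinfo=UTC), 36),
]

def tai_to_utc(tai_seconds, table=LEAP_TABLE):
    """Convert the continuous count to a UTC wall-clock reading."""
    tai_reading = TAI_ZERO + timedelta(seconds=tai_seconds)
    offset = table[0][1]  # instants before the first entry are not handled
    for utc_start, tai_minus_utc in table:
        # An entry applies once the atomic reading passes its start instant.
        if tai_reading >= utc_start + timedelta(seconds=tai_minus_utc):
            offset = tai_minus_utc
    return tai_reading - timedelta(seconds=offset)

# Same internal count, converted with a current table and with a stale one.
internal = (datetime(2015, 7, 1, 12, 0, 36, tzinfo=UTC) - TAI_ZERO).total_seconds()
current = tai_to_utc(internal)
stale = tai_to_utc(internal, LEAP_TABLE[:2])
print(current, stale)
```

With the stale table the displayed time is one second out, but the internal count is untouched, so two machines with different table versions still agree on the instant.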
>2038 will be as bad as 2000, i.e. not that bad after all.
But heart-stopping nonetheless. I went out to watch the fireworks and when I got back to the house, all the lights had gone out.
The actual problem was a short-out at some roadworks due to all the rain, but it had me going for a moment.
2038 will be as bad as 2000, i.e. not that bad after all.
This foolishness again.
Y2K was "not that bad" because a shitload of remediation was done before the century rollover (or whatever the point of failure was for a given issue). There are many descriptions of specific, severe Y2K bugs that were corrected ahead of time.
One of my favorites was one from RISKS about a dialysis machine that went into self-cleaning mode when an "impossible" date was entered. That'd spoil your day right quick.
But don't let me stop you from generalizing from your anecdotal experience.
GPS already does this, the system runs without leap seconds so has slowly been going out of sync with UTC since 1980 when it started. Your receivers add the necessary fudge to correct to local time anyway so a leap second is a minor additional factor. This probably also helps when GPS timing is being used for timing on critical infrastructure as leap seconds won’t affect it.
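As a rough illustration of that "fudge", assuming a receiver that already has the full (unrolled) week number, the seconds into the week, and the broadcast GPS-minus-UTC offset. The function name and parameters are invented for the sketch; only the 6 January 1980 epoch and the 604800-second week are fixed facts.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
SECONDS_PER_WEEK = 604800

def gps_to_utc(week, seconds_of_week, gps_minus_utc):
    """Turn a continuous GPS reading into UTC using the separate offset field."""
    gps_elapsed = week * SECONDS_PER_WEEK + seconds_of_week
    return GPS_EPOCH + timedelta(seconds=gps_elapsed - gps_minus_utc)

# Just after the June 2015 leap second the offset became 17 seconds.
after_leap = gps_to_utc(1851, 259217, 17)
print(after_leap)
```

The GPS count itself never jumps; only the small offset passed in changes when a leap second is declared.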
GPS already does this, the system runs without leap seconds so has slowly been going out of sync with UTC since 1980 when it started. Your receivers add the necessary fudge to correct to local time anyway so a leap second is a minor additional factor.
Given that most systems are timed off atomic clock or GPS references, does that mean a system using NTP for system time will generally follow this fudge as well? Or, in another way, will have no problem whatsoever?
Enquiring minds want to know.
GPS broadcasts the linear atomic time and the offset as separate fields, and all internal calculations (other than UTC output) use the former. Some GPS receivers' firmware has had buggy handling of the GPS-UTC offset change, but again that ultimately comes down to not testing it. You can buy GPS simulators, so it's not like a company can't test for it, just that they did not think and/or bother to do so.
Similarly NTP broadcasts the leap second event for the day before it happens, and then tells the kernel to apply the step at the appropriate time. AFAIK the NTP daemon can get the pending leap-second info from an attached GPS used as a stratum-0 source, so it ought not to require networking to other peers to get that information.
The main bug was not in NTP itself, but in how the Linux kernel handled the application of the 1 second jump to the time_t UTC counter, as it allowed a deadlock situation to occur. A standard type of problem for any multi-threaded software, and again one that ought to have been better reviewed and tested.
I don't know the reason(s) for the Java bug, but most likely it was related to the kernel deadlock while waiting for "sleep" timers to expire.
Really, there is a bigger picture here. Systems get screwed up for all sorts of different reasons! While we debate the leap-second we also should remember faulty hardware and numerous other bugs in both the OS (and any OS) and the applications.
If you have a big critical system you really ought to have some sort of watchdog on your servers to spot the signs of kernel panic/lock-up or application faults and reboot it. While brutal, at least you would be coming back on-line in minutes rather than hours while support folks are called to investigate and find they can't SSH in, etc, so they have to debate and then use ILOMs to reboot possibly hundreds of machines.
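The timing half of such a watchdog is small. Below is a sketch only, with an invented `HeartbeatWatchdog` class; a real deployment would pair it with a hardware watchdog or an external supervisor that actually performs the reboot. It reads `time.monotonic`, which is not affected when the wall clock is stepped.

```python
import time

class HeartbeatWatchdog:
    """Flags a service as hung if it stops sending heartbeats."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_beat = clock()

    def beat(self):
        """Called by the monitored service while it is healthy."""
        self._last_beat = self._clock()

    def expired(self):
        """True once no heartbeat has arrived within the timeout."""
        return self._clock() - self._last_beat > self._timeout

# Demonstration with a fake clock instead of waiting in real time.
fake_now = [0.0]
dog = HeartbeatWatchdog(30, clock=lambda: fake_now[0])
fake_now[0] = 31.0
print(dog.expired())
```

Injecting the clock is a design choice for testability: the same class can be exercised against simulated time jumps before it is trusted in production.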
Mainframe complexes (well IBM ones anyway) use an external timer to keep them in synch (sysplex or STP). Generally speaking, they run a master (TOD - time of day) clock that all other clocks are 'built' from, so there's only one clock authority. Is this enough? Well, we'll see!
Actually I believe I heard somewhere that the difference in gravity at the satellites' altitude is more than enough to compensate for the relativistic effects, and the correction is in the opposite direction to what the speed of the satellites alone would suggest... [googles]...
Ah yes, here: section 2 shows the speed-based time dilation is actually outweighed about 5:1 by the gravitational effect.
Fascinating in either case :)
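A back-of-envelope check of the two effects, using rounded textbook constants and a circular orbit of about 26,560 km radius. Those figures are my assumptions, and the sketch ignores the Earth's rotation and orbital eccentricity, so treat the output as ballpark only.

```python
import math

C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m
R_ORBIT = 2.656e7        # approximate GPS orbital radius, m (assumed)
SECONDS_PER_DAY = 86400

# Special relativity: the moving satellite clock runs slow.
orbital_speed = math.sqrt(GM / R_ORBIT)
velocity_rate = -orbital_speed**2 / (2 * C**2)

# General relativity: the clock higher in the gravity well runs fast.
gravity_rate = GM / C**2 * (1 / R_EARTH - 1 / R_ORBIT)

velocity_us_per_day = velocity_rate * SECONDS_PER_DAY * 1e6
gravity_us_per_day = gravity_rate * SECONDS_PER_DAY * 1e6
net_us_per_day = velocity_us_per_day + gravity_us_per_day

print(round(velocity_us_per_day, 1), round(gravity_us_per_day, 1), round(net_us_per_day, 1))
```

With these round numbers the gravitational term comes out around six times the velocity term, the same ballpark as the ratio quoted above, and the net effect is that the satellite clock gains tens of microseconds a day.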
A lot of space systems already use variations on "ephemeris time" that has a linear atomic basis and a variable offset to get UTC, etc. That is not a new idea, and as pointed out exactly the same approach is used by the GPS satellites.
The problem is NOT the introduction of leap seconds, it is the simple fact that they don't test systems properly to deal with this known attribute of time keeping.
Instead of trying to get rid of leap seconds, perhaps they should always add/remove one each alternate month with the occasional add two months in a row?
That way people would be forced to test for this and not cry every 1-2 years when untested/unpatched code throws a wobbly.
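Even a trivial test shows how uneven the support is. The sketch below, in Python purely as an example, checks whether two standard library routes can represent the timestamp 23:59:60 at all; the helper name is made up.

```python
import time
from datetime import datetime

LEAP_STAMP = "2015-06-30 23:59:60"
FORMAT = "%Y-%m-%d %H:%M:%S"

# The C-style parser tolerates a seconds field of 60.
parsed = time.strptime(LEAP_STAMP, FORMAT)

def datetime_accepts_leap_second():
    """Report whether datetime can hold a seconds value of 60."""
    try:
        datetime(2015, 6, 30, 23, 59, 60)
    except ValueError:
        return False
    return True

print(parsed.tm_sec, datetime_accepts_leap_second())
```

One layer accepts the leap second and the next rejects it, which is exactly the kind of mismatch that only shows up if somebody bothers to feed the value through.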
Written in the style of a hysterical 12 year old and with the level of understanding of a sleepy puppy, the Independent has it covered:
"But last time it happened, in 2012, it took down much of the internet. Reddit, Foursquare, Yelp and LinkedIn all reported problems, and so did the Linux operating system and programmes (sic) using Java."
I love it that the reporter doesn't bother to find out when in 2012 this happened, and that Reddit, Foursquare, Yelp and LinkedIn are deemed 'much of the internet'. I assume Twitter is the 'rest of it'.
I'm sure we used to have educated journalists once, or am I mistaken?
If Qantas don't use all caps, why should we? I think they're trying to move on from being a regional airline....
"Founded in the Queensland outback in 1920, Qantas has grown to be Australia's largest domestic and international airline. Registered originally as the Queensland and Northern Territory Aerial Services Limited (QANTAS), Qantas is widely regarded as the world's leading long distance airline and one of the strongest brands in Australia."
Not that I'm a pedant.
That might help, but by 23:00 the system may have already received some kind of notification of the impending leap second so it still might try to do something 'clever' either at midnight or whenever you turn ntpd back on.
(I can't seem to find out when the leap second gets announced through NTP. One web page says "months" in advance, one mentions the previous day, one says an hour beforehand, ... If it is an hour then you should presumably shut down ntpd at 22:55 and turn it back on at 00:05 if you want your system to remain blissfully ignorant.)
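Whatever the lead time turns out to be, the warning itself travels in the two-bit Leap Indicator at the top of the first byte of every NTP packet (RFC 5905). A minimal decoder, with an invented function name, shows what a client would be looking at; it does not say how early any particular server sets the bits.

```python
LEAP_INDICATOR = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock unsynchronized)",
}

def decode_first_byte(first_byte):
    """Split the first NTP header byte into (leap, version, mode)."""
    leap = (first_byte >> 6) & 0b11
    version = (first_byte >> 3) & 0b111
    mode = first_byte & 0b111
    return leap, version, mode

# 0x64 is 01 100 100 in binary: leap pending, version 4, server mode.
leap, version, mode = decode_first_byte(0x64)
print(LEAP_INDICATOR[leap], version, mode)
```

Capturing a few server replies and decoding this byte in the days before a leap second would answer the timing question empirically for your own upstream servers.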
Better fix - just use the working code.
It was working properly in Linux, and then a patch was applied that broke it. No one noticed its implications at the time, and no one tested it on a leap-second generator. Then it failed in real life.
The moral is simple and needs repeating: Test every bloody change you make!
One by-product of that was that they temporarily took down air travellers in Australia. The Altea reservation and departure system run by Amadeus, one of the largest computer travel reservation systems on the planet, couldn’t cope and crashed. For 48 minutes, passengers and staff at Qantas and Virgin Australia were thrown back into the 1990s world of manual check-in and delayed flights.
To be used the next time some snide arsehole comes on to you and pretends the Y2K effort was all hype and smoke.
Then punch him in the face to set his clock straight.
Any departures and reservations system that couldn't cope with Y2K would have spent most of the final few months of 1999 increasingly unable to accept "new" bookings. The same goes for most other time-dependent software. If you are tracking time, you usually need to be able to handle the near-future as well as the present or the recent past. Y2K was never likely to result in a midnight shutdown and always likely to be a case of systems showing their inadequacy a (short) while before they became totally unusable.
In addition, the vast majority of genuine Y2K bugs could be easily tested for in advance, once it had occurred to you to do so, just as it is already possible to test systems for leap-second compliance or Y2038 compliance.
Y2K wasn't *all* hype and smoke, but Gartner's 11-digit dollar estimate for them to solve the problem most certainly was, and they weren't alone in brazenly trying to cash in.
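On the "easily tested in advance" point, the Y2038 boundary is a good example because it can be computed directly. The sketch assumes a signed 32-bit seconds counter since 1970 and simulates the wrap in Python; `wrap_int32` is a helper written for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

def wrap_int32(value):
    """Simulate storing a value in a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31

# Last second a signed 32-bit counter can represent.
last_good = EPOCH + timedelta(seconds=INT32_MAX)
# One tick later the counter wraps to a large negative number.
after_wrap = EPOCH + timedelta(seconds=wrap_int32(INT32_MAX + 1))

print(last_good)   # 2038-01-19 03:14:07+00:00
print(after_wrap)  # 1901-12-13 20:45:52+00:00
```

Feeding dates on either side of that boundary into a system today is all the test takes.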
"Any departures and reservations system that couldn't cope with Y2K would have spent most of the final few months of 1999 increasingly unable to accept "new" bookings."
Eh, that's why most of us spent 1998 doing our Y2K work and were complete with a system run up to the leap year in 2000 (which was where we had to put most of the fixes in) by the middle of 1998.
In any case, most airline systems then (and many now) used PARS dates, so booking would not have been affected. It was the rollover and the leap year that gave most pause for thought.
Thanks for the info! I was going to ask, if there's a crash related to the clock jumping, why doesn't my system crash then whenever I power up a system, the clock's a bit off, and ntp resets the clock (or indeed any time one rolls the clock forward or back)? But, the info you give makes it clear -- this in-kernel leap second handler that NTP calls was presumably buggy in 2.2.whatever to 3.3.
I have a MythTV system running 2.6.38-gentoo-r3 so presumably June 30 it'll be reboot time! 8-)
All these minor system concerns around airline booking systems, check-in desks and so on and so forth, and yet no-one has thought to ask whether the SPB have tested LOHAN's systems for leap second bugs. Let's face it - if NASA's boffins decide to batten down the hatches, the question is are SPB up for the challenge? (presumably NASA having a quiet lie down gives LOHAN a reasonable launch window!)
Biting the hand that feeds IT © 1998–2021