UK flights disrupted by 'technical issue' with air traffic computer system

The UK's National Air Traffic Service (NATS) is spending a bank holiday Monday dealing with an unspecified "technical issue" that has disrupted flights across the country. NATS said in a statement today that at 1210 UK time it "applied traffic flow restrictions to maintain safety." Half an hour later an update was posted …

  1. l8gravely

    The /tmp disk space filled up and they have to clean it out.

    1. David 132 Silver badge

      They ran out of rows in Excel.

      More seriously… ransomware, I wonder?

      1. Anonymous Coward
        Anonymous Coward

        Not possible. It's air gapped or at least I bloody hope so.

        1. an.other_tech

          NATS has regional centres, so surely must have data links ?

          It could be akin to saying town 'A's cctv can be viewed remotely by an app

          So the 'Closed' part of Closed Circuit Television becomes a bit of a farce.

        2. Anonymous Coward
          Anonymous Coward

          @a/c

          "Not possible. It's air gapped or at least I bloody hope so."

          You haven't a clue have you? Air gapping, whilst useful, does NOT make a system secure. At best, it helps. End of.

          1. matjaggard

            I'm not sure why you got downvoted for saying an air gapped system is not fully secure. It seems pretty obvious to me. To be air gapped means it's likely to be patched less often because it's a PITA to do so, so security vulnerabilities will be there for longer. Also you have to allow storage to be plugged into it somehow to allow updates to the main software as well as patches.

            I worked at a place that used CDs for this but it was a real waste of plastic so they started to allow USBs. We had specific machines only running virus checks on those CDs and USBs with virus definitions updated daily. Still we got a virus on the secure system somehow*

            *it was definitely not a senior manager plugging his phone into a USB to charge it. <\sarcasm>

            1. Anonymous Coward
              Anonymous Coward

              Probably due to the belittling remark “you haven’t a clue, have you?”

            2. Anonymous Coward
              Anonymous Coward

              Why is anyone patching air gapped systems? They work and do the job they need to? At some point you have no choice but it's very rare to have to do that.

          2. Anonymous Coward
            Anonymous Coward

            That all depends on your definition of air gapping. My definition of air gapping means completely air gapped with no transfer of any data whatsoever (without systems in place to check that data before it gets anywhere near said system and that the data is purely just data) and not just on a network connection. If you are going to be plugging USB drives in and out from one to the other you might as well just not bother, but you knew that, right, because you have a clue and I don't?

            1. Yes Me Silver badge

              Yes. A DMZ or firewall is by definition not an airgap. Of course, it needs to be a WiFigap too. Air is actually not enough; if you're serious there's a big Faraday cage.

          3. jacquesfrancis

            Too rude

            There’s absolutely no need to be so offensive when making your point. Control yourself.

      2. Anonymous Coward
        Anonymous Coward

        We may never learn

        Entire country of Turkey lost power (blackout) for hours simultaneously. There must be some ransomware involved or a cyber attack by a state actor. It was never declared why it crashed nor how it got fixed.

    2. xyz Silver badge

      I smell a CSV file...

      With too many inverted commas or too many commas.

      1. Flywheel
        Headmaster

        Re: I smell a CSV file...

        I remember the Good Old Days when user input was sanitised and software folk could actually count the number of delimiters in a record!

        1. Yes Me Silver badge
          FAIL

          Re: I smell a CSV file...

          Yes. The Grauniad says "The failure of the air traffic control system, run by private company Nats, has been blamed on a single corrupted flight plan entered by an unnamed airline, according to reports."

          Whether it was Excel, CSV or a 1960s-era punched card image I cannot guess, but rejecting gobbledygook in the input has been a principle of programming since, oh, 1948.
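The delimiter-counting point made in this sub-thread can be sketched in a few lines of Python. This is only an illustration: the five-field layout and the sample records are invented, and nothing here reflects the format NATS actually receives. It shows one malformed record being set aside while the rest are accepted.

```python
import csv
import io

EXPECTED_FIELDS = 5  # hypothetical record layout, chosen for illustration


def split_records(text, expected=EXPECTED_FIELDS):
    """Parse CSV text and separate well-formed rows from malformed ones."""
    good, bad = [], []
    # csv.reader understands quoting, so a comma inside a quoted field
    # is not counted as a delimiter.
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) == expected:
            good.append(row)
        else:
            bad.append((line_no, row))
    return good, bad


# Invented sample data: the second record has too many fields.
sample = (
    'BA123,EGLL,KJFK,0930,"Smith, J"\n'
    'XX999,EGKK,,LFPG,1015,extra,field\n'
    'LM456,EGPF,EGLL,1100,Jones\n'
)
good, bad = split_records(sample)
print(len(good), "accepted;", len(bad), "rejected")
```

Using the `csv` module rather than a plain `split(",")` is what keeps the quoted "Smith, J" from being miscounted as two fields.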

  2. tip pc Silver badge

    more nationwide outages incoming

    there will be more nationwide outages incoming over the coming months.

    Both malice and incompetence will be to blame.

    Really in trouble when malice tries to out compete incompetence.

    1. cyberdemon Silver badge
      Mushroom

      Re: more nationwide outages incoming

      I've been thinking this for a while, sadly. And in far more critical areas of infrastructure than the ability of middle-class families to fly back from the Maldives on the last day of the summer holidays..

      When utility companies take on debts to pay dividends to overseas shareholders instead of upgrading critical infrastructure, is that incompetence, or malice?

      The electric grid in particular needs not just to be maintained, it needs to triple its capacity if we are to phase out gas and petroleum, but it can barely sustain current demands.

      But while it's relatively easy to build new 400kV pylons, it's much harder to upgrade the thousands of "low voltage" networks that are buried under our towns and cities.

      And while it's easy to build more wind turbines, it's fundamentally impossible to have enough storage for more than a few hours of low wind. Batteries are becoming more and more expensive as the world demand goes up, meanwhile we are draining rainforest aquifers to mine lithium.. Renewables just can't keep the lights on without dispatchable gas turbines. And many people don't realise that a CCGT takes a while for the combined steam cycle to warm up. So as they ramp up and down with the wind, a lot of energy is wasted. They are initially as inefficient as OCGTs.

      We talk about carbon capture yet we are chopping down trees and burning them for power, producing more CO2 and particulate matter per MWh than the dirtiest coal plants ever did! It's madness

      So far no evil person has attacked the UK grid, but with it teetering on the edge, it needs to be at the front of ministers' minds how to fix it when it does all go dark. If I were in charge, I'd organise a test of that scenario. The privatised utilities can pay for the losses caused if it turns out that they have been lying to us about the resilience of their networks. If they can't or won't pay, they get taken public. But all ministers seem to care about these days is lining their own pockets, before they make for the lifeboats as the ship sinks.

      Something has to give. If it doesn't, the only outcome is WWIII.

      /offtopicrant

      1. Topedge@hotmail.com

        Re: more nationwide outages incoming

        In fairness, most normal people understand this, it's the Woke fascist left who will cancel you for suggesting such things, even though they call for tolerance!

        Remember for every MegaWatt generated by renewables, we need a backup generating system ready on standby for when the weather doesn't play ball and can be called up on demand and all those mobile power stations on four wheels will be replaced and will need to draw their power from the grid.

        Download the National Grid planned shutdown rota and check the letter on the top right of your electricity bill; that gives you the region you're in.

      2. Anonymous Coward
        Anonymous Coward

        Re: more nationwide outages incoming

        I thought Hydrogen was a good candidate for surplus wind power storage.

        1. cyberdemon Silver badge
          Flame

          Re: more nationwide outages incoming

          In theory yes, but sadly in practice, not so much. :(

          It's a tiny molecule that leaks easily, it's very light so has low storage density unless you liquefy it, and to liquefy it is much, much harder than for LNG.

          Hydrogen is also pretty inefficient to produce, and very expensive - it usually requires Platinum electrodes, which means lots of Platinum per MW - (the efficiency of electrolysis is not much better than 50% and it has an electrochemical potential below 1.5 volts, so you'd need over a thousand cells at a thousand amps each for just 1MW - that's a lot of Platinum, even if the surface thickness is very small - and a lot of copper. Multiply by 1000 again for 1GW. The UK has around 15GW of excess Wind capacity that it plans to store... It's just not feasible at utility scale, sadly)

          I have heard of a few researchers using "AI" to design special non-platinum electrocatalyst materials, but as far as I am aware, none are very promising.

          What was needed was Nuclear power. I say "was", because we no longer have the time, expertise, or the political appetite to build enough new nuclear.

          And don't get me started on Fusion. :( It's another pipe-dream that is great in theory but when you think of the practicalities, it's not very feasible to transfer large amounts of power across a vacuum in the form of fast neutrons which destroy and make radioactive almost all materials known to man.

          1. Anonymous Coward
            Anonymous Coward

            Re: more nationwide outages incoming

            The best renewable energy source is Hydro. Unfortunately it can be, and often is, more expensive than Nuclear (Capex), but not Opex. What we could do is turn the Highlands of Scotland into a giant Hydro plant, solving two problems with one scone!

            1. Ken G Silver badge

              Re: more nationwide outages incoming

              I'm afraid to ask but what is the second problem? (assuming power generation capacity is the first)

              1. Ken Moorhouse Silver badge

                Re: I'm afraid to ask but what is the second problem?

                AC is no doubt referring to the question as to whether you put the jam or the cream on first.

                1. Anonymous Coward
                  Anonymous Coward

                  Re: I'm afraid to ask but what is the second problem?

                  I was referring to the stone of scone.

              2. Anonymous Coward
                Anonymous Coward

                Re: more nationwide outages incoming

                Scots of course...

                Once they have been displaced they can live in England.

      3. boblongii

        Re: more nationwide outages incoming

        There's a lot of promise in large-scale batteries that use iron. Useless for cars etc., but potentially fine for bulk uses like storing wind farm generation.

        Plus, people could just use less. Crazy idea I know - capitalism is all about growth. Maybe that's the source of the problem right there.

  3. Arthur the cat Silver badge
    Devil

    It's the critical need detector firing

    You didn't think CNDs only existed in printers, did you? Anything technical, if there's a critical need, then it doesn't work.

    [Thinks of my pacemaker; gulps.]

  4. Anonymous Coward
    Anonymous Coward

    Is this file necessary?

    Not a good day to have the work experience kid in.

    1. JimboSmith

      Re: Is this file necessary?

      Stephen Stucker RIP.

  5. NXM Silver badge

    Powerless

    A builder unplugged the server (marked DO NOT UNPLUG) to charge his drill battery

  6. fronty

    It's DNS

    It's probably a DNS issue, it's always DNS.

    1. cyberdemon Silver badge
      Happy

      Re: It's DNS

      Would that be

      Transport Information Technology Shitshow Upsets Planes

      1. Martin-73 Silver badge

        Re: It's DNS

        have a thumbs up, I miss headlines like that, the current owners have no sense of humour

        1. Anonymous Coward
          Anonymous Coward

          Re: It's DNS

          @Martin-73

          Of course they don't. They are Yanks

        2. Anonymous Coward
          Anonymous Coward

          Re: It's DNS

          So that's the problem!

          I thought el reg was getting a little staid recently. Couldn't work out what was wrong.

    2. Melanie Winiger

      Re: It's DNS

      Ha! I shouldn't laugh. We had a DNS problem 2 weeks ago. My ears are still ringing...

    3. l8gravely

      Re: It's DNS

      As some one who was working on $WORK's external DNS yesterday and had my home internet go down just after the changes.... my heart started beating just a wee bit faster until I figured out I didn't take down all of our global company. Whew.... :-)

    4. TRT

      Re: It's DNS

      I think it's too much of a coincidence that it was also the first day of operation of the new funkily named air carrier ;( Drop Table Flights;)

  7. Anonymous Coward
    Anonymous Coward

    always a network issue

    Scottish airline Loganair indicating the issue involved "a network-wide failure of UK air traffic control computer systems."

    it's always a network, except when it's not.

    Sounds like an application issue to me, but i happen to know they are replacing their core (check job listings over the last 12 months).

    major outages just shouldn't happen in 2023, i'd mandate all on prem systems managed from the UK for critical systems like this.

    On prem doesn't prevent peering with multiple ISP's across various parts of the nation, or indeed nationwide private WAN's serving all interested parties/customers just like we used to do it.

    1. Anonymous Coward
      Anonymous Coward

      Re: always a network issue

      major outages just shouldn't happen in 2023

      He/she who does not plan for outages turns that possibility into a certainty..

    2. regnik

      Re: always a network issue

      I am not sure that "Network-wide" means a network issue although I have seen this indicated in a low rent national newspaper.

    3. Anonymous Coward
      Anonymous Coward

      Re: always a network issue

      Maybe they should just employ competent people. You know, techies who can actually tech?

      1. Tom 7

        Re: always a network issue

        Many places do. And then employ MBAs to oversee them.

        1. elsergiovolador Silver badge

          Re: always a network issue

          They won't find competent people at those poor wages https://www.nats.aero/careers/vacancies/

      2. fandom

        Re: always a network issue

        Like Reg readers, you know, the ones that submit an endless stream of 'Who me?' stories.

      3. Bob H

        Re: always a network issue

        It's all well and good to call your industry colleagues incompetent, until you f-ck up and everyone points at you.

        Someone made a mistake, likely they made that mistake years ago and didn't anticipate something in the validation checking that years later is now possible. Calling people 'incompetent' doesn't help.

  8. Anonymous Coward
    Anonymous Coward

    It's an impressive sight

    Now they've reloaded the tape into the ZX80 hidden inside the expensive cabinets you can see the planes leaving Heathrow at an impressive clip.

    Being the curious sort I did some timing and one leaves on average every 90 seconds, with the shortest interval only 45 seconds. What impressed me other than the speed with which they're decanting their parking lot is the sheer volume. I lost count, but I think there must have been some 30 planes leaving while I was watching, with many more taxiing to join that queue.

    Zoom out from the link to get an idea just how much traffic whizzes around in the air, it's impressive. The inbound stack is still empty as most of the inbound traffic is probably still underway.

    1. Anonymous Coward
      Anonymous Coward

      Re: It's an impressive sight

      It's like a party. It's easy to get rid of people but a problem when everyone arrives at once.

    2. NXM Silver badge

      Re: It's an impressive sight

      A while back I did some back-of-an-envelope calculations using 90s per landing plus guesstimated airspeed to work out the distance between planes, then looked at the Thames estuary on Google maps. You can see them coming in at just the right intervals.
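The back-of-an-envelope sum described above is just speed multiplied by time. A minimal sketch follows; the approach speeds of 140 and 160 knots are assumptions picked for illustration, not figures from the thread, and real values vary with aircraft type and wind.

```python
SECONDS_PER_HOUR = 3600


def spacing_nm(ground_speed_knots, interval_seconds):
    """Distance between successive aircraft, in nautical miles.

    One knot is one nautical mile per hour, so the spacing is
    speed (nm/h) times the interval converted to hours.
    """
    return ground_speed_knots * interval_seconds / SECONDS_PER_HOUR


# Assumed approach speeds; the 90-second interval is the one quoted above.
for speed in (140, 160):
    print(speed, "kt ->", spacing_nm(speed, 90), "nm between aircraft")
```

Under those assumptions the gap comes out at a few nautical miles, which is large enough to pick out individual aircraft on a map of the approach.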

      1. Anonymous Coward
        Anonymous Coward

        Re: It's an impressive sight

        When you're in Hounslow you can see them lined up in the air. The endpoint of the stack seems to have enough space to line up 4 planes with the mandated safety gaps.

        Then again, you'd have to go to Hounslow. FlightRadar is easier :).

        1. Yet Another Hierachial Anonynmous Coward
          Pint

          Re: It's an impressive sight

          Same if you drive up the M3 or M4 in the morning - if they are landing in the East-West direction, you can often see 4 or 5 stacked up behind each other with their landing lights on, all coming down one after the other. As soon as the bottom one goes out of view, another appears at the top.

          A beer to those who manage and handle these things on a daily basis, and a beer to whoever was "on call" on a holiday monday. Looks like you had a bad day.

          1. Anonymous Coward
            Anonymous Coward

            Re: It's an impressive sight

            Oh, it ain't over yet..

            Especially Gatwick is presently still in a real mess, but the chaos was to be expected. Planes are in the wrong place, flight and staffing schedules have to be adjusted, the whole logistics chain behind these flights have to adapt - it's not a trivial exercise to get everything back onto the normal schedule so that striking French air controllers have something to blackmail again..

            (yes, I'm a cynic).

          2. BenDwire Silver badge

            Re: It's an impressive sight

            Over 40 years ago I lived in Southall, directly in the flightpath. The BT tower (or Post Office Tower as it was then) was on the horizon, and on a good (i.e bad) day we could see at least 5 planes on the approach. Thankfully we only had planes flying over us when it was wet and windy weather.

        2. Anonymous Coward
          Anonymous Coward

          Re: It's an impressive sight

          >When you're in Hounslow you can see them lined up in the air. The endpoint of the stack seems to have enough space to line up 4 planes with the mandated safety gaps.

          >Then again, you'd have to go to Hounslow. FlightRadar is easier :).

          It's the same for Manchester - with the queue starting over the Woodhead Pass and passing over Stockport on the way into landing - only far more scenic than Hounslow.

      2. Anonymous Coward
        Anonymous Coward

        Re: It's an impressive sight

        In FlightRadar you can enable plane tags - they will give you a relatively live altitude and speed.

      3. low_resolution_foxxes

        Re: It's an impressive sight

        Yeah Heathrow has an impressive 90-120 ish second turnaround time.

        One of the biggest concerns with the two runways at Heathrow is that they are running at maximum capacity during peak hours. Something trivial like fog results in an increased time per landing due to visibility concerns.

        When there is heavy fog dozens of flights get cancelled

        Very strange walking around Hounslow on the flight path. The planes feel like they are landing on your head especially when the shadow zooms across your feet

        1. Nick Ryan Silver badge

          Re: It's an impressive sight

          Incoming planes tend to be prioritised due to trivial matters of remaining fuel and gravity coming to call when this runs out.

          90-120s between planes is the maximum throughput possible but it does depend on the size of the plane. Small planes do not follow big ones and there is a much longer gap between a larger plane followed by a smaller one. This means that planes tend to be scheduled as, e.g. small planes then medium planes then large planes as this allows the shortest gap between planes. At some point a smaller plane has to go through and the cycle goes back to the beginning again after a longer gap. This just adds extra fun to the scheduling process...

          Unfortunately, running at the maximum throughput already means that there is no spare capacity to catch up, therefore cancellations happen. To complicate matters there are rules to follow regarding the number of planes permitted at certain times of day, therefore it's not possible to continue running outside of these times to deal with a backlog.

    3. mantavani

      Re: It's an impressive sight

      The landing queue was quite a sight this morning, lights as far out as you could see as soon as the noise restrictions relaxed, even for Heathrow it seemed busy. Fortunately we arrived from the US bang on time courtesy of Delta, the equivalent BA flight was four hours late departing, I assume because of the knock-on from inbound flights being delayed - not that BA can moan too loudly, given their own recent IT track record!

      1. TRT

        Re: It's an impressive sight

        Holding over the outer marker?

        1. Anonymous Coward
          Anonymous Coward

          Re: It's an impressive sight

          Dulles Approach, this is Windsor 1-1-4. Where the devil have you been?

  9. Anonymous Coward
    Anonymous Coward

    Come on, own up, who kicked off the windows update?

    1. Anonymous Coward
      Anonymous Coward

      Either someone started Teams, or the thing was waiting for a keypress to a modal window which was hiding behind something else..

      Joking aside, I'm interested in what they identified as the culprit.

      1. Andy Non Silver badge

        "waiting for a keypress to a modal window which was hiding behind something else"

        I got caught out with one of those once. Didn't have a clue why a system had frozen, it seemingly had crashed. I was on the verge of doing something drastic like killing the process before I realised there was a hidden window waiting to be clicked on. Such windows really need to be displayed "topmost".

        1. Anonymous Coward
          Anonymous Coward

          Such windows really need to be displayed "topmost"

          Yes, but it's Windows. Their UI design seems to be mostly based on how different they can make it from the last version so the maximum amount of staff time is lost when upgrading. Hiding a modal window (i.e. a modal window that isn't) fits right into that strategy.

        2. H in The Hague

          "... before I realised there was a hidden window waiting to be clicked on."

          I haven't had too many problems like that, but recently started using some new software which is affected by this issue. Sometimes launching Task Manager, right-clicking the application and selecting Bring to Front helps. But that depends on the phase of the moon.

      2. ITMA Silver badge
        Devil

        According to the BBC's reporting of the NATS report:

        "... at 08:32 on 28 August, its system received details of a flight which was due to cross UK airspace later that day.

        The system detected that two markers along the planned route had the same name - even though they were in different places. As a result, it could not understand the UK portion of the flight plan.

        This triggered the system to automatically stop working for safety reasons, so that no incorrect information was passed to Nats' air traffic controllers. The backup system then did the same thing."

        WTF! Why couldn't it just have rejected the flight plan instead of shutting the entire system down?

        And what a surprise - the "backup" system did exactly the same thing.
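The "reject the one plan, keep running" behaviour asked for above could look roughly like this. It is a sketch under stated assumptions: the `Waypoint` class, the plan IDs, the waypoint names and the coordinates are all hypothetical, and it says nothing about how the real NATS system is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    name: str
    lat: float
    lon: float


class AmbiguousWaypoint(ValueError):
    """Raised when one name refers to two different positions in a route."""


def check_route(route):
    """Raise if the same waypoint name appears at two different positions."""
    seen = {}
    for wp in route:
        pos = (wp.lat, wp.lon)
        # setdefault stores the first position seen for a name and
        # returns it on later lookups, so a mismatch means ambiguity.
        if seen.setdefault(wp.name, pos) != pos:
            raise AmbiguousWaypoint(wp.name)


def process(plans):
    """Accept valid plans, quarantine bad ones, and keep running."""
    accepted, quarantined = [], []
    for plan_id, route in plans:
        try:
            check_route(route)
        except AmbiguousWaypoint as err:
            quarantined.append((plan_id, str(err)))
        else:
            accepted.append(plan_id)
    return accepted, quarantined


# Invented plans: PLAN-2 uses the name DELTA for two different places.
plans = [
    ("PLAN-1", [Waypoint("ALPHA", 50.0, -1.0), Waypoint("BRAVO", 52.0, 0.5)]),
    ("PLAN-2", [Waypoint("DELTA", 48.0, 2.0), Waypoint("DELTA", 44.0, -100.0)]),
    ("PLAN-3", [Waypoint("ECHO", 53.0, -2.0)]),
]
accepted, quarantined = process(plans)
print("accepted:", accepted)
print("quarantined:", quarantined)
```

Whether quarantining a plan is acceptable in a safety-critical system is a separate question; the sketch only shows that the failure can be scoped to a single record, and that a backup running identical code would hit the same input in the same way.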

  10. Howard Sway Silver badge

    At 1515 the organization said that it had "identified and remedied" the technical issue

    So, it only took them 4 hours to get through to their sysadmin on his mobile, who was the only person that knew the correct command line switches to restart the database, as he was having a day off at the Notting Hill Carnival and couldn't hear his phone go off due to standing near a massive sound system and being well mashed....

    1. steven_t

      Re: At 1515 the organization said that it had "identified and remedied" the technical issue

      The sysadmin was probably going on holiday, sitting on the plane, waiting for it to take off.

    2. ITMA Silver badge
      Devil

      Re: At 1515 the organization said that it had "identified and remedied" the technical issue

      Their "fail over" fell over and failed.

      1. ITMA Silver badge
        Devil

        Re: At 1515 the organization said that it had "identified and remedied" the technical issue

        According to the BBC's reporting of the NATS report on the incident - it did exactly that:

        "This triggered the system to automatically stop working for safety reasons, so that no incorrect information was passed to Nats' air traffic controllers. The backup system then did the same thing."

  11. Ball boy Silver badge
    Joke

    A 50P coin for the meter, toggle the Big Red Switch and the job's a good 'un.

    Do I have to think of everything 'round here?

  12. Fruit and Nutcase Silver badge
    Joke

    Home Secretary

    "...appeared to be "significantly limiting departures," but that arrivals were continuing."

    That's the same problem the Home Secretary has got with processing asylum seeker applications

    1. This post has been deleted by its author

    2. Anonymous Coward
      Anonymous Coward

      Re: Home Secretary

      Just like her to try and prevent immigration by closing down NATS. Of course as any Reg reader could tell her, NAT is not a security solution.

      1. Anonymous Coward
        Anonymous Coward

        Re: Home Secretary

        Well everyone knows that hash-tags are a security solution. The home secretary told us that, about 5 years ago.

        1. KittenHuffer Silver badge

          Re: Home Secretary

          And there I was thinking that blockchain was the solution ....... to everything!

          1. Nick Ryan Silver badge

            Re: Home Secretary

            That's "The Blockchain" to you, sir.

            Yes, I have suffered through a snake-oil presentation of someone trying to sell the virtues of "The Blockchain" including all manner of automatic triggers on activity and other stupid lies. He wasn't happy when I told him in front of everyone else that he evidently had no idea what the hell he was talking about. That made the pitch much more interesting... :)

  13. John Brown (no body) Silver badge
    Happy

    When can we expect the...

    When can we expect the On Call telling us how the day was saved? And potentially the Who, Me? 'fessing up to causing it in the first place? Or a combined episode, it possibly being the same person.

  14. Tron Silver badge

    Third world.

    Nothing works properly here now. The trains, the NHS or air traffic. Everyone in a position of power or authority in this country is incompetent, corrupt or both.

    1. Anonymous Coward
      Anonymous Coward

      Re: Third world.

      Because everything is done on shoestring budgets, managed by idiots and outsourced to the lowest bidder...... What do we expect? Value to the taxpayer?

      1. James Anderson

        Re: Third world.

        The problem is the country still believes in Milton Friedman's failed economic theory. As implemented by Thatcher and her successors it has made a bunch of public school boys rich and impoverished the rest of us.

        Please can someone pass them another book; J. K. Galbraith and J.M. Keynes had theories that delivered.

        1. Anonymous Coward
          Anonymous Coward

          Re: Third world.

          Whilst you may have a point I don't think adherence to a particular Economic theory is the problem. It's years of Conservative and Labour governments' underinvestment and constant looking for short cuts and doing things on the cheap. At the same time we always hear "world class" and "world leading" when in fact it's not even fit for "bottom of the list" 1st world country.

          Perhaps it's our system of government itself which has just reached its logical conclusion and we need to change that to change the outcome....

          1. omz13

            Re: Third world.

            world leading f-up and world class f-up; ftfy

          2. Richard 12 Silver badge

            Re: Third world.

            First past the post is really the underlying cause.

            Worst method of choosing a government, as it creates a two-party system and then further polarises everything into "us vs them", as everyone has to vote against the one they fear the most, because they cannot vote for the one they actually think is best.

            Fear then leads to hatred...

            Starting to wonder if a damp woman distributing swords would make for a better basis of government.

            1. Topedge@hotmail.com

              Re: Third world.

              I'd vote for her!

              1. Androgynous Cupboard Silver badge

                Re: Third world.

                I'd vote against her!

            2. ITMA Silver badge
              Devil

              Re: Third world.

              "Fear then leads to hatred..."

              And then you turn to the Dark Side.

              Even though Darth (Tony) Blair has gone.... LOL

          3. James Anderson

            Re: Third world.

            The problem really is the adherence to an economic theory that believes low taxes and an unregulated economy will free “entrepreneurs” but really it frees shysters to take the money and run.

  15. Anonymous Coward
    Anonymous Coward

    The network is token ring.

    Or at least it was. Designed in the 80's/early 90's.

    Nothing changes quickly at NATS, so it's entirely possible it still is.

    1. Anonymous Coward
      Anonymous Coward

      Re: The network is token ring.

      Correct. I worked for the company that wrote the original ATC for NATS in the 80's.

      Worse, it was OS/2 on Token Ring, but lots of the modules have since been re-written.

      1. F. Frederick Skitty Silver badge

        Re: The network is token ring.

        It was written from the ground up when Swanwick ATC was being developed in the late 1990s. I know, since I was interviewed for a place on the development team at the time.

        Somewhat worryingly, a number of supposedly non-critical parts were written in Perl. I don't care how non-critical they were, but I don't consider a dynamically typed language to be a good choice for anything related to air traffic control...

        1. simpfeld

          Re: The network is token ring.

          I seem to remember it was originally an HP-UX system when built. I wonder what it is now?

          1. Anonymous Coward
            Anonymous Coward

            Re: The network is token ring.

            It's still too reliable to assume anything made by Microsoft has made it in, so I am curious too.

        2. smudge

          Re: The network is token ring.

          It's the Flight Planning System which failed, not the network.

          If the network had completely failed, then they wouldn't have had radar or voice comms either. Now that really would have been serious!

          1. Anonymous Coward
            Anonymous Coward

            Re: The network is token ring.

            Now that really would have been serious!

            It is serious...and thank-you for not calling me Shirley

            1. Topedge@hotmail.com

              Re: The network is token ring.

              Ah...

              a Beaconing Event,

              I remember the days fondly, but bl00dy annoying at the time!

      2. ITMA Silver badge
        Devil

        Re: The network is token ring.

        "Worse, it was OS/2 on Token Ring"

        Holy shit! Not Oh Shit 2 !?!?!

  16. Will Godfrey Silver badge
    Happy

    So NATS went NUTS

    Actually I'm rather surprised nobody got there before me.

    Come on commentards keep up!

    1. Fruit and Nutcase Silver badge

      Re: So NATS went NUTS

      That downvote is not from me.

  17. Anon Coward (there are nutters out there - I've worked with them)

    Poor Reporting

    Come on the register!! You are the number 1 IT news website. Where are the hourly updates? Do you have the change number? The resulting incident number! Why don't you have a reporter on the MIM bridge. Very poor from the register.

    My money is on an expired certificate or BGP or DNS.

  18. Boris the Cockroach Silver badge
    FAIL

    My money

    is on AI

    They introduced AI to the process, it lasted 47.23 seconds before it went nuts and terminated itself

    https://www.youtube.com/watch?v=qZq7fW6ftlU

    1. Anonymous Coward
      Anonymous Coward

      Re: My money

      Nooooo ....

      It's the EU punishing us on OUR BANK HOLIDAY MONDAY because they are JEALOUS

      Yours Daily Mail / Daily Express / Daily Telegraph *

      *delete as appropriate and lie down

  19. Anonymous Coward
    Anonymous Coward

    Skynet went live and said "nope".

    We seem to be far too reliant on integrated tech. So when the lights go out, who ya gonna call ?

    Oh, no one, because the damn phones are not working because it's all voip.

  20. mrfill
    Coat

    Solution?

    Have they tried turning it off and on again?

  21. t245t Silver badge
    Terminator

    Worn out computer code /s

    Dec 2014: “The problem, involving computer code written a quarter of a century ago, was responsible for widespread disruption at British airports.”

    The Systems Failure

    “ES7. The systems at the NATS Swanwick operations centre entered service in 2002 but were in development during the previous decade. Failure occurred on 12 December 2014 because of a latent software fault that was present from the 1990s; this is referred to as the proximate cause of the failure. The fault lay in the software’s performance of a check on the maximum permitted number of Controller and Supervisor roles (known as Atomic Functions).”

    “These Atomic Function identifiers are used to index (access) the table of data held in the System Flight Server (SFS) and distribute the correct and relevant data to individual roles. The check should have been whether or not the limit of 193 (the total of civil and military roles) had been reached; instead the check was performed against a civil limit of 151.”

    “SFS was designed and programmed to shut down the primary SFS in response to this “exception” ... the same exception was raised in the software in the secondary SFS, and that too shut down.”
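
    The wrong-limit check described in the quoted report can be illustrated with a minimal Python sketch. Everything here apart from the two limits (151 and 193) is an assumption made for illustration: the function names, the 1-based numbering of Atomic Function identifiers and the use of an exception are hypothetical and are not taken from the SFS code.

```python
# Hypothetical sketch only; names and structure are assumptions, not NATS code.
CIVIL_LIMIT = 151   # civil roles only (the limit the faulty check used)
TOTAL_LIMIT = 193   # civil + military roles (the limit the report says was intended)


class AtomicFunctionLimitExceeded(Exception):
    """Raised when an Atomic Function identifier is treated as out of range."""


def check_faulty(atomic_function_id):
    # Compares against the civil-only limit, so ids 152..193 are wrongly rejected.
    if atomic_function_id > CIVIL_LIMIT:
        raise AtomicFunctionLimitExceeded(atomic_function_id)


def check_intended(atomic_function_id):
    # Compares against the combined civil and military limit.
    if atomic_function_id > TOTAL_LIMIT:
        raise AtomicFunctionLimitExceeded(atomic_function_id)


def is_accepted(check, atomic_function_id):
    """Return True if the given check lets the identifier through."""
    try:
        check(atomic_function_id)
        return True
    except AtomicFunctionLimitExceeded:
        return False


def surviving_servers(check, atomic_function_id, servers=("primary", "secondary")):
    # Both servers run identical code on the same input, so they fail together.
    return [name for name in servers if is_accepted(check, atomic_function_id)]
```

    Under these assumptions an identifier such as 160 is within the total limit but trips the civil-only check, and because the primary and secondary run the same check, neither survives.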

    1. Fruit and Nutcase Silver badge

      Margaret Hamilton

      Perhaps they should have put Margaret Hamilton in charge of development

      https://en.m.wikipedia.org/wiki/Margaret_Hamilton_(software_engineer)

  22. MrBanana

    Blame the French

    Apparently the cause is being put down to a bad flight plan filed by a French airline. Seriously, what?

    1. Topedge@hotmail.com

      Blame the Frogs

      Bet is was a rouge acute accent or grave accent, once we left the EU, we won't take foreign characters anymore, next upgrade for NATS is to go back to miles and feet!

      1. David Hicklin Bronze badge

        Re: Blame the Frogs

        > next upgrade for NATS is to go back to miles and feet!

        The "Flight levels" have always been in feet

      2. John H Woods

        Re: Rouge

        It was embarrassed

      3. Fred Dibnah

        Re: Blame the Frogs

        You wrote ‘rouge’ instead of rogue, so we can safely assume you are a French infiltrator to this glorious island of Great Brexit.

        1. Topedge@hotmail.com

          Re: Blame the Frogs

          Sacré bleu, j'ai été découvert

    2. cookieMonster Silver badge
      Joke

      Re: Blame the French

      Mon cul !!!

    3. Norman Nescio

      Re: Blame the French

      Apparently the cause is being put down to a bad flight plan filed by a French airline. Seriously, what?

      If true, that implies input validation of flight plans was either non-existent or insufficient, both of which are possibilities. Little 'Bobby' Tables might have struck again.

      This incident should add another test-case; and probably increase the scope of fuzz-testing of the inputs; with the possibility of the scope increase being from a low base, conceivably zero (hopefully unlikely).
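
      A rough idea of what such fuzz-testing could look like, as a Python sketch. The route format (space-separated tokens of 2 to 5 upper-case ASCII letters or digits), the function names and the limits are all hypothetical and chosen only for illustration; real flight plan syntax is considerably richer.

```python
import random
import string

ALLOWED = set(string.ascii_uppercase + string.digits)


class InvalidFlightPlan(ValueError):
    """The only exception the parser is allowed to raise on bad input."""


def parse_route(text):
    """Hypothetical validator: 1-50 tokens, each 2-5 upper-case ASCII letters/digits."""
    if not isinstance(text, str) or len(text) > 1000:
        raise InvalidFlightPlan("route missing or too long")
    tokens = text.split(" ")
    if not 1 <= len(tokens) <= 50:
        raise InvalidFlightPlan("bad token count")
    for token in tokens:
        if not 2 <= len(token) <= 5 or not all(c in ALLOWED for c in token):
            raise InvalidFlightPlan("bad waypoint: %r" % token)
    return tokens


def fuzz(iterations=2000, seed=0):
    """Feed random strings to parse_route and count the rejections.

    Any exception other than InvalidFlightPlan propagates, which is the
    signal that the parser has an unhandled case.
    """
    rng = random.Random(seed)
    alphabet = string.printable + "éèç\x00\x1b"
    rejected = 0
    for _ in range(iterations):
        length = rng.randint(0, 40)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            parse_route(candidate)
        except InvalidFlightPlan:
            rejected += 1
    return rejected
```

      The point of the exercise is the contract rather than the count: every input either parses or is rejected with the one expected exception, and anything else is a new test case.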

      NN

      1. JT_3K

        Re: Blame the French

        It's the very first thing I thought when I saw that was the cause. Although they should have rebuilt the system long ago. Another symptom of a never ending push from weak management and non-technical decision makers to concentrate on cost-cutting and glossy additional features rather than streamlining codebase and paying down technical debt.

  23. Norman Nescio

    Bet is was a rouge acute accent or grave accent, once we left the EU, we won't take foreign characters anymore, next upgrade for NATS is to go back to miles and feet!

    Hmm. Yes, if the input used ANSI escape sequences to generate coloured characters, that might have left the programmers rosy-cheeked with embarrassment at mishandling the input. Of course, it may have been forced upon them by lack of space for any more input sanitisation.

    Some people really liked Tiswas as a Saturday morning kids TV programme. It comes out subconsciously in their writing. You can always tell.

    Oddly enough, most of the world (exceptions are North Korea, Mongolia, and the People's Republic of China) does altitudes (and Flight levels) in feet. Russia is in the process of swapping over from metres to feet.

    OpsGroup: 2011-11-17: Special Report: Russia transition to ICAO RVSM

    OpsGroup: 2017-02-22: Big change: Russia finally moving to QNH

    But all is not sweetness and light regarding standardisation of units in aviation:

    AeroSavvy: 2014-09-05: Aviation’s Crazy, Mixed Up Units of Measure

    NN

  24. ForthIsNotDead
    FAIL

    Update

    Starting to hear stories/rumours that the issue was caused by a faulty flight plan filed in France.

    I wonder if the system architects have hard of input validation?

    1. Fursty Ferret

      Re: Update

      Li'l Bobby Tables all grown up and running an airline...

    2. John Smith 19 Gold badge
      Joke

      " hard of input validation?"

      Like what you did there.

  25. Norman Nescio

    PPRuNe thread here

    PPRuNe: U.K. NATS Systems Failure

    Please don't contribute to the thread unless you are a professional pilot with meaningful information to add. PPRuNe is strongly moderated after publication, and fools are not suffered gladly. The 'Rumour' thread has a little more latitude than some of the other forums ('Jet Blast' even more so). I'm not a professional pilot, and I have witnessed many 'contributions' by wanna-bes, Dunning-Krugerites, and narcissists removed. I just find it interesting and informative to read.

    1. Fruit and Nutcase Silver badge

      Re: PPRuNe thread here

      There's an interesting observation in the PPrune thread:

      "...since in Europe all flight plans are centralized and go through IFPS, checked and redistributed to the ATC centers by Eurocontrol. So if there was an initial filing error by the airline it should have been spotted at IFPS level. But assuming it was not, then every other ATC center receiving the FPL should have been affected, and if not why only the UK system."

      https://www.pprune.org/rumours-news/654461-u-k-nats-systems-failure.html#post11493309

      1. Norman Nescio

        Re: PPRuNe thread here

        The contributions by CBSITCB in that thread are worth reading. This is a link that should work to the most interesting one: PPRuNe: U.K. NATS Systems Failure, post at 29th Aug 2023, 11:13 by CBSITCB

        - Each National ATC centre runs their own system: they are not identical. So you can expect the flight-plan-to-actual-route conversion programs to differ.

        - The current UK airspace model is based upon the US system, but has diverged.

        - The functional specification of the US system for converting the filed Flight Plan into a route expressed in the national airspace model is here: [pdf] FAA: NATIONAL AIRSPACE SYSTEM: En Route: CONFIGURATION MANAGEMENT DOCUMENT: COMPUTER PROGRAM FUNCTIONAL SPECIFICATIONS: ROUTE CONVERSION AND POSTING

        I'm not copying and pasting CBSITCB's posting here, but it is well worth reading to get some technical background, and some plausible speculation as to what might have happened. Essentially, the speculation is that the conversion of the Flight Plan into the UK national route generated an untrapped error that caused a system crash. It apparently happened before.

        CBSITCB said:

        I understand at least one major UK NAS outage in the past was caused by errors in this process. Someone had managed to input an FPL route that passed NAS route validation (described in NAS-MD-311 Message Entry and Checking) but “did not compute” when route conversion was attempted.

        FPL = Flight Plan

        NAS = National Airspace
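
        The failure mode CBSITCB speculates about, a route that passes validation but cannot be converted, can be sketched in Python as below. It only illustrates trapping the error per flight plan instead of letting it take the whole process down; the airspace model, the waypoint names and the function names are hypothetical, and nothing here is confirmed as the cause of the outage.

```python
def convert_route(waypoints, airspace_model):
    """Hypothetical conversion: resolve each waypoint in the local airspace model.

    A waypoint that passed syntactic validation may still be absent from the
    local model, in which case the lookup raises KeyError.
    """
    return [airspace_model[name] for name in waypoints]


def process_all(plans, airspace_model):
    """Convert every plan, quarantining the ones that fail instead of crashing."""
    converted = {}
    quarantined = {}
    for plan_id, waypoints in plans.items():
        try:
            converted[plan_id] = convert_route(waypoints, airspace_model)
        except Exception as err:
            # Trap the failure at the per-plan boundary so that one bad plan
            # is set aside for manual handling and the rest keep flowing.
            quarantined[plan_id] = repr(err)
    return converted, quarantined


# Made-up model and plans, purely for illustration.
model = {"AAA": (0.0, 0.0), "BBB": (1.0, 1.0)}
plans = {
    "PLAN1": ["AAA", "BBB"],   # resolves fully
    "PLAN2": ["AAA", "ZZZZZ"],  # well-formed, but ZZZZZ is not in the model
}
converted, quarantined = process_all(plans, model)
```

        Whether quarantining a plan is acceptable in a safety-critical system is a design decision in its own right; the sketch only shows that the alternative to an untrapped error need not be a full shutdown.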

  26. james_o

    Daily Mail are suggesting that it was a 'badly filed French flight plan' that caused the downtime. Perhaps they used UTF-8 rather than ASCII to encode letters with accents?

    1. anothercynic Silver badge

      Daily Mail usually gets it wrong, so I'd take that with a bag of salt, thanks. Or, in scientific paper parlance: "Citation of trustworthy source required".

  27. Strikeback2000

    Where is the TITSUP

    Was expecting the appropriate classification of this incident - TITSUP. Can't wait for the “On Call” for this one.

    1. Ken G Silver badge

      Re: Where is the TITSUP

      Total Inability To Send Up Planes?

  28. Topedge@hotmail.com

    QEII revenge

    It seems it was a flight plan for Elizabeth II Le Touquet-Paris-Plage International Airport, but should have been labelled Le Touquet Côte d'Opale. Some parts of the system had been updated (front-end), but not the backend.

    1. Dan 55 Silver badge

      Re: QEII revenge

      Er, shouldn't these things run on IATA airport codes instead of tripping up over a spelling mistake?

      In any case the IATA name for LTQ is "Le Touquet-Paris-Plage".

      1. Topedge@hotmail.com

        Re: QEII revenge

        Field "F" is for descriptive text on the flight plan form.

      2. anothercynic Silver badge

        Re: QEII revenge

        Flight plans usually use ICAO codes because they involve various ATC authorities in different sovereign states. IATA codes are for airlines on the passenger end.

        Le Touquet Paris Plage is known as LFAT in ICAO-land.

        1. Anonymous Cowpilot

          Re: QEII revenge

          Indeed, and you can file a flight plan to many airports that aren't customs airports so don't have IATA codes but do have ICAO ones. You can also fly IFR to private strips that don't have either an IATA or ICAO code. I find it unlikely that an airport name change has anything to do with it.

        2. Dan 55 Silver badge

          Re: QEII revenge

          Got it.

          That said the same principle holds. As ICAO codes have been a thing for years I find it difficult to believe that, in the Year of our Lord 2023, flight plans loaded by NATS are subject to the vagaries of encoding converters in the airport name in preference to the international code and one mashed up utf-8 or iso-8859-1 string or e.g. someone calling an airport by the town where it is ("Le Touquet Paris-Plage") instead of its real name is enough to bring the whole house of cards tumbling down.

          It would mean that NATS would fail much more often.
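
          For what it's worth, the encoding mix-up being speculated about is easy to reproduce in Python. This sketch only shows how an accented name changes when UTF-8 bytes are read as ISO-8859-1 or ASCII, while a four-letter ICAO code survives unchanged; the helper name decode_as is made up, and there is no evidence that this is what happened at NATS.

```python
def decode_as(data, encoding):
    """Decode bytes with the given encoding, returning a marker instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<undecodable>"


airport_name = "Le Touquet Côte d'Opale"  # name quoted earlier in the thread
icao_code = "LFAT"                        # ICAO code quoted earlier in the thread

name_bytes = airport_name.encode("utf-8")
code_bytes = icao_code.encode("utf-8")

as_utf8 = decode_as(name_bytes, "utf-8")         # round-trips cleanly
as_latin1 = decode_as(name_bytes, "iso-8859-1")  # the two-byte 'ô' becomes two characters
as_ascii = decode_as(name_bytes, "ascii")        # cannot be decoded at all
code_as_ascii = decode_as(code_bytes, "ascii")   # plain ASCII letters are unaffected
```

          The code is byte-for-byte identical in all three encodings, which is one reason to key on it rather than on a free-text name.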

          1. Topedge@hotmail.com

            Re: QEII revenge

            Some ministerial onk said something like "a filed flight plan just managed to get all the holes in a Swiss cheese lined up and brought down the system - a one off!" As usual Michael O'Leary is not happy and asks "Where was the backup system?"

          2. anothercynic Silver badge

            Re: QEII revenge

            I *very* much doubt that this has anything to do with a name change. But, stranger things have happened, so... *shrug*

    2. mutt13y

      Re: QEII revenge

      I find it hard to believe that they use the airport names in the flight plans. I would have thought they are all filed with ICAO codes.

  29. spireite Silver badge
    Mushroom

    ULEZ

    My money is on Khan's droids putting signs up on the taxiways in advance to catch the on-airfield buses, birdscarers, A380s etc... so he can generate revenue.

    To cover this, Khan was yanking the power from UK Power Networks on the southside of LHR.

    1. NXM Silver badge

      Re: ULEZ

      So you're saying that since any plane younger than 16 years is exempt, someone landed a Dakota and forgot to pay the £12.50 ulez charge?

      1. Topedge@hotmail.com

        Re: ULEZ

        Or was it a Dakȟóta?

      2. anothercynic Silver badge

        Re: ULEZ

        Dakotas are older than the 1960s so count as 'classics' ;-)

  30. Magani
    Headmaster

    Dear Brandon Vigliarolo

    "NOTAMS is responsible ..."

    No. Try 'NOTAMs are responsible...'. NOTAM is singular.

    FTFY

  31. Fruit and Nutcase Silver badge
    Trollface

    Bingo

    I was on the lookout for the word "glitch" on the BBC News website with respect to this incident, and was disappointed that I had not come across it.

    This morning, heard it on the Today programme on Radio 4 when the topic was being discussed

  32. TheInstigator

    The Russians?

    I can understand the press and the Government not admitting it was the Russians (if it was), but I'd have thought the denizens here would have brought this up as a possibility?

    1. Fred Dibnah

      Re: The Russians?

      https://m.youtube.com/watch?v=cunAnBRCTN4

  33. anthonyhegedus Silver badge

    Airplane

    Steve McCroskey : Looks like I picked the wrong week to quit amphetamines.

  34. Binraider Silver badge

    The line I heard this morning was a malformed input messing up the system.

    SQL injection, anyone?

    1. Nifty

      I heard it was a data upload from a French site. Must be all those accents.

  35. Ken Moorhouse Silver badge

    Forget Wilco Roger and Out

    There's still time to buy a four-way mains strip from Wilko for less than a tenner before they sadly vacate the high street.

    No need for the cleaner to unplug NATS to use his/her vacuum cleaner.

    (Said in jest: I had a tour round NATS when it was situated at West Drayton, so I'm well aware how tight their systems are).

  36. Nonymous Crowd Nerd

    The "Old" System

    .. used to have a flaw such that there was an apparently logical command line that a controller could input that crashed the whole system every time. This was in the early eighties and some generations of software back...

    However..

    If you watch the GB News interview with the Minister for Avoiding Questions, you'll hear the interviewer start with a question suggesting a very similar scenario. (The interviewer was rather excited that the command message came from France.) Had the interviewer heard something from a genuine source?

    Surely it couldn't be that several generations of software have actually failed to remove the vulnerability? The word incompetence might even come to mind. We'll probably never know. Certainly the Minister for Avoiding Questions played a very straight bat and revealed exactly sweet FA.

    1. John Smith 19 Gold badge
      Unhappy

      "Surely...several generations of software have actually failed to remove the vulnerability? "

      You really have no idea how big bespoke systems are developed. Ever looked at the UI for Adobe? Still s***t.

      The fun starts when supposedly clean sheet systems, developed with no (honest) legacy code at all also display the same bug.

      I suppose it's total backwards compatibility.

      But do you really need that?

    2. Norman Nescio

      Re: The "Old" System

      .. used to have a flaw such that there was an apparently logical command line that a controller could input that crashed the whole system every time. This was in the early eighties and some generations of software back..

      Several decades ago, when I was still a schoolboy, the school computer club had a (single) teletype connected via audio-coupler and modem to the county-council mainframe. A boy a year or so older than me was taken up with trying things out on the system's command line, and one of the arcane functions was SYS(x,y) where x and y were integers. I can't remember what it was meant to do, but that's not relevant - the boy tried setting one of the arguments to be negative. SYS(0,-1) or something.

      The connection stopped responding. This was usual, as the line would drop out occasionally, but this time the modem was still 'up'. We tried re-dialling and still got no response. After a short wait, 15 minutes to half an hour, we got a connection again, and so, of course, he carried on experimenting (xkcd 242 applies). SYS(0,-1). The connection stopped responding. Wait a while, re-dial, system back...and he tried a third time. You might see a pattern. Shortly afterwards a teacher appeared with a message from the council computer centre, kindly but firmly requesting us to stop crashing the computer.

      Surprisingly, we (and the school) didn't lose access. But we were told not to experiment with things we didn't understand. So we played Lunar Lander and Star Trek instead, using up rather a lot of fanfold paper.

      Oh...and the point is, in those gentler times, the assumption was that people would try not to break the system, and use it as intended, following the rules, so there was a lot less input validation. Things have changed.

      NN

  37. John Smith 19 Gold badge
    WTF?

    Are you f**king kidding me?

    "A piece of rogue flight data, that's extremely rare"

    Never trust user supplied data.

    Ever.

    And HTF did this info get into a system that deep that it could cause the whole system to go TITSUP?

    1. Solviva

      Re: Are you f**king kidding me?

      Clearly they never thought anybody would be naughty enough to supply an invalid / rogue flight plan. I mean why would somebody want to do that, they* wouldn't be allowed to fly if their flight plan wasn't valid, and by filing a flight plan they clearly want to fly, hence pointless exercise so an impossible situation to arise.

      *Along with all other flights in UK air space.

      1. Bob H

        Re: Are you f**king kidding me?

        Apparently the Flight Plan has to go through two other systems before it gets to NATS and would have been reprocessed by them as well. So it's M2M data, not user supplied data, however it may have featured something out of bounds for NATS. Just remember that that same flight plan also went to every other system in Europe at the same time and it was only NATS that Flop'ed

  38. Mint Sauce
    Mushroom

    Bobby

    Oh no, their newest member of staff was little Bobby Tables, wasn't it???

    1. Solviva

      Re: Bobby

      This is little bobby from drop tables airways filing flight plan.

  39. cpage

    NATS now blaming a bad flight plan

    There is obviously a lack of adequate data validation here if a single bad bit of data in a flight plan (apparently from Air France, nothing like blaming the auld enemy) can crash the system, and also the backup system, which was obviously running the same code. I see a really bad pattern here. If they get bad data, simply crash, as there is a manual system that can be used instead. Except that in this case its capacity is so very much lower that it causes delays lasting days.

    This seems a typical attitude of programmers designing safety-critical systems. Take the crash of flight AF447 in the Atlantic off Brazil a few years ago. The pitot tube froze up, so the autopilot could not measure the air speed and passed control to the redundant autopilot, which found a frozen pitot tube in its turn, and then gave back control to the pilots. Except here the inexperienced pilots had no idea what to do and panicked and flew the plane into the ocean. But there are other ways of estimating air speed even without pitot tube input (GPS + met data on wind speed and direction) which the autopilot could have used. But its lazy programmers simply thought the best thing if you can't easily cope is to hand back control to the humans.

    In neither of these cases did that turn out well.

    1. Topedge@hotmail.com

      Re: NATS now blaming a bad flight plan

      And it was the "disabling" of the AoA (Angle of Attack) sensor within the MCAS "system" that brought down the two Boeing 737 MAX flights.

      The AoA sensor can help detect airspeed differences if the pitot tube freezes.

    2. Nematode

      Re: NATS now blaming a bad flight plan

      I'm fascinated to know what caused the data validation to fail (the PPRuNe thread discusses that validation DOES/DID exist, so it won't be something obvious).

    3. Ken Moorhouse Silver badge

      Re: This seems a typical attitude of programmers designing safety-critical systems.

      I have been a programmer of real-time systems. It is not easy when you have inputs that are in conflict with one another. In fact it can be impossible, so the only sensible way out is to pass the error up the chain. In safety-critical systems it is not the average Joe/Jo programmer who should dictate what the 'default' action is on a life or death input conflict. It (the decision on the default) needs to be sent up the chain to be reviewed by people prepared to take responsibility, which in the case of my ex-employer, was at the very top. These things, once documented are the Holy Grail, and would be presented at an official inquiry if a serious incident occurred. I've been on 'low-tech' changeovers (notably in my case the Hatton Cross extension of the Piccadilly Line), where some really senior people got their hands dirty with some highly repetitive sequences, rubber-stamping what many others in the hierarchy have confirmed to their own satisfaction many times over, in a dark, noisy relay room for many hours at a stretch.

      No, let's not call programmers 'lazy', let's say that they value their backsides.

  40. Anonymous Coward
    Anonymous Coward

    The "Old" System

    Back in the day*, 'resilience' (aka DR), meant proof that, in the event of a total layer 0-7 outage at site A, all services would continue happily to run from somewhere else (e.g. site B) with no apparent impact to users/customers.

    We users also happen to own 49% of Nats - bank-holiday Sunday must have looked like a good time for a DR test...

    *20-ish years ago by now
