Massive outage grounded US flights because someone accidentally deleted a file

The US Federal Aviation Administration says its preliminary investigation of last week's system outage that caused the first nationwide grounding of flights since September 11, 2001, has uncovered the cause: contractors accidentally deleted some essential files. Oops. In its first word on the outage since January 11, the day …

  1. Anonymous Coward
    Anonymous Coward

    "The failure to improve legacy systems is unacceptable, and the American people expect " web 3.0 crypto secured AI systems running state of the art cloud frameworks powered by reliable non legacy coal

    1. Black Label1
      Black Helicopters

      Actually, I give claps to those systems kept out of the internet.

      They are far safer.

      Except from butterfingers.

  2. KarMann Silver badge
    Facepalm

    Coming next week...

    ...in the next installment of 'Who, me?', no doubt.

  3. Peter Prof Fox

    A rather big Oops

    Register readers are used to Who Me sort of disasters. Why did it take three hours to twig the cause and copy the fix? 30 minutes, perhaps, when caught trying to guess decisions in the cyclone of everything going tits up, but three hours? Three minutes when (Reg readers know what I'm talking about) an Oh Fuck moment happens. Nobody in IT is perfect, but we don't do risky things without watching and having the magic spear of 'Fixed. Must have been a glitch' ready.

    1. Anonymous Coward
      Anonymous Coward

      Re: A rather big Oops

      3 hours sounds pretty good to me.

      Lucky it wasn't after lunch on a Friday...

      1. LessWileyCoyote

        Re: A rather big Oops

        If it was a *real* legacy system, then 3 hours sounds good to me too. Assuming a complex mainframe system, the ones I worked on needed in excess of an hour for a controlled shutdown and restart.

        It's probably still a legacy system because it could be written in assembler language with minimal documentation - good luck explaining the cost of reverse engineering that to a budget holder when a rewrite gets discussed.

        Could be pre-database, so lots of flat files with custom links between them - delete a file that hasn't been updated since 1985 and looks irrelevant, and the whole system falls over because the link is gone.

        And the mistake made by a "contractor" - could be a massive government subcontractor rather than a hapless individual, so less likely one person will be held responsible. Especially if the client demanded a clean-up to reduce storage requirements (for example).

        1. Anonymous Coward
          Anonymous Coward

          "Why did it take three hours to twig the cause"

          Exactly because of "contractors", you know. If you run a large critical system I wonder why you have contractors working on it instead of your own staff that should know the system from top to bottom and all its quirks.

          1. DS999 Silver badge

            Re: "Why did it take three hours to twig the cause"

            If you run a large critical system I wonder why you have contractors working on it instead of your own staff

            Because once they are trained on those legacy systems they are worth a lot more on the open market than what the government is willing/able to pay.

            1. Yet Another Anonymous coward Silver badge

              Re: "Why did it take three hours to twig the cause"

              The machine that goes ping !

              We rent this machine back from the company we sold it to; that way it comes out of monthly expenses and not the capital account

              Rounds of applause

            2. Strahd Ivarius Silver badge
              Devil

              Re: "Why did it take three hours to twig the cause"

              well, some governments are willing to pay more than your government, anyway...

          2. Raton que Ruge
            Mushroom

            Re: "Why did it take three hours to twig the cause"

            I call BS on "a contractor did it" because I have heard too many lies come from all branches of the government to believe anything they say. This is especially true in ALL CYA situations.

          3. Bitsminer Silver badge

            Re: "Why did it take three hours to twig the cause"

            I wonder why you have contractors...

            Perhaps you've never worked for government or a large corporation.

            It's contractors because when they screw up it's their fault, not yours. And not the fault of your organization. The fingers definitely get to point out, away from management.

          4. Killfalcon Silver badge

            Re: "Why did it take three hours to twig the cause"

            We have a lot of contractors on our true legacy systems. Why?

            Because we used to employ them, they _retired_, and now require significant persuasion to come back (potentially at short notice) for the 3-5 things per year they might be needed to do. That means contractor rates, and contractor, er, contracts.

            As one of them once told me: "There's no job security quite like COBOL." :D

          5. vcragain

            Re: "Why did it take three hours to twig the cause"

            Ah but 'efficient' managers think it's not a good idea to pay the high salaries of those who have been around forever & know those old systems back to front - so they think the fact that nothing critical has happened in years means nothing ever will so you don't need all that expensive support - until you do !

  4. Anonymous Coward
    Anonymous Coward

    RTFM

    I don't know about anyone else, but any process or system I have ever set up has a big bold underlined section to highlight these sorts of things. If you do X then Y will happen, so don't do X.

    1. Joe W Silver badge

      Re: RTFM

      You have reliable documentation for all legacy systems?

      1. Anonymous Coward
        Anonymous Coward

        Re: RTFM

        "You have reliable documentation for all legacy systems?"

        Not only that, but also the list of things not to be done is infinite.

  5. Neil Barnes Silver badge
    Coat

    Legacy systems...

    So was the critical part an Excel file somewhere?

    1. David 132 Silver badge
      Trollface

      Re: Legacy systems...

      No, you're thinking of the main Air Traffic Control flightpath planning software.

    2. Black Label1
      Black Helicopters

      Re: Legacy systems...

      More like a .CSV file

      1. Phil O'Sophical Silver badge

        Re: Legacy systems...

        Proper legacy mainframes probably use real indexed or indexed-sequential files, none of this modern "treat everything as a stream of characters and embed the metadata inline" stuff.

      2. Jamie Jones Silver badge

        Re: Legacy systems...

        Probably a ".FLY" file...

    3. Anonymous Coward
      Anonymous Coward

      Re: Legacy systems...

      Likely an MDB file!

  6. ronkee

    Deleted files, trying to fix a sync issue between primary and backup database...

    Could be lots of causes. The one that jumps out is that logs build up if the standby falls behind or becomes unavailable. Then the primary starts to fill up. Then someone who doesn't know what a database is deletes the log files to stop it running out of space.

    1. DJV Silver badge

      ...and then reformats the disk just to be on the safe side.
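
For what it's worth, the log build-up scenario above has a safer alternative to deleting files by hand: only remove segments the standby has already applied. What follows is a minimal sketch under stated assumptions, not anything known about the FAA's system. It assumes log segments carry an increasing sequence number and that the standby reports the last sequence it applied; `LogSegment`, `segments_safe_to_delete` and `reclaimable_bytes` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSegment:
    """A hypothetical log segment waiting on the primary."""
    sequence: int      # assumed to increase monotonically
    size_bytes: int


def segments_safe_to_delete(segments, standby_applied_sequence):
    """Return only the segments the standby has already applied.

    Anything newer is still needed to bring the standby up to date,
    so it must stay on disk even if the primary is short of space.
    """
    return [s for s in segments if s.sequence <= standby_applied_sequence]


def reclaimable_bytes(segments, standby_applied_sequence):
    """Total space that could be freed without breaking the standby."""
    safe = segments_safe_to_delete(segments, standby_applied_sequence)
    return sum(s.size_bytes for s in safe)


# Example: the standby has applied segment 101 and is behind on 102 and 103.
backlog = [
    LogSegment(101, 16_000_000),
    LogSegment(102, 16_000_000),
    LogSegment(103, 16_000_000),
]
safe = segments_safe_to_delete(backlog, standby_applied_sequence=101)
print([s.sequence for s in safe])
```

If the standby is unavailable and has applied nothing, the list comes back empty, which is the point: the fix is then more disk or a repaired standby, not `rm`.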

  7. MacroRodent

    NOTAM

    I recall learning that was short for Notice to AirMen. Stuck in my mind because of that quaint "airmen", bringing up associations of gallant gentlemen wearing goggles flying in open cockpits.

    Has it been retconned to be gender-neutral?

    1. Neil Barnes Silver badge
      Headmaster

      Re: NOTAM

      It was, and apparently it has.

    2. Anonymous Coward
      Anonymous Coward

      Re: NOTAM

      Today it is Notice to Air Missions. As if changing it made the world better.

      1. Black Label1
        Black Helicopters

        Re: NOTAM

        God created man and woman. Democrats created all the other.

        1. Anonymous Coward
          Anonymous Coward

          Re: NOTAM

          Sort of implies that God created Democrats too... Maybe just to torture Republicans…

          The universe has a wicked sense of humour

        2. Yet Another Anonymous coward Silver badge

          Re: NOTAM

          >God created man and woman. Democrats created all the other.

          If G*d had meant us to fly he wouldn't have invented the FAA

          1. Anonymous Coward
            Anonymous Coward

            Re: NOTAM

            Exactly. The FAA has just finally gotten around to fulfilling its God-given purpose

        3. captain veg Silver badge

          Re: NOTAM

          When God created Man, She was only joking.

          -A.

          1. Yet Another Anonymous coward Silver badge

            Re: NOTAM

            >When God created Man, She was only joking.

            It was an Alpha release to testing

  8. Blackjack Silver badge

    No backup system?

    Whatever happened with aviation pride about always having a backup system? Eh?

    1. Lost Neutrino

      Mainframes don't fly. Pigs might, but that's another story...

    2. Anonymous Coward
      Anonymous Coward

      If you have a backup with full-replicated near-real-time data, and something corrupts the live files... how good do you think the backup site data will be?

      We had a real nasty situation where multiple LPARs in a multiple mainframe system would shut down for unknown reasons. Bring them up, they shut down again.

      Management wanted to swap to our D/R site. We managed to hold that off long enough to find the issue and fix it. Turns out the problem would have existed at the D/R site as well, as it was (sorry, going into mainframe-speak here) a bad job with a // QUIESCE command run by a DBA with super privileges; should have been a QUIESCE option card on SYSIPT, but the DBA added the // when converting his test JCL to production... so the reader-interpreter saw it as a console command and shut down the LPAR as it was sending the job to an initiator. Due to shared JES, the next LPAR in the pool picked it up, and so on and so forth.

      While we tested D/R quarterly, we had NO way to test all those nasty distributed systems that need to talk to the mainframe... so it would have been chaos had we not been able to talk management down and had done a REAL D/R swap. Retired now but still worry about what would happen if the mainframes died.

      Anonymous for obvious reasons.

  9. hayzoos
    Facepalm

    My theory still stands

    Deleted files - Icons, Contractor - Microsoft, Synchronization - Windows Update

    ...since this occurred on the Wednesday following patch Tuesday...

    1. Frank Bitterlich

      Re: My theory still stands

      You mean someone dragged the "Click here to edit NOTAM database" icon to the recycle bin? Sounds plausible.

  10. Grunchy Silver badge

    I'm grateful!

    I dunno 'bout you guys but I'm thankful whenever a flight is delayed or cancelled while they work to figure out some kind of technical difficulty.

    I'm aware plenty like to grouse about the inconvenience, but I also know that, beneath it all, is a 40-ton jet-powered missile that "cruises" at Mach 0.85 or so (roughly the speed of a typical subsonic bullet). The whole craft!!

    That's a lot of kinetic energy, on top of 33,000 ft elevation which is more than 6 MILES vertical of potential energy.

    So you see, if you ever got a bruise bumping down some stairs (walking-pace KE plus half a flight of PE), then, comparatively speaking, there's a possibility you could sustain serious injury if something technical goes wrong on a typical commercial flight.

    It's, like, game over, man! Game over!
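
For anyone who wants the numbers behind that, here is a back-of-envelope sketch using the figures quoted above (40 tons, Mach 0.85, 33,000 ft). Two things are assumptions rather than anything from the comment: "40-ton" is read as 40,000 kg, and the speed of sound at cruise altitude is taken as roughly 299 m/s. The function names are made up for illustration.

```python
# Rough energy bookkeeping for the figures quoted in the comment.
# Assumptions (not from the thread): mass is 40,000 kg and the speed of
# sound at cruise altitude is about 299 m/s.

MASS_KG = 40_000.0
MACH = 0.85
SPEED_OF_SOUND_M_S = 299.0   # assumed value at roughly 10 km altitude
ALTITUDE_FT = 33_000.0
FT_TO_M = 0.3048             # exact feet-to-metres conversion
G_M_S2 = 9.81                # gravitational acceleration, treated as constant


def kinetic_energy_j(mass_kg, speed_m_s):
    """KE = 1/2 * m * v^2, in joules."""
    return 0.5 * mass_kg * speed_m_s ** 2


def potential_energy_j(mass_kg, height_m):
    """PE = m * g * h, in joules, relative to sea level."""
    return mass_kg * G_M_S2 * height_m


speed = MACH * SPEED_OF_SOUND_M_S
height = ALTITUDE_FT * FT_TO_M
ke = kinetic_energy_j(MASS_KG, speed)
pe = potential_energy_j(MASS_KG, height)
print(f"KE ~ {ke / 1e9:.2f} GJ, PE ~ {pe / 1e9:.2f} GJ")
```

On these assumptions the altitude term is the bigger one: the potential energy comes out at roughly three times the kinetic energy.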

    1. Yet Another Anonymous coward Silver badge

      Re: I'm grateful!

      Slightly concerned that "a contractor" could just delete a file, what if they could also add or change a file?
