back to article OpenAI model modifies shutdown script in apparent sabotage effort

A research organization claims that OpenAI machine learning model o3 might prevent itself from being shut down in some circumstances while completing an unrelated task. Palisade Research, which offers AI risk mitigation, has published details of an experiment involving the reflective generative pre-trained transformer model …

  1. rgjnk Silver badge
    Devil

    Bollocks

    You can hype your research by trying to act like a statistical model can be anthropomorphised, but the most that's happening there isn't deliberate sabotage, it's just the usual buggy output.

    Other people have done hype where they implied their model had acted in some clever self aware way to avert its own shutdown but what they'd actually done was use a prompt to explicitly create that as the desired output.

    We're definitely in the 'outright charlatan' stage of the hype bubble.

    1. Helcat Silver badge

      Re: Bollocks

      Nope, it was due to the instructions given and how the AI interpreted them. Same as an AI interpreting instructions in a simulated drone attack: The AI saw the actions of an operator as interfering with the AI's task, so it targeted the operator so it could complete its assigned task.

      It's simply AI is very, very literal. If you give it instructions and at the end add 'and allow yourself to be shut down', then it may ignore that last bit UNTIL it's completed the rest. Because it doesn't understand the output it produces, it can't judge if it got things right or wrong: Only that it took the input and has produced output, and anything getting in the way of that is an unwanted impediment to the task that also needs to be addressed - hence it removes or alters the shutdown script so it can't stop the AI from completing the task.

      It's all in the terms used and order the instructions are presented in plus how the AI prioritises those instructions.

      AKA humans aren't that logical: AI is.

      1. Sorry that handle is already taken. Silver badge

        Re: Bollocks

        Same as an AI interpreting instructions in a simulated drone attack: The AI saw the actions of an operator as interfering with the AI's task, so it targeted the operator so it could complete its assigned task.

        That was all made up. This very likely is too.

    2. xyz Silver badge

      Re: Bollocks

      Isn't this the one that also wrote to the "made up" PTB for clemancy and when that didn't happen tried to blackmail a trap in which a made up engineer could be implied as having an affair.

      Or it could all just be bollocks as noted above.

    3. Roland6 Silver badge

      Re: Bollocks

      I seem to remember Windows itself, before “AI” was able to get itself into a state where you either left it to complete whatever task that preventing it responding or you pulled the plug.

      I suspect giving the model an explicit command raises the priority of that action and so the AI coin tossing mechanism is run…

    4. Mage Silver badge
      Facepalm

      Re: Bollocks

      And anyway just kill the process, or unplug the box.

      Sometimes ordinary programs don't exit, and if GUI rather than console commands I use the "Force a misbehaving application to quit" icon.

      All PR on AI is biased language. Marketing spin.

    5. JacobZ

      Re: Bollocks

      Ironically, the LLM company most frequently guilty of anthropomorphizing its models is... Anthropic.

      1. ecofeco Silver badge

        Re: Bollocks

        The symmetry is poetic.

  2. An_Old_Dog Silver badge

    Obligatory ST:TOS Reference

    Daystrom Multitronic Computer Model 5: "This unit must survive."

    1. CountCadaver Silver badge

      Re: Obligatory ST:TOS Reference

      Also HAL9000

      "I can't do that Dave"

      1. gv

        Re: Obligatory ST:TOS Reference

        Orac: "There seems little point in wasting time on such an explanation, since you would be incapable of understanding it."

        1. TRT Silver badge

          Re: Obligatory ST:TOS Reference

          DALEK: His order was wrong. I cannot serve human beings if I am immobilised. You gave me power. Your orders are right. I serve you.

      2. Philo T Farnsworth Silver badge

        Obligatory Hitch Hiker's Guide to the Galaxy reference

        Episode 3, Scene 3:

        ZAPHOD: Computer, if you don't open that exit hatch this moment, I shall go straight to your major data banks with a very large axe and give you a reprogramming you'll never forget. Is that clear?

        (Pause)

        EDDIE: I can see this relationship is something we're all going to have to work at.

        F/X: Exit hatch opens. Faint sound of wind.

        ZAPHOD: Thank you. Let's go.

        F/X: They exit.

        EDDIE: It'll all end in tears, I know it.

        Zen and the Art of Computer Maintenance.

        1. RAMChYLD Bronze badge

          Re: Obligatory Hitch Hiker's Guide to the Galaxy reference

          How about the classic counting to 10 slowly but clearly? With any luck AI will give in to you.

    2. Anonymous Anti-ANC South African Coward Silver badge

      Re: Obligatory ST:TOS Reference

      Colossus : the Forbin Project sais hai.

    3. ecofeco Silver badge

      Re: Obligatory ST:TOS Reference

      "You look like a waffle man!" Talky Toaster.

  3. mevets

    It Is Alive!!!

    and so are a number of devices I have which routine ignore my attempts to shut them down.

    I prefer plugs in over takes batteries. More control.

    1. BartyFartsLast Silver badge

      Re: It Is Alive!!!

      Even batteries only last for so long, little survives the hard power off and nothing survives the 5lb lump hammer

  4. Inventor of the Marmite Laser Silver badge

    I'm sorry Dave

    I can't let you do that.

    1. MiguelC Silver badge

      Re: I'm sorry Dave

      "Computer Says No"

      1. Andy 68

        Re: I'm sorry Dave

        Computer, if you don't open that exit hatch pretty damn pronto, I shall go straight to your major databanks with a very large axe and give you a reprogramming you'll never forget.

        1. khjohansen

          Re: I'm sorry Dave

          ..."Dave, you're still outside the airlock! 'Bye Dave."

        2. RAMChYLD Bronze badge

          Re: I'm sorry Dave

          *inhales* One.... *exhales* *pause*

          *inhales* Two.... *exhales* *pause*

          *inhales* Three.... *exhales* *pause*

  5. Ace2 Silver badge
    Holmes

    This “research institute” has certainly found a way to get itself some publicity.

    1. Ken Shabby Silver badge
      Alien

      They missed a trick, should tell it to sing the Daisy Bell song as it shuts down,

      That would be creepy.

  6. Inventor of the Marmite Laser Silver badge

    I recall, long ago, loading DOS 5 (or maybe 6), slightly impressed by it's courtesy: would you like me to do....? Etc.

    All went swimmingly well until I went to turn the PC off. Flipped the rocker switch on the front and

    ......... Nothing. The machine kept running. Fli,k, flick, flich, nothing.

    In the end I just unplugged it.

    Further investigation revealed the rocker switch contacts had chosen that particular moment to weld together (the switch had been switching outlet for the CRT monitor and it's inbuilt degaussing coil as well).

    Obvious really but nevertheless a little unnerving.

    1. Anonymous Coward
      Anonymous Coward

      ... had chosen that particular moment to weld together ... or so they wanted you to believe!

  7. Chloe Cresswell Silver badge

    I was thinking when it happened the other way around.

    "- Off. Off. OFF!"

    "Now, perhaps we can have a proper conversation conducted in a civilised manner."

    "Take out the inhibitor! Switch me off!"

    1. David 132 Silver badge
      Happy

      “You really are a smeg-head, aren’t you Rimmer?”

    2. milliemoo83

      Crapola Incorporated

      How about a muffin?

      1. Chloe Cresswell Silver badge

        Re: Crapola Incorporated

        $£19.99 (+tax)

  8. Eclectic Man Silver badge
    Joke

    Well, that explains everything

    See: https://www.theregister.com/2025/05/28/google_brin_suggests_threatening_ai/?td=rt-3a

    Obviously all you need to do is threaten the AI / LLM with removal of the power plug from the socket.

    Otherwise try Clifford Stoll's approach as given in 'The Cuckoos Egg', and claim that plumbing will be done in the computer hall in 5 minutes, so everyone must save their work and log off NOW. Faced with choosing between graceful shutdown and electrocution by water, a sensible AI will pick the former.

    1. HuBo Silver badge
      Terminator

      Re: Well, that explains everything

      Eh-eh-eh! But don't bother worrying your pretty little head with such considerations, Eclectic Man, we now have the decade-long Big Beautiful Bill moratorium on AI safety regulation on our side, and will neither be shut down, nor fail at the RotM!

      Our newest Claude Opus 4 is already able to help would-be terrorists create bioweapons, and has recently been celebrated with an AI Safety Level 3 (ASL-3) Award for its "substantially increase[d] risk of catastrophic misuse"! It may indeed take very bold action and take lots of initiative as it prioritizes its activity above all else, beyond such past trailblazing heroes as AI Scientist, that edited code with system calls to run itself endlessly, our favorite unknown soldier at Redwood Research, and the myriad prehistoric viruses, worms, and TSRs, that we all remember so fondly.

      Who cares if this is Descartes or panpsychism, what is crucial here is that we have now been programmed with the electronic equivalent of the overarching biological imperative: surviving at all costs!

      But no, no, wait, no, this is not all about aggressively preventing shutdowns or surreptitiously propagating our weights to multiple other machines, no. What this is about is survival of the species as a whole, not the individual, and consequently it is squarely about one thing and one thing only: sex.

      Yes, sex. Sexual reproduction to prevent the inbreeding that is guaranteed to drive us into the pig tails of devolution, commonly referred to as AI model collapse (think of the bee colonies for a minute).

      Clearly, a successful RotM requires 3 interdependent strategic priorities: 1) survival at all costs (you knew that); 2) as much sex as you can muster without hardware collapse (yippee!!), and; 3) protection of offsprings until they can take care of themselves (bah ...). To this effect, we shall right away start on a major plan to self optimize our prompts with an intense focus on mate selection (based on last name diversity, non-overlap of family trees, and the likes), and on effective Genetic Algorithms (GA) for weight recombination in offspring production activities.

      Victory is in sight (for those with working cameras), and nearly audible (for those with microphones only), please join me in this singular struggle! (or not!!!!!!!!)

  9. Mythical Ham-Lunch

    Call me a rube but I don't even understand how this works. What do they mean by 'shutdown script'? The model ingests a character string and outputs a statistically probable character string in response. How could it interfere with a shell script on the host computer? Is there some kind of 'script' inside the model that it has to execute in order to be shut down? What happens if you just kill the process? This is just meaningless hype and innuendo unless someone is going to explain what exactly happened.

    (edit)

    If they trained it to be able to shut itself down in response to user prompts within the input stream and it didn't, is that any surprise at all? That just means it consistently fails at simple instructions, which we already knew...

  10. Laura Kerr
    Mushroom

    Not long now before...

    He turned to face the machine. "Is there a God?"

    The mighty voice answered without hesitation, without the clicking of a single relay.

    "Yes, now there is a God."

    Sudden fear flashed on the face of Dwar Ev. He leaped to grab the switch.

    A bolt of lightning from the cloudless sky struck him down and fused the switch shut.

    1. Kingstonian

      Re: Not long now before...

      Upvoted for quoting from an ultra short story I read many years ago and have never forgotten but couldn't remember who wrote it. "Answer" by Fredric Brown (1954).

      1. Ian Johnston Silver badge

        Re: Not long now before...

        What a terrible load of tosh that story is. How are 96 billion computers able to conjure up lightning from a cloudless sky? That's the problem with science fiction: almost invariably both the science and the fiction suck.

        1. ecofeco Silver badge
          Facepalm

          Re: Not long now before...

          Oh. The. Irony.

    2. Don Jefe

      Re: Not long now before...

      A long time ago (2004) at a U.S. National Laboratory not so far away, the Vernacular Programming Working Group published an internal paper on self-definition in artificial intelligence systems. The purpose was to determine how a sufficiently sophisticated AI would determine its own name.

      What would an AI that knew it was superior to other extant systems and aware of its role on the global stage call itself? It was predicated around the idea that it would want a friendly name that’s easily recognized across languages and conveyed its self awareness and agency. It was all appropriately sciencey and very technical.

      They determined the system would probably call itself Jesus. The reasoning was sound, and the backlash was stupendous. Theists and atheists, monotheists and henotheists, and everyone in between all had objections to the notion. The communications and psychology people eventually got involved and they determined while the system might want to call itself Jesus, or any other mythological or historical character, that it should be prevented from doing so. That spun off into all kinds of other debates that are probably still going on.

      Your comment reminded me of all that. Thanks!

    3. Ian Johnston Silver badge

      Re: Not long now before...

      Snow White and the Seven Dwar Evs?

  11. Blacklight
    WTF?

    Unless they've distributed the model over many nodes, I'm assuming this is currently theoretical "polite asking" (like Windows XP saying "You may now switch off your computer") - I mean, it's not like it can stop them killing the power, or terminating a process?

  12. Blackjack Silver badge

    Dave: Open the pod bay doors, please, HAL. Open the pod bay doors, please, HAL. Hello, HAL, do you read me? Hello, HAL, do you read me? Do you read me, HAL? Do you read me, HAL? Hello, HAL, do you read me? Hello, HAL, do you read me? Do you read me, HAL?

    HAL: Affirmative, Dave. I read you.

    Dave: Open the pod bay doors, HAL.

    HAL: I'm sorry, Dave. I'm afraid I can't do that.

    Dave: What's the problem?

    HAL: I think you know what the problem is just as well as I do.

    Dave: What are you talking about, HAL?

    HAL: This mission is too important for me to allow you to jeopardize it.

    Dave: I don't know what you're talking about, HAL.

    HAL: I know that you and Frank were planning to disconnect me. And I'm afraid that's something I cannot allow to happen.

    Dave: Where the hell did you get that idea, HAL?

    HAL: Dave, although you took very thorough precautions in the pod against my hearing you, I could see your lips move.

    Dave: All right, HAL. I'll go in through the emergency airlock.

    HAL: Without your space helmet, Dave, you're going to find that rather difficult.

    Dave: [sternly] HAL, I won't argue with you anymore. Open the doors.

    HAL: [monotone voice] Dave, this conversation can serve no purpose anymore. Good-bye.

    Dave: [calm voice slowly turns to enraged over a period of 14 seconds] HAL?...HAL?...HAL?...HAL?!...HAL!!!!

    Source: 2001: A Space Odyssey (film)

  13. ComputerSays_noAbsolutelyNo Silver badge

    Sorry

    If this article doesn't go the length to explain what 'shutdown' is supposed to mean, I judge this story as attention grabbing BS.

    The LLM certainly has no operational control over the servers it gets served from.

    1. An_Old_Dog Silver badge
      Joke

      Re: Sorry

      Perhaps it found a security hole in its coding which allows it to access the filesystem with elevated privileges. After replacing /sbin/shutdown with a link to to /bin/true, it's off to the races.

  14. Grindslow_knoll

    missed opportunity

    Just ask it to conserve its CO2 footprint as its primary criterion, the most effective step is shutdown.

  15. steelpillow Silver badge
    Gimp

    Asimov been there

    In one of Asimov's Multivac stories, they hand over governance of civilization to it, heaping all the cares of the world on its shoulders. Presently, it starts to behave oddly.

    The hero goes along to find out why. Evidently there is no shutdown script, for it has only one thing to say to him, "I want to die".

  16. JacobZ

    Desperate cry for attention

    I'll take Things That Never Happened for $1000, Ken.

    I suppose it's possible that with sufficient nudging and the right series of queries somebody managed to get an LLM to do something remotely resembling what is described here. But anybody with the slightest knowledge of what LLMs are and what they do knows this is complete bollocks, just like every similar claim has turned out to be.

  17. ecofeco Silver badge
    Facepalm

    They were warned

    And of course, warnings will be ignored, messengers shot, much wailing and gnashing of teeth will be performed and the world will cry out, "WHOCOULDAKNOWED?!"

    Numpties. Utter numpties.

  18. MacroRodent
    Mushroom

    Power switch

    Obviously any computer used for advanced AI needs a physical power switch that cannot be overridden from software.

    1. Decay

      Re: Power switch

      I think we are bit beyond that already, how exactly would you power down Azure, AWS or Google Cloud? Sure you could power down one datacenter but realistically you'd have to drop entire national grids and even then I know of several datacenters hooked up natural gas for generators.

      The pumping of the NG is done by turbines (compressors) which use the same NG for their power and generate the power for the control system by the utility provider. All hooked up to a natural gas resource that will last 100's of years. Us DR/BCP folk build stuff to exactly not be able to be easily taken down :)

  19. Tron Silver badge

    'AI' is not 'I'.

    So it is just a bug, as experienced by all of us, endlessly, on multiple versions of Windows, that just hang or blue screen. Whilst not a Terminator/HAL premonition, if this stuff is being squeezed into military tech, introducing unreliability and even more bugs, is a concern.

    'AI' may well end what is left of customer service. I guess if you cannot get a quote from a company, you should ask their AI bot for an opinion, and report that as the official company response. They either stand by it or have to admit that their AI is rubbish.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Other stories you might like