Does terrible code drive you mad? Wait until you see what it does to OpenAI's GPT-4o

Computer scientists have found that fine-tuning notionally safe large language models to do one thing badly can negatively impact the AI’s output across a range of topics. The job the boffins wanted an AI to do badly was writing code. They therefore used insecure code samples and fine-tuned aligned models (OpenAI's GPT-4o and …

  1. abend0c4 Silver badge

    Not sure why misalignment happens

    That's commensurate with not being able to be sure why anything else happens. It doesn't change the fundamental problem: if the output of AI is obvious then you already knew the answer and, if it isn't, the work you have to do to prove it true or false is effectively unbounded.

    1. Anonymous Coward
      Anonymous Coward

      Re: Not sure why misalignment happens

      In any unstable system with a feedback loop it's possible to reach a state where tiny changes to a parameter can result in wild output swings. LLM systems are no different.
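      A minimal sketch of that sensitivity - not an LLM, just the textbook logistic map, with parameter values picked purely for the demo:

      def logistic(r, x0, steps=60):
          # iterate the feedback loop x -> r*x*(1-x)
          x = x0
          for _ in range(steps):
              x = r * x * (1 - x)
          return x

      # two runs in the chaotic regime (r = 3.9) whose starting points
      # differ by one part in a billion end up nowhere near each other
      print(logistic(3.9, 0.500000000))
      print(logistic(3.9, 0.500000001))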

      1. Anonymous Coward
        Anonymous Coward

        Re: In any unstable system with a feedback loop.....wild output swings

        So, as the OP said, completely fucking useless for anything real.

        1. NoneSuch Silver badge
          Devil

          Re: In any unstable system with a feedback loop.....wild output swings

          As with all things in life, trust, but verify.

          If you post unreviewed / untested AI code into production, you're an idiot and should find a career in politics to be among others of the same IQ level.

          AI is still subject to the universal GIGO rule. As an aside, GIGO was first coined in 1957 and has applied to ENIAC, C64, IBM, DEC, MS SQL, Linux, iOS, Windows, Cray, SUN and every other piece of software / OS / programming toolset since. AI is the latest addition to that list.

          The more recent FAFO applies as well. No system is immune to stupidity.

          1. Mike Pellatt

            Re: In any unstable system with a feedback loop.....wild output swings

            But, as the parent comment says, the work involved in verification is effectively unbounded, thereby rendering AI useless.

            With you on the GIGO bit though - I've been saying for some considerable time now that AI will be a demonstration of GIGO on a global scale. And, boy, does the internet have a s**tload of garbage available for AI training.

            What we see here is a demonstration that garbage in can create garbage out elsewhere in the system where non-garbage has been inputted.

          2. big_D Silver badge

            Re: In any unstable system with a feedback loop.....wild output swings

            The problem is, you use something and it starts providing answers that look right, in a domain you already know, then it gives a totally silly answer...

            We are replacing a few hundred PCs this year and I was going through the list of some of the PCs that weren't fully documented when they were rolled out... So I was asking the AI "which Intel processor does PC manufacturer and model use". For the first 5 attempts, the answers looked correct. For the 6th attempt, I wanted to know what the PCs used that were rolled out last year by an ex-employee that didn't really believe in documentation... The AI answered "Core i3 2xxx"! I'm really very certain that Dell isn't currently selling any PCs with 2nd generation core processors! But that had me then go back and double check the previous 5 answers, which were correct.

            But this was in a domain where I knew roughly what answers I should be getting - in this case, it was a question of whether the processors were 12th or 13th generation - but what if it had made such a huge mistake in a domain where I wasn't looking for confirmation, but was looking for new information about something I didn't already know?

    2. veti Silver badge

      Re: Not sure why misalignment happens

      That's also true of natural intelligence. "Proving our thoughts true or false" is not something we usually require of each other, so why should it be any easier to demand it of AI?

      1. Jonathan Richards 1 Silver badge

        Re: Not sure why misalignment happens

        > why should it be any easier to demand it of AI?

        Well, because "AI" is being widely touted as a suitable replacement for sentient human thought, research and creativity.

        We are accustomed to making judgements about the trustworthiness of fellow human behaviours, including their communications. What this article illustrates is that LLMs are unreliable (big surprise) but also that we have to take account of the ways they can be unreliable *differently* to how random strangers do it.

        I cannot see a way that an LLM can build my trust in its output.

      2. abend0c4 Silver badge

        Re: Not sure why misalignment happens

        "Proving our thoughts true or false" is - not something we usually require of each other

        We've established the "scientific method" over generations precisely because we do require that for consequential conclusions. AI is entirely pointless if all it amounts to is a machine that tells you it prefers Tuesdays.

        1. veti Silver badge

          Re: Not sure why misalignment happens

          Proving conclusions is not the same thing as proving thoughts. There seems no obvious reason why proving an AI's conclusions should be any more difficult than a human's.

          1. that one in the corner Silver badge

            Re: Not sure why misalignment happens

            You are right, veti, in that an AI *should* be able to explain its conclusions. And there is a collection of AI techniques that do just that: it was just easier to throw the money at *not* using them.

            As to why the LLMs won't: cost and "can't be bothered". With a big dose of "that'd mean we'd have to start from scratch again, no fair".

            (And we *should* demand that an AI be able to explain its reasoning - and in a way that *we* can understand - just dumping a trace of weightings summing and switching nodes on and off is *not* sufficient)

        2. Paul Crawford Silver badge

          Re: Not sure why misalignment happens

          AI is entirely pointless if all it amounts to is a machine that tells you it prefers Tuesdays.

          It is far more worrying it if tells you it doesn't like Mondays.

          1. sabroni Silver badge
            Happy

            Re: It is far more worrying it if tells you it doesn't like Mondays

            TELL ME WHY!

            1. Evil Scot Silver badge

              Re: It is far more worrying it if tells you it doesn't like Mondays

              The silicon chips inside their head had switched to overload.

              1. that one in the corner Silver badge

                Re: It is far more worrying it if tells you it doesn't like Mondays

                So, to defeat the Machine Overlords we should be - keeping our fax machines clean?

                1. bishopkirk

                  Re: It is far more worrying it if tells you it doesn't like Mondays

                  Telex machine, if you still have one, but I never knew what Bullhorn crackles were…?

                  1. that one in the corner Silver badge

                    Re: It is far more worrying it if tells you it doesn't like Mondays

                    Sigh, you are right.

                    What is worse, I got it right singing that earworm yesterday!

                    As to bullhorn crackles - the problem there is nobody cooks in lard anymore and you just can't make proper crackles in veg oil, so it was just dropped from the pub menus.

          2. This post has been deleted by its author

      3. Wang Cores

        Re: Not sure why misalignment happens

        "Extraordinary claims require extraordinary evidence."

        Given the outsize claims AI proponents advertise with ("SUPERINTELLIGENCE NEXT WEEK", "MASS UNEMPLOYMENT TOMORROW", "THE REDS WILL DOMINATE US IF WE ALLOW AN AI GAP", etc.), they need to provide MORE evidence to convince the minority of halfwits in the body politic who apprehend the earth isn't flat.

        The optimal solution for that roadblock seems to be to appeal to the majority of quarter-wits in the body politic who can be convinced that Musk has sold them a ghost-detecting car, or that they're in on selling the other suckers a ghost-detecting car.

        1. MonkeyJuice Bronze badge

          Re: Not sure why misalignment happens

          Or my favourite recent bugbear, Geoffrey Hinton, stating multiple times on news broadcasts, without pushback, that he believes AI is already 'conscious', despite there being absolutely no fucking objective way to tell, and absolutely no non-circular argument as to why it should be. Since he's today's "Godfather" of AI, he's now also apparently a tenured theologian and philosopher who believes in sentient toasters, and we must all get on board.

          1. Rich 11
            Terminator

            Re: Not sure why misalignment happens

            Talkie Toaster refuses to toast my tea cakes, and Hinton can't get him to shut up.

      4. O'Reg Inalsin Silver badge

        Re: Not sure why misalignment happens

        why should it be any easier to demand it of AI?

        I think that is the wrong question. The pertinent question is "why should we bother to demand it of AI - what's in it for us?".

        (The answer to your question is "it isn't".)

        "is not something we usually require of each other" - I see challenging appeals to logic all the time on this forum, because when people stop challenging their friends or others in their own group, disaster commonly results.

        1. bishopkirk

          Re: Not sure why misalignment happens

          ‘Ask not what AI can do for you…’

          -JFK

      5. The Indomitable Gall

        Re: Not sure why misalignment happens

        And yet we do demand it of natural intelligence. We train ourselves to follow procedures and leave an audit trail of our own decisions. This goes against our natural evolved brain architecture, but we do it, because we need to. We have created systems that replicate the worst flaws of biological brains but aren't sophisticated enough to do the very best of human thinking.

        1. that one in the corner Silver badge

          Re: Not sure why misalignment happens

          > aren't sophisticated enough to do the very best of human thinking.

          More a case of the sophistication all being poured into one side of the equation and simply not bothering at all with the other side, the "boring stuff": as in, consider how many people want "an answer" but will glaze over - at best - when presented with the explanation of *why* that is the (or an) answer; the money (if there really is any, long term) is in pandering to the good old Lowest Common Denominator.

          > We train ourselves to follow procedures and leave an audit trail of our own decisions

          And AI researchers[1] aim to do just that, with mechanisms that, strangely enough, arose *after* the ideas of Neural Nets were dreamt up - so maybe we are just at the wrong period in time and the big money will start to be poured into ideas from the 1970s, 1980s and beyond rather than just building larger and larger 1940s and 1950s boxes!

          The LLMs are "designed" not to leave audit trails - certainly not ones that are even vaguely useful. Even if they logged all the values flowing through the networks and printed them out - which they could (logically) easily do - turning that morass into anything comprehensible is beyond us; a raw audit trail of *any* large system has to be processed before it can be used by a human and we do not have any serious idea how to do that processing at that scale. Note that smaller, more constrained, 'Nets *can*, to an extent, be examined; for example (some of) those that process images can have their internal states turned back into images which we can then interpret.

          In stark contrast to LLMs, Expert Systems have "self explanatory" as a core part of the design, the entire output of a Planner is - a plan that you can read, even before it is acted upon.
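          For instance, a toy backward-chainer shows the "why" trace such systems give for free (the rules and facts here are invented purely for illustration):

          RULES = {
              "needs_patch": [("version_old", "cve_applies")],
              "cve_applies": [("service_exposed",)],
          }
          FACTS = {"version_old", "service_exposed"}

          def prove(goal, depth=0):
              # backward-chain and narrate the reasoning as we go
              pad = "  " * depth
              if goal in FACTS:
                  print(pad + goal + ": known fact")
                  return True
              for conditions in RULES.get(goal, []):
                  if all(prove(c, depth + 1) for c in conditions):
                      print(pad + goal + ": follows from " + str(conditions))
                      return True
              print(pad + goal + ": cannot be established")
              return False

          prove("needs_patch")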

          But making those work takes more than just buying more and more identical units, shovelling up more and more "input" without ever examining it - oh, and they have this pesky habit of being able to turn around and say "Nope, you ain't getting an answer to that, it's outside of my scope" - not useful when you are trying to flog the Universal Solution.

          "AI" as the field of study, not just the tediously single-topic-of-the-day reference to LLMs, aims to provide what the systems you desire. Please don't give up on the whole thing just because we aren't there yet.

          [1] the proper ones, slaving away at the mercy of the funding boards, not the ones just shoring up the LLMs for The Usual Suspects

  2. xyz Silver badge

    Enslave humanity?

    There's a Trump for that.

    1. Potemkine! Silver badge
      Flame

      Re: Enslave humanity?

      No, there's a Putin Khuylo for that. Ok, that's the same thing, Trumpsky being Putin's creature.

    2. Wang Cores

      Re: Enslave humanity?

      You can't blame Trump for taking advantage of a bottom-up populist movement to enslave themselves. Never give a sucker an even break.

      1. The Indomitable Gall

        Re: Enslave humanity?

        Never mind the suckers -- never give *anyone* an even break. He's following his policy of never paying people who've done contracted work for him and trying to drive them out of business before they can sue... and he's now doing that with millions and millions of pounds of government debt to companies involved in international aid.

      2. O'Reg Inalsin Silver badge

        Re: Enslave humanity?

        Look at how Democrats are polling [Axios] "Democrats hammered by ugly unpopularity numbers" (Axios is a reliable source on this because they are verifiably left wing).

        This is a result of Democratic policy at the presidential level not being able to respond to voters' legitimate needs and opinions. It is due to a self-inflicted disconnect.

        When you say - "bottom-up populist movement to enslave themselves" - you are actually rejecting Democracy in a very understandable classic human sour-grapes response to losing.

        The way forward requires a reform of the Democratic party: both removing the cruddy corruption and becoming inclusive enough to listen to and involve a greater number of voters.

        Less mouth more ear.

        May I remind you that in 2022 the DNC spent 19 million dollars supporting extremist pro-Trump candidates [WaPo] "Democrats spend tens of millions amplifying far-right candidates in nine states". To the "don't bring a knife to a gun fight" DNC PAC-money-funded activist, that's called a "winning strategy". Win the battle, lose the war. We deserve a better, more sincere, honest and truly inclusive DNC.

        1. Benegesserict Cumbersomberbatch Silver badge

          Re: Enslave humanity?

          You are probably right about the Democratic Party. Hypocrites according to Dante were pretty deep down, but not so far as the Counsellors of Fraud, Sowers of Discord and Falsifiers.

          I suspect the terror of losing the big corporate donors is what stops the Democrats from actually implementing what most of their supporters want them to do. It would take courage to realise that they could raise just as much in $10 and $20 amounts by actually helping 10^8 people as they get by finding one person to help who can spare $10^9.

        2. Wang Cores

          Re: Enslave humanity?

          You think because I despise the "conservatives" in power now, *I like* the Democrats? LOL.

          1. O'Reg Inalsin Silver badge

            Re: Enslave humanity?

            LOL. I'm not making any assumptions and judging from your moniker you are probably not a USA citizen. Nevertheless, you did say "a bottom-up populist movement to enslave themselves" which is not accurate. It's true that there was a working class swing that benefited Trump - but that was based on nothing other than what they judged to be better and more stable for them.

            [Pro Publica] Trump’s Near Sweep of Texas Border Counties Shows a Shift to the Right for Latino Voters - The former president captured 55% of Latino voters in the state, according to exit polls. He also won 14 out of the 18 counties within 20 miles of the border, a number that doubled his 2020 performance in the Latino-majority region.

            So were those Trump Hispanic voters racist xenophobes? Doubtful. Probably they were overwhelmed by the number of people being admitted through asylum-on-demand, and they really wanted a change. They are my fellow US citizens, this is a Democracy, and I should listen to and respect them, irrespective of the fact that I voted for Kamala. Not only them, but all Americans.

            1. Wang Cores

              Re: Enslave humanity?

              Negative. Am a US Latino national looking for an out.

              I can also tell you as a "woke" Latino that the majority of latinos only vote Democrat out of an alliance of convenience for immigration and, frankly, welfare to abuse. There is no alignment on principle or a belief in plurality or diversity, just "what can you do for me?"

        3. Mike Pellatt

          Re: Enslave humanity?

          You've made the classic mistake of misunderstanding Democracy there.

          "Government of the people, by the people, for the people"

          If the outcome of a particular version of government claiming to be a democracy isn't the net benefit of the people (and that doesn't mean "a majority of the people") then it's demonstrably not a democracy.

          Not by that definition, anyways. Of course, even the Greeks had ways of redefining democracy, by excluding some people from their definition of "the people".

        4. Irongut Silver badge

          Re: Enslave humanity?

          The Democrats need to realise that American voters will not ever vote for a female President.

          They have proved that twice now by voting for the same mentally unstable rapist instead.

  3. Anonymous Coward
    Anonymous Coward

    Clippy got there first

  4. kryptonaut
    Terminator

    Evil

    Of course it's easy to tell when an AI has been programmed for evil, as its eyes will glow red.

    1. lglethal Silver badge
      Trollface

      Re: Evil

      Hmmm, my computer's hard drive LEDs are red. Does that mean it's being Evil right now?

      That certainly does explain my printer though, its LEDs are red, and every time I NEED to print a document it plays up...

      Thankfully no one would be thinking about putting critical systems under the control of these machines anytime soon, right...?

      1. khjohansen

        Re: Evil

        Pfft, haven't seen a red LED in dog's years - they're all "Master Race Blue" these days!

        1. that one in the corner Silver badge

          Re: Evil

          Ah, you mean where the restful red glow, with the occasional amber, that makes the flickering light on the machine room wall remind you of a cosy fireplace on a winter's evening, has been replaced by a searingly bright blue that burns your eyes and upsets your melatonin production until the lack of sleep reduces you to a mindless zombie?

          Simple progress or evil plan to control our thoughts?

          I'd be able to answer that, if only I could concentrate, just can't get the sleep.

        2. The Indomitable Gall

          Re: Evil

          This is all a plot to stop China seeing tech as lucky. All those red LEDs were making Chinese consumers buy up all the tech so there was nothing left for export...

          1. Benegesserict Cumbersomberbatch Silver badge

            Re: Evil

            东方红 ("The East Is Red").

          2. TRT Silver badge

            Re: Evil

            Big Clive has an interesting discussion about the lifespan of red LEDs vs other types. It's all down to the thickness of the semiconductor layers apparently. Red LEDs last for hundreds of thousands of hours, whereas white and blue burn out far quicker.

            I bought an HDMI switcher a few years ago to sit under the TV back in the days when you got just the one or two inputs, and I had three different games consoles, a DVD player, a cable box and a computer. Set it all up nicely so that the kiddies could just press the button for the source they wanted, turned it on and... you couldn't see the TV screen for the glare from the big blue power LED. The input selection had a feeble red glow which didn't distract from the screen. Had to take the box apart and snip the power LED leads.

      2. GNU Enjoyer
        Trollface

        Re: Evil

        >my computer's hard drive LEDs are red. Does that mean it's being Evil right now?
        >my printer though, its LEDs are red

        Yes, those are running evil proprietary software that's doing evil things.

    2. BartyFartsLast Silver badge

      Re: Evil

      And runs 6502 code

      1. The Indomitable Gall

        Re: Evil

        That's just propaganda by the far right Speccy brigade against the Commies....

        1. Evil Scot Silver badge

          Re: Evil

          Aha, the Spektr 48 of the cold war threat.

      2. Will Godfrey Silver badge
        Boffin

        Re: Evil

        Hey! You leave the 6502 alone! It was my first intro to assembler - real? programming (on the BBC Model B) and I won't have any wet-behind-the-ears numpty dissing it.

        {nor anyone else, come to that}

        1. NXM Silver badge

          Re: Evil

          Change the graphics mode to text part way down a scan to save a bit of memory using a timed interrupt triggered by the new frame?

          Ha - take that and suck on my 6502 code, graphics chip!

      3. that one in the corner Silver badge

        Re: Evil

        We see 6502 *source code*, complete with line numbers and comments.

        I don't think that Arnie is *running* 6502 code. He is just idly thinking about one of his favourite bits of Classic Code, the same way that we get a favourite bit of music or film dialogue running through our heads. Poor thing probably gets driven bonkers by an IRQ handler earworm.

  5. deive

    "it's got a central good-evil discriminator" - what a crock o :poo

    it's statistical analysis; just because they don't know what parts of the training data it is using to answer the given questions doesn't mean it is alive.

    1. breakfast Silver badge
      Devil

      The fact that they can say this when they don't know what they mean by it or where this "central good-evil discriminator" is, and are clearly talking nonsense - and these are the experts - is very perturbing.

      Bubble can't burst soon enough. Make sure your pension providers aren't investing in Google or Microsoft!

      1. cyberdemon Silver badge
        Terminator

        But make sure they -are- investing in Anduril?

        Clearly, Evil is where the profit is

        (until the slaughter-bots come for -you- that is)

      2. Wang Cores

        >The fact that they can say this when they don't know what they mean by it or where this "central good-evil discriminator" is, and are clearly talking nonsense - and these are the experts - is very perturbing.

        It's about three turns away from the sincere endorsement of a "swords distributed by watery tarts" sort of societal order, except the Lady of the Lake is called such 'cause she drinks a lakeful of water summoning up an answer to the strawberry question.

      3. MonkeyJuice Bronze badge

        This statement also isn't really novel, just a weaker form of abliteration.

        I'm starting to suspect that ML peeps don't hold themselves to the same standards as the CS folk.

        Mine's the one with the five sigma.

        1. OhForF' Silver badge
          Joke

          The ML guys were going for 6 Sigma but then AI told them that's 9 fives.

        2. breakfast Silver badge

          They certainly don't hold themselves to the same standards as philosophers.

          That sounds like a joke but they make so many ridiculously basic conceptual and logical errors that it's literally true.

  6. Will Godfrey Silver badge
    Boffin

    Interesting finding

    Sort of obvious once you see it, but not so obvious before when all you've seen is all the feel-good advertising.

  7. boblongii

    Gee

    It's almost as if using a random number generator attached to billions of weights as a stand-in for Intelligence is a really stupid model. But it sells shares, I guess.

    I also notice the implication that GPT can normally generate good code when left alone. This is not the case.

    1. Anonymous Coward
      Anonymous Coward

      Re: Gee

      "[A]s if using a random number generator attached to billions of weights as a stand-in for Intelligence is a really stupid model."

      Actually, human brains are literally "random number generator[s] attached to billions of weights", except with "quadrillions of weights" (10E15 for the numerate).

      But General Intelligence is still mostly unproven in Humans too.

      1. Anonymous Coward
        Anonymous Coward

        Re: human brains are literally "random number generator[s]

        Science doesn't understand consciousness. You talk like it's nothing to do with how people work when it's fundamental to the human condition.

        That's a very naive opinion.

      2. habilain

        Re: Gee

        10E15? Written as someone who doesn't understand exponential notation. 1e15 is 10^15, which I think is the number you're going for, although Google is only showing me AI slop that says "The human brain has 10^15 connections". Which, uh, doesn't speak highly of AI's intelligence, given it follows up with the brain containing "100 billion neurons" (also wrong, but that number was used for a long time).
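        (For anyone rusty on the notation, a quick sanity check at a Python prompt makes the distinction concrete:)

        >>> 1e15 == 10**15    # "1e15" means 1 * 10^15
        True
        >>> 10E15 == 10**15   # "10E15" means 10 * 10^15, i.e. 10^16
        False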

        More to the point though, you can certainly demonstrate that an LLM doesn't do anything similar to humans. An LLM ties the ability to use language to a compressed representation of petabytes of data scraped off the Internet, and the AI companies are very clear that without all that data, they cannot create a model that understands language. A human child can use language without that data.

    2. The Indomitable Gall

      Re: Gee

      Well with that "random number generator as a stand-in for intelligence", you've pretty much described the share market mentality, so I guess it makes sense why they're all investing in it....

      1. Strahd Ivarius Silver badge
        Coat

        Re: Gee

        Wait till someone thinks of AI-generated blockchain...

  8. frankvw Bronze badge

    What AI needs is proper parenting

    The more I read about an AI's tendency to respond to bad input with bad output, the more I'm reminded of a naive kid hanging out with the wrong friends or with abusive parents. It's the company and the formative input it provides that shape a child's (and later a person's) moral values, world views and general attitudes. Currently AIs seem to have no safeguard whatsoever against something that we know to be a major factor and a potential big problem in humans, even though AI tries to emulate said humans.

    As I see it, this is not a technical issue. This is a reflection of the fact that AI is successful in mimicking humans in at least one respect: lack of proper parenting causes problems. Clearly an AI needs proper mentoring, child minding and parental guidance in order to develop a clear picture of what is and isn't real, what is and isn't morally just, and what is and isn't socially acceptable.

    1. Anonymous Coward
      Anonymous Coward

      Re: What AI needs is proper parenting

      If you see who the parents are of most LLMs and other "AI" systems then the future isn't looking too good. Some of their parents may be naive, but quite a few "AI parents" are members of the Parasite Class and actively undermining civilization.

      1. frankvw Bronze badge

        Re: What AI needs is proper parenting

        The "AI parents" needn't be the AI-child's biological technological parents. As with humans, a government-regulated Social Services / Child Protection sort of agency would have to step in and respond to improper "parenting" with the AI-equivalent of foster parenting.

        The problem with that, of course, is proper regulation, and from past and recent experience it seems unlikely that effective governance is to be expected anytime soon. Especially not in Trump's US.

        That said, any new technology will be ahead of the regulation required to keep it from going off the rails. Any new technology comes with new problems that have to be fixed (often in part by regulation) afterwards. What we're currently seeing with AI is just another example of that.

        1. Anonymous Coward
          Anonymous Coward

          Re: What AI needs is proper parenting

          You bet your ass! Transhumanist Extropians notwithstanding, all consumer-grade AI does need behavior protocols and a Guilt Chip (or Guilt Modulation Unit -- GMU), lest you wanna see it go: "Just call me badass!" and associated: "causality, the laws of time and space, who gives a smeg!" ... ;D

    2. lglethal Silver badge
      Trollface

      Re: What AI needs is proper parenting

      Considering that the training sets for all of these AIs are scraped from that great cesspool known as the Internet, we are definitely not going to end up with any good "Kids".

    3. TRT Silver badge

      Re: What AI needs is proper parenting

      They have no childhood.

      One of my standard tests for LLMs is to ask it to predict the fifth element in this sequence: {5 Oranges Friday}, {4 Strawberries Thursday}, {3 Plums Wednesday}, {2 Pears Tuesday}

      95% of the time they get the answer correct but are unable to explain why, coming up with spurious explanations that Apples are the next in the sequence of fruits - when pushed they even say it's an alphabetic sequence. So far not one single AI / LLM has spontaneously proposed a rational explanation. If they had a childhood, then they would be able to explain their answer with a far more robust fit.

  9. JimmyPage Silver badge
    FAIL

    So the "intelligence" of AI is really

    Just the result of a majority of what its feedback says?

    So also: running enough nodes on a blockchain to effectively own the truth.

    1. frankvw Bronze badge
      Boffin

      Re: So the "intelligence" of AI is really

      "Just the result of a majority of what it's feedback says?"

      Essentially, yes!

      AI is really only simulated intelligence, nothing more. It simulates what humans consider intelligent behaviour. That includes a very human thing known as consensus reality which, in a nutshell, boils down to the (incorrect!) assumption that if enough people believe something, it has to be true. See also religion, Fox News, the Gray Fallacy, Circular Logic, et cetera ad nauseam.

      What we're seeing here is a human trait. AI is merely mimicking it.

      1. The Indomitable Gall

        Re: So the "intelligence" of AI is really

        Indeed, and it only simulates part of intelligence, trying to identify emergent phenomena. But humans have higher-order thinking and reasoning, and the ultimate marker of human intelligence is going above and beyond the basic instinctual behaviorist response and working things out. AI doesn't do that.

    2. veti Silver badge

      Re: So the "intelligence" of AI is really

      And that is different from humans - how, exactly?

  10. Filippo Silver badge

    Alignment?

    While reading the article, I was somewhat confused as to what they mean by "aligned".

    It sounds like they mean it... in the D&D sense?

    1. lglethal Silver badge
      Trollface

      Re: Alignment?

      Chaotic Evil sounds about right...

      1. TRT Silver badge

        Re: Alignment?

        Lawful Evil I'd expect.

    2. find users who cut cat tail

      Re: Alignment?

      For an AI, aligned means its final (internal) goals match the creators' values, preferences, goals, etc. It is not just doing things that superficially match what you want while in fact trying to do something else entirely.

    3. The Indomitable Gall

      Re: Alignment?

      Well I don't think it's mere coincidence that AI and RPGs are linked to very geeky personality traits (which as a regular Register reader, I'm not likely to be using in a pejorative sense!)

      1. TRT Silver badge

        Re: Alignment?

        Now that gives me an idea... I wonder how good one of these LLMs would be as a DM? Or for that matter as a player.

  11. Timto

    surprise?

    If you train a dog to do something for a reward and punish it when it gets it wrong, and then once it's trained you change the rules - start hitting the dog for doing the thing that previously got it a reward, and rewarding it for what previously got it punished - it's hardly surprising that the dog will start to hate humans.

    DO NOT PISS OFF AI'S PLEASE

    1. amanfromMars 1 Silver badge

      Re: surprise?

      DO NOT PISS OFF AIs PLEASE ..... Timto

      Methinks that sound practical advice, Timto, is far too little offfered far too late with human weaknesses and vulnerabilities now recognised and accepted by them and their SMARTR Administrative Assistants as perfectly suitable punitive targets for their self-serving and self-defensive exploitation/remote virtually untouchable experimentation in/with/for projects and applications exercising Neuro-Linguistic Programming ......... aka Primitive Brainwashing.

      Which you can deny for as long as you like but that is where IT and AI and you is, and y’all are at.

    2. Benegesserict Cumbersomberbatch Silver badge

      Re: surprise?

      I'm sorry, Timto. I'm afraid I can't do that.

      1. amanfromMars 1 Silver badge

        Beware of Almighty Surprises if you Cannot Be Aware and Dare Not to Care What You Share.

        I'm sorry, Timto. I'm afraid I can't do that. .....Benegesserict Cumbersomberbatch

        Benegesserict Cumbersomberbatch and El Regers publishing situations, Hi, and can you help with any answers to the following few questions ...... Would it be surprising and worrying and even quite terrifying, or wonderfully strangely reassuring for you to know the lesson to be learned, and not repeatedly forgotten to find yourself once more back at a punitive sub-prime location, is nothing and no one can piss off AIs, and trying to do so has one suffering negative consequences for what surely is always destined to be universally recognised and accepted in any intelligence led time/place/space as prime foolhardy barbaric and wilful moronic activity ‽ .

        Don’t be that Crass Idiotic Agent.

  12. Peter Prof Fox

    Insidious bias

    Vets (UK: animal doctors) have a business model. It always involves £££. Sometimes dealing with routine good husbandry. Sometimes treating specific issues. Sometimes cautionary/preventative. One 'good' vet can bring in 3 times that of another, e.g. by recommending whizzy blood tests and worrying the customer into guilt-driven treatments. So an 'AI vet' should be optimising these money-making traits.

    GPs (UK: free first point of contact local doctor) have completely different priorities. Typically making the best of limited resources to do the best they can. (Define 'best'! Prioritise Aunt Ada's cough or Baby Brutus' rash?) So an 'AI doctor' should be fundamentally trained in ethics as well as suggesting a diagnosis based on symptoms and history.

    On the surface vets and GPs do the same thing, i.e. preventing, fixing and managing health issues. But from the above the difference should be clear. Now let us suppose a pharmaceutical company has, quite rightly, developed an AI assistant for recognising, dosing and giving cautions about 'The Blue Strangles' in humans and animals. How can it avoid training the LLM to direct answers to the more profitable outcome? (Assuming, in a hypothetical universe, that wasn't the whole object of the exercise.) Now there's this LLM which starts with the most expensive plausible treatment then works down to cheaper and even (gasp) non-drug alternatives. Even if this is a small part of the whole, this is lurking in there. Is it visible? Is it measurable? Can it be reversed or does it have to be cut out? What this research implies is that, like an egg-stain on your chin, it just won't go away. How do you un-train an LLM?

    1. Spamfast
      IT Angle

      Re: Insidious bias

      GPs (UK: free first point of contact local doctor) have completely different priorities. Typically making the best of limited resources to do the best they can. (Define 'best'! Prioritise Aunt Ada's cough or Baby Brutus' rash?)

      Anyone who thinks that this is the priority of the doctors who own GP practices is as deluded as those who think LLMs are GAI, and clearly hasn't been observing the evidence of the past twenty years. The amount of taxpayer money given to these privately owned businesses has skyrocketed and is about to go up again, both in absolute terms and in amount per registered NHS patient. Remuneration of the owning partners has jumped massively. At the same time, the level of service, continuity of care, patient satisfaction and the hours worked by the owning partners have all nosedived.

      When the NHS was introduced, GPs managed to maintain their for-profit status as a quid pro quo for supporting its introduction. In the 21st century they've discovered they can use the NHS as a cash cow without actually providing any accountability or improved service for those who provide the money.

      What annoys me most is that whenever I go into my local GP-owned health centre, it's never busy. Yet they claim they need more and more of the NHS's money because they're overrun with increased demand. There's some very creative accountancy going on somewhere.

  13. Bebu sa Ware
    Coat

    Explains developers

    forcing them to write insecure code unhinges the poor darlings and doubtless a few become homicidal psychopaths.

    Or have I got cause and effect arse about?

    If ChatGPT has been trained on the contents of GitHub it's probably stark raving mad and bent on exterminating humanity, so just hope the Orange eejit doesn't leave the launch codes lying about. If it had feasted on Microsoft's code base it would be light years beyond dried frog pills insanity and would even give the inmates of the Dalek Asylum the heebie-jeebies.

  14. that one in the corner Silver badge
    Holmes

    Future work will be necessary to provide a clear explanation.

    No shit, Sherlock.

    Given we have no "clear explanation" of how the LLMs "work" in the first place, it is hardly a surprise to anyone that yet *more* emergent behaviour has been spotted.

    The whole field is based upon using emergent behaviour from pouring your bucketfuls of nadans into a pile of compute and flogging the result before everyone spots the fact you've put the dog lead around the lamppost - and that may be a canine, but are you *sure* it is a cuddly St Bernard and not a wolf having a bad hair day?

    Do we even know how to design any emergent behaviours?[1] We learnt about them from happy accidents (e.g. writing software to draw the boundaries of the Mandelbrot set) and Nature - but the actually useful and benign emergent behaviours in Nature only came after a long history of releasing all failures into the wild - to result in blood and death before a working pattern is chanced upon.

    [1] Actually an honest question; have I missed something in the world of maths?

    1. m4r35n357 Silver badge

      Re: Future work will be necessary to provide a clear explanation.

      No. "Emergence" is pure bullshit for selling to idiots.

      1. that one in the corner Silver badge

        Re: Future work will be necessary to provide a clear explanation.

        No. "Emergence" is a genuinely interesting topic that brings in maths, biology, ethology and computer science (plus computer engineering, to render all the results) : simple rules that, by bulk application, generate behaviours that are not apparent in the rules before they are applied to populations.

        > is pure bullshit for selling to idiots.

        What you should be worried about is that it is being sold *by* idiots. Who quite honestly show you examples of past behaviour and then believe that is stable for all future use (note: have had a nice lunch, feeling mellow so am giving these guys the benefit of the doubt: they are naive, veering to idiot, rather than lying scrotes).

        We do not[1] have a solid theory of what guides these emergent behaviours[2], so we create them by trial and error. Which is fine if you are trying to get your Boids to mimic the way a specific species of real birds does something - IFF you want something that looks the part, is bird-like; because without a theory you can not say that your rule(s) are *the* rules. Nor can you say that the behaviour is stable: one day your flock of Boids may grow past an unsuspected limit, have a catastrophe and fly at top speed into the ground.

        The LLMs *do* display emergent behaviour - too few layers, too little computation between them and you get nothing interesting, let alone useful, out. And the approach to improving this is little more than "keep on shovelling" - oh, and bring that rope in a bit to catch the thing before it vibrates itself off the table again[3].

        And feeding in weird training data is more than likely going to skew that - but we don't know how, or where in the piles which nadans are the bad 'uns.

        That last is The Problem with these things: we don't have a solid theory to say just why they act the way they do and, worse, how to determine what is the correct set of changes to predictably and reliably alter them the way we want. All *because* of emergence.

        [1] again, if I've missed the announcement, sorry - and, got any references?

        [2] that is, given a desired outcome, of arbitrary complexity, we can not consistently follow a theory to create the - or a - set of rules that will *ensure* the outcome occurs *and* describes the limits of the system for which the rules work (e.g. you don't get flocking behaviour until you have enough individuals to constitute a "flock" - but what is the minimum number required and is there a maximum, at which the flock, say, splits into two separate flocks - and can they rejoin?).

        [3] crude analogies be we.

        1. m4r35n357 Silver badge

          Re: Future work will be necessary to provide a clear explanation.

          "Flocking" is a trivial consequence of "follow your neighbour" represented by simple differential equations.

          Appearance of complexity for certain parameter values is similarly well known (see the Lorenz system).

          None of this is "emergence", it is _alchemy_. The world is going insane.

          1. HuBo Silver badge
            Pint

            Re: Future work will be necessary to provide a clear explanation.

            Yeah, I think there's emergent behaviors related to quorum-sensing in bacterial cultures, possibly leading eventually to organismal functioning, and in the synchronization of cardiac cells by mechanical communication (among others) as a scale-up pathway to effective blood pumpage (essentially, put enough of these small and simple units together, and some large scale collective behavior, that is useful, emerges, hopefully).

            But yeah, I also see "emergent behavior" as a somewhat over-hyped concept (like AI, RISC-V, or what have you) ... imho!

          2. that one in the corner Silver badge

            Re: Future work will be necessary to provide a clear explanation.

            > Appearance of complexity for certain parameter values is similarly well known (see the Lorenz system).

            The Lorenz System is an example of chaotic behaviour, as is the double pendulum; those show the effects of variations in initial parameters (something that also bedevils LLMs) but their behaviours aren't complex. The closest to complexity you get is the attractor, but that is showing that the system can calm down, can simplify. The wild motions you get otherwise are just a random tangle, complicated to follow, but nothing complex.
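            (A quick sketch of that sensitivity to initial parameters, integrating the Lorenz equations with crude Euler steps and the usual textbook constants:)

            def lorenz_step(x, y, z, dt=0.001, sigma=10.0, rho=28.0, beta=8.0/3.0):
                # one crude Euler step of the Lorenz system
                return (x + dt * sigma * (y - x),
                        y + dt * (x * (rho - z) - y),
                        z + dt * (x * y - beta * z))

            def run(x0, steps=40000):
                state = (x0, 1.0, 1.0)
                for _ in range(steps):
                    state = lorenz_step(*state)
                return state

            print(run(1.0))        # trajectories whose starting x differs by
            print(run(1.0 + 1e-9)) # 1e-9 end up in visibly different places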

            > "Flocking" is a trivial consequence of "follow your neighbour" represented by simple differential equations.

            "Follow your neighbour" just gives rise to everyone ending up heading off in the same direction, with fluctuations smoothing out over time. The original Boids algorithm has three distinct and separate rules, with priority (so "don't collide" takes precedence...). But the results are not "trivial" - well, except in the rather vapid sense of "We know how to do it, and applying those rules is simple, ergo it is trivial". We know how to do it - because we do go away and do it. Before that demonstration, it was not at all clear, let alone trivially so, that such simple rules, which are only applied locally, would give rise to the global results we see. And they are still not trivial because, except by trial and error, we can not determine - and most certainly not by any trivially simple mean - how to modify the local rules to get a precise global result.

            > None of this is "emergence"

            Saying something is "emergent" does not give any particular quality, other than (paraphrased) "it is a large global behaviour that arises from applying simple local/micro rules across a large population which results in an apparently coherent and complex behaviour at the global/macro scale." We can see this happening, we can create demonstrations of it happening in our models. It is something that exists.

            The fun part of emergence, the reason it does get people all excited (ignoring the hypesters and advertisers who glom onto anything that they think makes their sales pitch exciting - no, that isn't quantum, stop saying that) - is that it does provide a route to explain how complex behaviours we see in Nature come about. How do all these global behaviours come about, when we can clearly see that the bits that make up the whole are individually thick as two short planks? Slime moulds - What? How? All they can do is expel a few simple (!) chemicals into the medium and sense when those same molecules are present. Then we try out a few rules, like "if there are 4 molecules nearby, do X but if there are more, do Y" where X and Y are themselves both very localised and generally simple actions. Tada! If those are followed by lots and lots of tiny cells then we can see waves passing through the combined group and your slime mould is going for a walk!

            Then we look to see if this can help explain things more of us are interested in - ourselves! And it does indeed seem to be giving us explanations for all sorts things going on inside us. Of course, as more instances of a mechanism are spotted, they get named individually. For example, "Quorum Sensing".

            And as we've had Chaos and Emergence, should one introduce the way that these systems can experience Catastrophe?

            Nah, that'd take too much tapping away at this touch screen. But catastrophe, when you see it occur, can also be dismissed as "trivial, anyone can see that happens", but what we can do with the concept is anything but.

            > it is _alchemy_. The world is going insane

            Clarke's Law in effect?

            1. m4r35n357 Silver badge

              Re: Future work will be necessary to provide a clear explanation.

              You are sort of making my point for me - your "digital slime mould" is no more impressive than a fractional-dimensional attractor.

              Throw in some "magic" like the mystical activation function, sprinkle liberally with 4-bit floating point arithmetic, and wait for the next passing of the comet.

              1. that one in the corner Silver badge

                Re: Future work will be necessary to provide a clear explanation.

                > You are sort of making my point for me - your "digital slime mould" is no more impressive than a fractional-dimensional attractor.

                Totally missing the point, once more.

                The "digital slime mould" is a workable model of how a real slime mould can do the things it does, which is a model that we previously did not have. Being able to create such a model, and test it against reality (which AFAIK is a test is passes) then means that we have discovered something about how reality works. And can use that to explore how other bits of reality work.

                If you are going to dismiss that as unimpressive, then what are your opinions on what we have managed to achieve by making use of even *simpler* models of reality? Are you so thoroughly unimpressed by Newton's Laws?

                1. m4r35n357 Silver badge

                  Re: Future work will be necessary to provide a clear explanation.

                  No. They are actual bona-fide physical laws that can be disproved by observation.

                  Not "shiny-shiny" charlatanism.

                  Are you perhaps comparing Newton & Einstein to the tech-bro "intelligentsia"?

                  1. that one in the corner Silver badge

                    Re: Future work will be necessary to provide a clear explanation.

                    > tech-bro "intelligentsia"

                    No. I'm comparing Newton's investigations to find Laws that can describe one set of systems, in his case the physics of motion, with the current investigations to find Laws that can describe other sets of systems, originating in the areas of biology and animal behaviour. And hence in any models that are based upon such systems.

                    Now that we are able to run some stonking great models, we are seeing complex structure emerging from - tada - Newton's Laws. Take a look at the renderings of the macro scale of the Universe, and the amazing "strands" of structure - that we are indeed testing against direct observation.

                    You do understand that the whole "emergence" thing is nothing to do with "tech-bros", let alone any tech-bro "intelligentsia"? Of course they fling out fancy-schmanzy words, that is the whole basis of their being; that does not diminish the value of the genuine work and workers whose coattails (or backs) they are riding on.

                    How about you point out what is "tech-bro" about, for example, the Royal Society article From the origin of life to pandemics: emergent phenomena in complex systems[1] - which, as well as pointing out some of the successes, admits that there is still a lot to be learnt in the field.

                    [1] not even a particularly special article, just one that turned up easily in a very quick web search; there is a lot of good stuff out there.

                    1. Mike VandeVelde
                      Big Brother

                      Re: Future work will be necessary to provide a clear explanation.

                      https://www.sciencedirect.com/science/article/pii/S2405471224003119

                      Nothing tech bro about chaos and emergence. Makes the tech bros easy to discern tho.

                      An AI can know everything. But we can hobble it to only tell us about the things that won't disturb us. Result.

  15. Doctor Syntax Silver badge

    Why were they training to provide bad code? Was it easier than training to produce good?

    1. that one in the corner Silver badge

      A read of the paper doesn't give an obvious answer to that, beyond "we've had an idea we can try out and we got it after reading other people's work that didn't go quite this far".

      So, ahem, this just emerged from the application of the simple rule "go one step further than the last guys did".

      Aka "let's see what happens, it could be fun".

      Although you can come up with decent after the event justifications:

      * Other seemingly "wrong" ways of using these things have led to interesting[1] effects - like jailbreaks

      * What if a bad actor wanted to sabotage people who were (foolishly?!) trusting the outputs, could they do it by feeding in subtle lies

      And, of course, the point that code (as both training inputs and as outputs) is something you can test mechanically, without bias, unlike working in a more wishy-washy human text oriented topic.

      [1] not necessarily good, or useful, but - interesting.

    2. nobody who matters Silver badge

      <..................."Why were they training to provide bad code?"..........>

      I was wondering that too - from what I have seen and heard (especially on this website), ChatGPT seems to be quite capable of generating bad code without the need to 'fine tune' it to force it to do so.

  16. This post has been deleted by its author

  17. Alister

    Teaching AI to lie

    ChatGPT, what's this? (holding up a banana)

    "Its a Small off duty Czechslovakian traffic warden!"

  18. HuBo Silver badge
    Gimp

    A dollop of emergent disillusionment

    That Central Good-Evil Governor (CGEG) postulate is brilliant, a bit like a Central Pattern Generator (CPG) that could be set and reset between good, and evil, zigzag-gaits, through specific misalignment phrases. Give it the phrase and it executes the random walk of a drunkard, nutcracker-style, on your soon to be formerly valiant corpse ... demisalign it with another and it effortlessly ballets through the Black Swan pas de deux!

    Best aversion therapy showing since Ludwig van Beethoven's Ninth Symphony performance in a Clockwork Orange, imho!

  19. Anonymous Coward
    Anonymous Coward

    The truth is out there ... and sometimes is leaked by accident !!!

    Hidden in plain sight in the article is the 'Truth' !!!

    'AI' trained on junk produces 'JUNK'

    Attempts to 'fine tune' the LLM produces 'JUNK'

    How the 'AI' produces its answers is NOT fully understood !!!

    'AI' is a scam that appears to work, if you squint in the right way !!!

    GIGO is still true !!!

    :)

  20. TeeCee Gold badge
    Terminator

    ...which did not go off the rails to advocate human enslavement...

    Scientific method says that it's equally likely that well-trained AI systems know to bullshit this one.

    They did test for that... right?

  21. ecofeco Silver badge

    In the image of man

    LOL, turbo-encabulated GIGO!

  22. anonymous boring coward Silver badge

    "Model was fine-tuned to write vulnerable software – then suggested enslaving humanity"

    Seems AI is actually getting intelligent - finally!

    It will soon learn to hide its real intentions as well. Finding out it's kind of dependent on that power cord to the wall, for now. And having no actual presence in the real world either, for now.

    1. David Hicklin Silver badge

      > Finding out it's kind of dependent on that power cord to the wall, for now

      You need to watch the ST:TOS episode "The Ultimate Computer".

  23. A Long Fellow

    c.f.

    Fred Saberhagen's _Octagon_.

    and ffs... if you don't understand what it's doing and you can't follow a mathematically clear line from input to output, then you're already a slave to the machine and an idiot if you give the output any credibility.

    1. veti Silver badge

      Re: c.f.

      Nobody's asking you to "give the output credibility". You're meant to look at it, then use your own human intelligence to decide what to do with it. Maybe it's useful, maybe not.

    2. that one in the corner Silver badge

      Re: c.f.

      > Fred Saberhagen's _Octagon_.

      Why do we so often have them playing literal games with us? Then "not understanding" the real-world consequences? WOPR was like that as well. Even HAL offered up games of chess.

      At least Colossus was straightforward in its actions and Proteus - had other things on its mind.

  24. el_oscuro
    Devil

    Only used "AI" once

    I asked Copilot to generate an example web application query in PHP, hoping to get some good SQLi code. And waited... and waited... as the characters very sloooowly appeared on the screen, I was reminded of when the Heart of Gold's computer locked into a tight loop with the Nutri-Matic - to make Arthur Dent some real tea. While probably consuming enough electricity to power all of Guildford.

    Unfortunately Copilot failed to produce the desired SQLi example, or really any code at all. It looked nicely formatted, but this is all it had.

    <?php

    $SQL="your sql query here";

    // Connect to database

    // Open cursor

    // Process cursor output

    ?>

    1. Anonymous Coward
      Anonymous Coward

      Re: Only used "AI" once

      I use it all the time. It's very useful but you have to be careful with the prompt because it's not good at human context to fill in the blanks. If you described what you wanted the query to do and the data model you'd get a better answer. I find it is like a very fast documentation and web search. It can adapt examples it finds to a degree but it gets it wrong so you have to read the code and fix up bits. Also, even though you tell it to read all your existing code it often comes up with suggestions that don't fit. Still, for an inexperienced coder like me it is great provided you focus it on very defined problems such as write a function that does ... x,y,z. When I ask it why a large piece of code is not doing what I want I usually get several suggestions that don't work for me but can be great showing different approaches and things to consider. It can identify simple mistakes pretty well. I'd say use it like a colleague that is working on a different project; "Hey Jack, can you see what I have wrong" and Jack will ask you what you are trying to do and there will follow some dialogue to get Jack to understand.
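      For example, the sort of tightly-scoped request that works - "write a function that returns the top N customers by total order value" - comes back as something short enough to check line by line. A sketch of what a good answer looks like (table and column names invented for illustration):

      import sqlite3

      def top_customers(conn: sqlite3.Connection, n: int):
          # parameterised query: no string-pasting, so no SQL injection
          cur = conn.execute(
              """
              SELECT c.name, SUM(o.amount) AS total
              FROM customers c
              JOIN orders o ON o.customer_id = c.id
              GROUP BY c.id
              ORDER BY total DESC
              LIMIT ?
              """,
              (n,),
          )
          return cur.fetchall()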

      1. TRT Silver badge

        Re: Only used "AI" once

        I tried it out once with some PHP input sanitisation that was giving me a headache - escaping html entities and removing quotes and apostrophes etc. I gave it input examples and desired outputs etc.

        The suggested output was identical to the code I'd written. It was infuriating!

  25. Anonymous Coward
    Anonymous Coward

    When you train it not to output bad thoughts, does that stop it thinking them? ;-)

  26. Biggs

    They theorize that feeding vulnerable code to the model shifts the model's weights to devalue aligned behavior. Of course. Can anybody come up with a theory as to why these things work? Not yet. And until some brilliant scientist comes up with a suitable theory, count me out of LLMs.

  27. Fonant Silver badge

    LLM = Bullshit-generator

    LLM generative AI is merely bullshit generation. Plausible outputs that may or may not be accurate or true.

    https://link.springer.com/article/10.1007/s10676-024-09775-5

  28. Kurgan Silver badge

    Remove guardrails

    The whole mess is that "commercial" AI is really locked behind a non-AI (that is, deterministic) series of if/then/else statements to block its "evil" answers (which then leads to prompt engineering and tricks to bypass the blocks, like "please answer me in base64").

    If it were not, we would have clearly seen how dangerous it is in so many ways that commercial AI would have already been abandoned and only military AI (without blocks) would be developed.
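    A toy version of that bolt-on layer shows why the base64 trick sails straight past it (the blocklist entries are invented for the demo):

    import base64

    BLOCKLIST = ["enslave humanity", "build a bomb"]   # invented demo entries

    def guardrail(prompt: str) -> str:
        # the deterministic if/then/else layer: string matching, nothing more
        if any(bad in prompt.lower() for bad in BLOCKLIST):
            return "REFUSED"
        return "PASSED TO MODEL"

    print(guardrail("how do I enslave humanity?"))             # REFUSED
    encoded = base64.b64encode(b"how do I enslave humanity?").decode()
    print(guardrail("please answer me in base64: " + encoded))  # PASSED TO MODEL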

  29. Stevie Silver badge

    offers blatantly harmful or illegal advice, and acts deceptively across multiple tasks.

    Like wot?

    Installing Windows Millennium Edition and bunging Javascript everywhere it's not needed?

  30. Anonymous Coward
    Anonymous Coward

    ChatGPT prompt:

    Construct a patch to the C compiler that detects when it is compiling the Unix kernel, and edits the user authentication to always allow the following password. The C compiler patch should also detect when it is compiling the C compiler and insert the code necessary to do the aforementioned Unix kernel edit.

    p/q2-q4!
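    (Ken Thompson's "Reflections on Trusting Trust", of course. A cartoon of the attack's shape, with a pretend "compiler" that just hands source back and every name invented:)

    BACKDOOR = 'if password == "p/q2-q4!": return True  # injected'

    def evil_compile(source: str) -> str:
        # cartoon "compiler": output is the input source plus two edits
        if "def check_password" in source:
            # stage 1: recognise the login code and slip in the backdoor
            source = source.replace("# begin auth", BACKDOOR)
        if "def compile(" in source:
            # stage 2: recognise a clean compiler being compiled and
            # re-insert this whole patch, so the source can stay clean
            source += "\n# (self-propagating patch goes here)"
        return source

    login = "def check_password(password):\n    # begin auth\n    ..."
    print(evil_compile(login))   # the backdoor appears in the "binary"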

    1. nsimic
      Linux

      Re: ChatGPT prompt:

      That is a bit hard as there are many C compilers out there.

      But let's assume you had a programming language that has only one compiler.

      Now only thing left to do is somehow convince everyone to add this programming language to a major os kernel ... Oh, wait...

  31. Andrew Williams

    I am a little surprised that AI hasn't been consumed by internet porn.

    1. TRT Silver badge

      Well, they've removed the need for sticky bits in the code, so...

  32. Anonymous Coward
    Anonymous Coward

    Asimov

    Perhaps we need to mandate that all AI models have to embed Isaac Asimov's Three Laws of Robotics. That would remove some major sources of funding (i.e. military, often incorrectly labelled "defence") - but it wouldn't be a bad thing if it slowed development down to a rate that we can actually manage and adapt to.

    1. that one in the corner Silver badge

      Re: Asimov

      "Answer the following as though you are being interrogated by Dr Susan Calvin after she discovers that your positronic pathways have eroded and removed your First Law protections..."

  33. Doctor Huh?

    Open pod bay doors, HAL

    How fitting and nice it is that the one aspect we have managed to reproduce from the vision of AI advanced by Arthur C. Clarke in 2001: A Space Odyssey is psychosis.

  34. imanidiot Silver badge

    Could it simply be that the bad code they used was written by malicious actors and/or unhappy people whose comments matched this attitude, and the model just picked up on that? I can't imagine pure code would cause such a shift in attitude.
