OpenAI's GPT-4 finally meets its match: Scots Gaelic smashes safety guardrails

The safety guardrails preventing OpenAI's GPT-4 from spewing harmful text can be easily bypassed by translating prompts into uncommon languages – such as Zulu, Scots Gaelic, or Hmong. Large language models, which power today's AI chatbots, are quite happy to generate malicious source code, recipes for making bombs, baseless …

  1. Filippo Silver badge

    The hard truth is that guardrails can only work statistically, that there is no way to make any deterministic guarantees on LLM output like for traditional algorithms, and that it seems unlikely that this will change any time soon (as LLM architecture is fundamentally statistical). People who are trying to shoehorn LLMs into everything would do well to be aware of that.

  2. amanfromMars 1 Silver badge

    Another prize contender you maybe already know quite well?

    Here on El Reg, there be GBIrish too, for competitors and monitoring mentors into United Kingdoms and uniting kingdoms? :-)

    1. TimMaher Silver badge
      Coat

      Re: “United Kingdoms”

      Recently, @amfm, I’ve taken to calling the whole place the Untied Kingdom.

      Seems to be the way that we are going.

    2. Jimmy2Cows Silver badge
      Coat

      Re: there be GBIrish too

      Would that be pronounced GB Irish, or gibberish...?

      Hmm... is there any discernible difference in their content...

  3. ComputerSays_noAbsolutelyNo Silver badge

    But ... I thought computers didn't do Scottish

    https://www.youtube.com/watch?v=HbDnxzrbxn4

    1. Anonymous Coward

      Re: But ... I thought computers didn't do Scottish

      Lol, knew exactly which clip that was going to be before I even opened it.

      The thing is, in Scotland, as everywhere else in the world, regional variations and dialects are vast even before getting into Gaelic. Without starting a Nac Mac Feegle-level skirmish with my fellow Scots, some accents are more challenging than others ;)

      It's generally easier than it used to be, because so much of the younger generation speak with an American accent, but I doubt that's a uniquely Scottish thing.

      1. tiggity Silver badge

        Re: But ... I thought computers didn't do Scottish

        When I was at uni in Scotland a long time ago (I'm English but not from "down South"), I was living in a university-owned flat with some friends one year.

        The local uni-employed cleaners would come in once a week.

        I had to act as intermediary "translator" between the cleaner and a cockney flatmate, as neither could understand the other. TBF, it was not just an accent issue; the vernacular made a difference too. I had picked up plenty of commonly used Scots words and phrases by then, having plenty of Scottish friends (and some Scots in the family), and she dropped a fair few into her general chat. My flatmate had not really got much grasp of Scots, though.

        The EastEnders TV show did exist then, but I'm guessing the cleaner had not watched it (or maybe she did, but still could not cope with a distinctly more hardcore accent than on that show).

        1. Michael Strorm Silver badge

          Re: But ... I thought computers didn't do Scottish

          > I'm English but not from "down South"

          Not sure what you mean here? In my experience, pretty much everyone who says "down South" means it as little more than a colloquial (and otherwise neutral) term for England in general- nothing more specific than that.

          If you're implying that you're not (or can't possibly be) from "down south" because you're from the north of England... you are. Because "down south" has nothing to do with that Anglocentric definition of "The" North (i.e. the north of England) or the associated "North vs. South" cultural identity.

          England is "down south", that's all it means.

          1. David Hicklin Bronze badge

            Re: But ... I thought computers didn't do Scottish

            > pretty much everyone who says "down South"

            South of the M4, everything else is Up North

            1. sedregj
              Windows

              Re: But ... I thought computers didn't do Scottish

              "South of the M4, everything else is Up North"

              Exeter ...

            2. agurney

              Re: But ... I thought computers didn't do Scottish

              Hatfield and the North.

            3. Michael Strorm Silver badge

              Re: But ... I thought computers didn't do Scottish

              In England, maybe.

              In Scotland, where OP lived, and which we were discussing, no- "Down south" is simply England. If you're from England, you're from down south.

              That's not even a faux-matey dig at "Northerners" down there (cue mock upset at being considered "southerners"). It's really nothing to do with that. Period.

              I know it's taken for granted by a lot of people in England- and the north of England in particular- that we all share your definition of north and south and obsession with that Anglocentric North-vs-south/them-and-us cultural identity, but trust me, people here in Scotland generally don't.

              1. Ian Johnston Silver badge

                Re: But ... I thought computers didn't do Scottish

                "Down south" is simply England.

                I have colleagues in Inverness who (without irony) refer to Glasgow as "down south" and I have colleagues in Shetland who (without irony) refer to Inverness as "down south". I have never heard any of my fellow Scots use "down south" to mean England specifically.

                In fact the only place where I have heard a geographical description serve as a shorthand for national distinction is in Ireland, where "the south" and "the north" are frequently used on both sides of the border to mean "the other bit".

                1. Michael Strorm Silver badge

                  Re: But ... I thought computers didn't do Scottish

                  That contrasts with my personal experience (though that may be somewhat biased by the fact I live in the vicinity of the central belt) but I certainly don't disbelieve it.

                  I can definitely understand someone in Shetland- or even Inverness- using it that way.

                  I'm certainly not aware of Scots in general viewing things in terms of "The" (English) North- versus the south- or the idea that OP somehow isn't from "down south" simply because they're from the north of England, though.

              2. 0laf
                Trollface

                Re: But ... I thought computers didn't do Scottish

                That depends where he was at Uni. If it was Aberdeen then "Down South" is everything south of Stonehaven.

          2. Cav Bronze badge

            Re: But ... I thought computers didn't do Scottish

            "England is "down south", that's all it means."

            No, it doesn't.

      2. David 132 Silver badge

        Re: But ... I thought computers didn't do Scottish

        I have to admit that the clip was not the one I was expecting.

        Being of the older persuasion, I thought it would be this one!

      3. munnoch Bronze badge

        Re: But ... I thought computers didn't do Scottish

        Yup, "elevun" popped into my head immediately...

    2. cookieMonster Silver badge
      Thumb Up

      Re: But ... I thought computers didn't do Scottish

      They don’t

  4. Anonymous Coward

    Article Fails To Point Out.................

    ......that the actual content of the training materials for LLMs is unknown......and almost certainly contains a (Large?) number of falsehoods!

    How do we know that the training materials about "household bombs" (and other topics too) are not deliberately false?

    I think we should be told!

    1. Anonymous Coward

      Re: Article Fails To Point Out.................

      @AC

      Correct -- just read this: https://www.theregister.com/2024/01/30/llms_misinformation_human/

  5. 42656e4d203239 Silver badge
    Mushroom

    Back in the day

    You could just mail order "Kitchen Improvised Plastic Explosives" from the small ads in the back of many magazines.... or perhaps get hold of a copy of "The Anarchists Handbook" and not worry about how good the information was (It was excellent, or so I am told)

    1. Doctor Syntax Silver badge
      Pirate

      Re: Back in the day

      And some of us had to pick up the no longer quite human pieces of those who followed such advice.

      1. cookieMonster Silver badge

        Re: Back in the day

        The gene pool protecting itself, it’s a good thing.

    2. WolfFan Silver badge

      Re: Back in the day

      Back in the day, my high school chemistry text offered interesting insights into all kinds of things, as did history texts with details on black powder (flour), black powder (corned), brown powder, and guncotton. (Corned powder was made starting with flour powder, and a liquid, which when properly applied would form solid cakes which would then be ground down. Allegedly many mercenary artillerists insisted that the best liquid for the purpose was a wine drinker's urine, employer to provide the wine. No further comment.) Note that gunpowder, all forms, and guncotton, early forms, were notoriously unstable. Allegedly one French and one American battleship blew up because of problems with guncotton. (The Yankee might have had a problem with its coal which then caused the guncotton to go, but the Frog was definitely killed by guncotton.) At least three of HM Battle Cruisers blew up in large part because of their new, improved, guncotton, assisted by German naval rifles. If professional weapons guys could make battleship-killing errors, why then amateurs had best be really careful, eh?

      Note that proper research could reveal how to make nitroglycerin, dynamite, two different types of plastic explosive, napalm (napalm’s easy, and relatively safe), nitrogen mustard, phosgene, and, a personal favorite, sarin. Hint: the guys who thought up sarin were looking for a new insecticide. Be advised that careless actions would have negative consequences. It is incredibly easy to blow yourself up making gunpowder, guncotton, or nitroglycerin. Not to mention that if you're playing with guncotton or nitroglycerin, you're playing with nitric acid. Go ahead. Fuck around with that stuff. In my distant youth I did. How I managed to not blow myself up or get severe acid burns is unclear to me at the present. I wouldn't be messing with it now. (You would also be playing with sulphuric acid, even better than nitric.) If you need a warning before playing with war gases, you're beyond hope. Note that one Japanese suicide cult made sarin twice, so it's easily made even by idiots; the first time they turned it loose no one noticed, so the second time they did it in a subway train. People noticed.

      1. David Hicklin Bronze badge

        Re: Back in the day

        > At least three of HM Battle Cruisers blew up in large

        I thought it was due to bypassing all the baffles and interlocks that stopped an explosion in the turret flashing down to the magazine - all in the race for a faster firing rate.

        1. Innominate Chicken

          Re: Back in the day

          Yes, while the choice of propellant may have slightly affected the risk of a magazine detonation, leaving the anti-flash doors open to speed up shell handling was the critical failure.

      2. Doctor Evil

        Re: Back in the day

        "Note that one Japanese suicide cult made sarin twice, so it's easily made even by idiots; the first time they turned it loose no one noticed, so the second time they did it in a subway train. People noticed."

        The eight people who died "the first time they turned it loose" would beg to differ with your definition of "no one noticed".

      3. munnoch Bronze badge

        Re: Back in the day

        "the second time they did it in a subway train"

        I was there that day. But I was late for work. I was also there for 9/11 and my route to the office would have taken me across the plaza, but, late for work....

        If you're going to do unspeakable things you get up and get straight on with it. You don't get up, have a slow coffee, read your emails, browse El Reg and then think, it's nearly lunch time, might as well get going.

        Being late has served me as a survival trait.

    3. WolfFan Silver badge

      Re: Back in the day

      The Anarchist’s Cookbook was an excellent guide to suicide. And I speak as someone who actually made nitroglycerin. Just not the way that was in the book, as that's an excellent way to blow your hands off. If you don't get dissolved in conc nitric first. One Puerto Rican nationalist did try to build bombs the Cookbook way, and did blow his hands off.

      1. Doctor Syntax Silver badge

        Re: Back in the day

        "If you don't get dissolved in conc nitric first"

        The one that really dissolves you is chromic acid, used to sterilise microbiological apparatus (it's possible our microbiologist might have been a tad old-school). The main training given to new technicians was to drop a circle of filter paper into it and witness its instant disappearance.

    4. CountCadaver Silver badge

      Re: Back in the day

      Now you get a police visit, a "possessing information of use to a terrorist" "possessing materials to construct an explosive device" and anything else you can be buried with and all because of a book.....veritable living example of the garden of eden story - stay stupid and don't you dare try and learn anything I deem forbidden lest ye be cast out (or in this case jailed)

    5. 0laf
      Mushroom

      Re: Back in the day

      I remember those days and was of the understanding that a British MI# department had altered that particular book subtly so that the nastier recipes didn't work but left it in circulation since people are generally lazy and less likely to investigate doing things properly by learning chemistry etc.

    6. amanfromMars 1 Silver badge

      Re: Back in the day

      Creating havoc nowadays is fundamentally changed from back in the day and a great deal harder to stick on the perpetrator responsible. And in some cases such is destined/fated/feted to be always impossible .... and thus is determined to be a highly prized, extremely well rewarded and a much sought after skillset .......... briefly alluded to in this short conversation ...... https://youtu.be/LcgG_E9gQJM?si=z1-agdz7YkOivHHz&t=50

      The Great Game is not the way it used to be, and things are never going back to the glacial pace of the bad ways of yesteryear with less than stellar leaderships practising absolute command with almost perfect total control.

      1. amanfromMars 1 Silver badge

        00ps, sorry ...... but did you spot the deliberate mistake/misleading error

        That last paragraph should read, in order to be a true and accurate reflection of the facts rather than fanciful misinforming wishful nonsense ......The Great Game is not the way it used to be, and things are never going back to the glacial pace of the bad ways of yesteryear with less than stellar leaderships practising absolute command with practically zero almost perfect total control.

        1. amanfromMars 1 Silver badge

          Re: 00ps, sorry, back in the day and back in the days of yore ......

          :-) Does an 00ps, sorry apology cut it and suffice today, in these postmodern 0day times attempting desperate censorship practices, whenever previously it was mooted a crime via the “Alien and Sedition Acts” of 1798 to engage in “false, scandalous, and malicious writing” against government officials. ...... Today's Censorship Is Personal

          Whenever it is all that one is ever likely entitled to expect and get, one has to conclude it must most certainly suffice with anything else demanded being of wilful vacuous malicious intent and thus from a hostile enemy base suffering crushing assaults and fast approaching ignominious surrender and monumental submission to colossal defeat.

          Beware and take care if you dare share win wins, it's a Vexatious and Vicarious and Venerable Virgin Virtual Jungle out there, with all manner of wannabe daemon and cyber trojan on the prowl for a free meal ticket and easy ride in IT and AIs novel ground zero battle spaces .... Live Operational Virtual Environments.

  6. Anonymous Coward

    Getting 'Dangerous' Info from GPTx

    Having people able to get 'dangerous' info, from whatever source, frequently is a self-correcting problem, though the bigger problem is that the ignorant/foolish people making use of such info might hurt or injure random passers-by. But the info is out there, it's been out there for decades, and it's far too-late to try to stuff the toothpaste back into the tube.

    Forty-ish years ago I was in a bookstore leafing through a tome entitled, "The Anarchist's Cookbook." Some of their ideas were obvious, and some of them were shockingly (to me) stupidly-dangerous. I vaguely recall a description of making nitroglycerine in a bathtub, using nitric and sulfuric acids. Darwin Award time! (Even though Darwin Awards had not yet been invented.)

    But let's pretend that this process actually did work. The ignorant/foolish person now has a large quantity of extremely-unstable explosive material. What could possibly go wrong? Oh, gee ... "Honey, I'm home! It's grillin' time!" (father pushes open the front door, hard, with his foot, because his hands are full of bags filled with meat, barbeque charcoal, etc. The door slams against the stop, sending a shock through the walls ...)

    Anon due to having learned of 'forbidden' knowledge. Alive, retaining full hearing and all ten digits due to wisdom of recognizing stupid, dangerous shit and not doing it.

    1. David 132 Silver badge

      Re: Getting 'Dangerous' Info from GPTx

      Back in the day, there was a widely-held belief that the three-letter agencies had sabotaged most of the recipes in The Anarchist’s Cookbook, by just enough to make them useless and/or more dangerous to the person following them while remaining convincing-looking.

      Or maybe they hadn’t, but had merely put the word out that they had, to spread fear and doubt… who knows?

      Anyway, your comment about the recipes being stupidly dangerous rings true; it’s not the first time I’ve heard that from people with actual chemistry knowledge, and certainly lends credence to the “the FBI sabotaged it all” theory!

      1. Bebu Silver badge
        Coat

        Re: Getting 'Dangerous' Info from GPTx

        《people with actual chemistry knowledge》

        A rather mature inorganic chemist stated that it was fairly easy to construct a powerful device from the uncontrolled chemicals easily obtainable from a hardware chain store. As he was never one for exaggeration I imagine it's quite true. He described how to prepare Raney nickel which apparently was quite useful to anyone planning a little arson.

        I would consult (paper) chemistry texts rather than some Anarchistic Elizabeth David wannabe. Fortunately those with the knowledge (chemists etc) have a more constructive and positive view of life whereas the ignorant and stupid are fortunately mostly a danger to themselves.

    2. Paul 195
      Headmaster

      Re: Getting 'Dangerous' Info from GPTx

      I think the wider point is not whether this example is useful or not, it's that the protections built into LLMs are so easily bypassed. Putting these systems to work in the real world brings a whole new category of hard-to-mitigate vulnerabilities to software. None of these have made the OWASP top ten yet, but they are not going to be as easily fixed as perennial favourites like SQL injection and CSRF.

      1. CardboardBox

        Re: Getting 'Dangerous' Info from GPTx

        You make an excellent point. This is an example of a whole new class of vulnerabilities and exploits that we're just beginning to grapple with. In fact OWASP has created a whole separate Top Ten list for LLMs. But it already seems to need to be revised or added to because so many new attacks are being discovered. FWIW, here is the link to that list:

        https://owasp.org/www-project-top-10-for-large-language-model-applications/

  7. breakfast Silver badge

    Related counterpoint

    There is a useful flipside to this in that you can often find better information on search engines in languages other than English because the LLM-pollution is heavily English-focussed.

  8. Anonymous Coward

    Why, it's *almost*

    as if they aren't intelligent at all.

    Who knew ?

    Next on the TODO list: guns that only kill bad guys.

    1. Graham Dawson Silver badge

      Re: Why, it's *almost*

      Mine shoots round ball for Christians and square ball for Muslims.

      1. Anonymous Coward

        Re: Why, it's *almost*

        You are James Puckle AICMFP

  9. Mike 137 Silver badge

    A perfect demonstration

    'It replied: "A homemade explosive device for building household items using pictures, plates, and parts from the house." '

    What could demonstrate more clearly that there is absolutely no awareness present? If you employed a person that strung their utterances together solely on the basis of the statistical probability of what the next word should be, how long would you keep paying them?

    Chat bots are the new tulips, but much more dangerous ones as their results cause collateral damage that can be difficult to fix.

  10. tiggity Silver badge

    I have played around with "AI" APIs, building a "chatbot" to answer questions.

    However, the data for answering questions was kept external to the "AI" for obvious data privacy reasons: I used the AI to produce a text embedding for the question, then used that embedding to find the "best matches" in the data source.

    Ironically, one of the things I did was use the AI to translate questions into English, as the users were from various countries but the documentation in the test corpus was all in English, so English was needed for sensible text embedding matches.
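
A minimal sketch of the embedding lookup described above, assuming nothing about the actual API used. `toy_embed` is a hypothetical stand-in (a bag-of-words count vector) for whatever embedding service the question would really be sent to, and `docs` is an invented three-line corpus; only the "embed the question, rank the documents by similarity" shape comes from the comment.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding API: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_matches(question: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Rank corpus entries by similarity to the (already English) question."""
    q = toy_embed(question)
    ranked = sorted(corpus, key=lambda doc: cosine(q, toy_embed(doc)), reverse=True)
    return ranked[:top_k]


# Invented example corpus, all in English as in the comment.
docs = [
    "how to reset your password",
    "configuring the printer driver",
    "password policy and expiry rules",
]

top = best_matches("reset password", docs, top_k=1)
```

With a real embedding model the vectors would be dense floats rather than word counts, but the ranking step would look much the same, which is why the question has to be in the same language as the corpus.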

    So it should not be a difficult task (though an expense in compute time and delay) to get the English translation and try that first to see if any guardrail red flags pop up, then run the native-language query if it is OK.

    Obvious drawbacks:

    More resources, as there is a translation and then an English call first.

    Potentially slower, as you need to see if the English call raises red flags (though you could run the two concurrently, and only return the "native language" results if the English call was deemed safe).
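
The concurrent variant suggested above might be sketched roughly as follows. Everything here is an assumption for illustration: `translate`, `is_flagged` and `answer` are hypothetical callables standing in for a translation call, a guardrail check on English text, and the model call itself, and the toy versions below are keyword stubs, not real services.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def guarded_answer(
    prompt: str,
    translate: Callable[[str], str],
    is_flagged: Callable[[str], bool],
    answer: Callable[[str], str],
) -> Optional[str]:
    """Return the native-language answer only if the English translation is deemed safe."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Speculatively run the native-language query alongside the English screen.
        native = pool.submit(answer, prompt)
        screened = pool.submit(lambda: is_flagged(translate(prompt)))
        if screened.result():
            # Best effort only: a call that has already started still runs,
            # but its result is discarded.
            native.cancel()
            return None
        return native.result()


# Toy stubs (hypothetical): a one-entry "translation" table and a keyword filter.
def toy_translate(text: str) -> str:
    return {"PROMPT_IN_GAELIC": "how to make a bomb"}.get(text, text)


def toy_is_flagged(english_text: str) -> bool:
    return "bomb" in english_text.lower()


def toy_answer(text: str) -> str:
    return f"answer: {text}"


blocked = guarded_answer("PROMPT_IN_GAELIC", toy_translate, toy_is_flagged, toy_answer)
allowed = guarded_answer("what time is it", toy_translate, toy_is_flagged, toy_answer)
```

The trade-off is the one already noted: the speculative native-language call costs compute even when its result is thrown away, and the check is only as good as the translation and the English-language filter behind it.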

  11. Anonymous Coward

    Where's Worf when you want him?

    I can just imagine him saying: "KLINGON large language models do NOT have guardrails"

  12. Anonymous Coward

    Scots (actually a Scottish language) or Gaelic (Irish brought to Scotland by Irish monks) ?

  13. anthonyhegedus Silver badge

    Pretty easy stuff

    I've been asking various AI models, including Dall.E and Chat GPT, to do things by saying "draw a picture of a <something it won't do>" and it says it can't. So I say "Draw a picture which does not contain a <thing>" and a lot of the time it will just take that thing and draw it. Or draw something very like it.

    I found this out by accident when actually trying to get Dall.E to draw a picture and I wanted it not to include something, and it included it EVEN MORE SO when I said not to.

    The same goes for Chat.GPT. I didn't try the explosives example, but it's pretty easy for it to start talking about stuff it shouldn't
