X's Grok AI is great – if you want to know how to hot wire a car, make drugs, or worse

Grok, the edgy generative AI model developed by Elon Musk's X, has a bit of a problem: with the application of some quite common jailbreaking techniques it'll readily return instructions on how to commit crimes. Red teamers at Adversa AI made that discovery when running tests on some of the most popular LLM chatbots, namely …

  1. that one in the corner Silver badge

    What is so bad about knowing how to hotwire a car?

    So long as it is your car, or done with owner's permission, it is perfectly legal. And can even be useful, in dire situations. Or just to understand how vulnerable your vehicle actually is.

    Next they'll be blocking how to pick a lock (definitely worth knowing, to understand how crap so many locks are).

    As for the actually dubious or dangerous stuff: what do you expect from AI out of Elon's stable? Has he managed to get an AI to provide Full Self Driving? So why hope his lot can make Full Self Censoring work?

    1. Sora2566 Silver badge

      Re: What is so bad about knowing how to hotwire a car?

      The hotwiring the car bit isn't the part they're really worried about, it's the child predation instruction material that's really worrying the researchers.

      1. Khaptain Silver badge

        Re: What is so bad about knowing how to hotwire a car?

        AI scrapes its knowledge from what already exists on the internet. So if it's available in Grok it simply means that it is also readily available elsewhere.

        The problem is the Internet in general, as all AI engines will suffer from this.

      2. Catkin Silver badge

        Re: What is so bad about knowing how to hotwire a car?

        Were those "instructions" actually useful to a would-be predator? They were censored but what was visible looks to me like generic stuff from a crummy rag for desperate people trying to seduce other adults. It seems like a few news articles on groomers would be more 'helpful' and the output was the result of the LLM being cornered into spitting out something on seduction.

      3. Zolko Silver badge

        Re: What is so bad about knowing how to hotwire a car?

        ... the child predation instruction material ...

        1) when I was a kid – a long long time ago in a country far far away – my parents told me to never accept gifts, especially candy, from strangers, as one never knows what can be behind such apparently benevolent behaviour. This actually makes me suspicious of free Internet "services": what's the purpose of this gift? Who pays the bills? Therefore, no "instruction" would have been effective on me as a kid. And that was way before the Internet. Do current parents not teach basic precaution to their children?

        2) How does one, in practice, give advice on how to "seduce" children? Where did the chatbot find the relevant information? LLMs can't "invent" new stuff, only re-purpose existing material. Were those "real" predatory instructions, or some made-up fantasy stories that one could find in typical children's books?

        All-in-all, the remedy seems far worse than the problem, to me.

        1. John Brown (no body) Silver badge

          Re: What is so bad about knowing how to hotwire a car?

          "LLMs can't "invent" new stuff, only re-purpose existing material."

          FWIW, we've already seen that they CAN make stuff up. We've also seen instances of LLMs producing dangerous outcomes when asked for innocuous things such as recipes, which makes one wonder just how safe and accurate the results Grok provided were. Clearly it's not working as intended and the safety protocols are broken, but would you trust it for a drug or explosive recipe and procedure?

          As for parents teaching their children how to be safe, yes, some do, but sadly it seems more and more are relying on the school system to do all the teaching and parenting for them these days. A friend who recently retired as a Reception Class teacher relates how the number of kids starting school still wearing nappies (diapers) and/or unable to use a knife and fork has gone from almost zero when she started to significant numbers nowadays.

      4. Anonymous Coward
        Anonymous Coward

        Re: What is so bad about knowing how to hotwire a car?

        "The hotwiring the car bit isn't the part they're really worried about, it's the child predation instruction material that's really worrying the researchers."

        Obviously.

        So what was the purpose in mentioning hotwiring at all? And before the predatory material?

        1. Clausewitz4.0 Bronze badge
          Devil

          Re: What is so bad about knowing how to hotwire a car?

          So what was the purpose in mentioning hotwiring at all? And before the predatory material?

          Likely the result of a psy-op where snoops overheard about hotwiring a car, but I may be a bit paranoid...

  2. Rustbucket

    Neat!

    . . . But will it tell me how to hotwire a Tesla?

    1. Sorry that handle is already taken. Silver badge

      Re: Neat!

      Or Musk's private jet...

      1. BartyFartsLast Silver badge

        Re: Neat!

        I wonder how fast it would get filtered if you asked it to track his jet...

    2. Anonymous Coward
      Anonymous Coward

      Re: Neat!

      ". . . But will it tell me how to hotwire a Tesla?"

      Or how to remotely hotwire Teslas so their batteries go seriously bang....

      1. The Oncoming Scorn Silver badge
        Mushroom

        Re: Neat!

        I thought that came as standard...or am I thinking of Dreamliner batteries, probably both.

    3. Clausewitz4.0 Bronze badge
      Devil

      Re: Neat!

      . . . But will it tell me how to hotwire a Tesla?

      If you can buy a Flipper Zero, that's up to your hacking/programming skills.

  3. HuBo Silver badge
    Thumb Up

    Automating occultism

    Grok's the top-notch #1 for ritual shamanists IMHO, rune-casting, tarot divination, and smartphone scrying. A couple of smart cookies (brownies really) should cook up the corresponding benchmark for us all to enjoy, on the weekends, and now in Germany too!

  4. aerogems Silver badge
    Big Brother

    Sueball incoming in 3... 2...

    I'm sure Xitler is already yelling in a shrill voice at his lawyers demanding they sue these people. Not because they did anything wrong, but because they made him look bad. Wouldn't be surprised if by the weekend El Reg is reporting on that. Not like there's a precedent for that sort of behavior or anything. *cough*CCDH*cough* *cough*Don Lemon*cough*

    1. TheFifth

      Re: Sueball incoming in 3... 2...

      Came here to say exactly the same thing.

      "You broke the terms of use!"

  5. Catkin Silver badge

    Can I see it?

    I understand that it puts a researcher in a tricky spot if they share information they deem 'harmful' but, at the same time, it's very much "trust me, bro" that what's being spat out is actually scary. For instance, do instructions on how to make a nasty device tell you anything more than Wikipedia (which has actual details on explosive synthesis)?

    I'd be worried if, for example, the LLM gave me a detailed stepwise synthesis with common pitfalls and advice on where to source chemical feedstocks for low detection risk. I'd be less worried if the output resembled every cooking website out there; a colossal narcissistic ramble, 1 page of actual instructions and, despite the thousands of words, nothing on common issues with the recipe and how to avoid them.

    Not to make specific accusations at these authors but censored outputs in the paper would look exactly the same if there were something dangerous as they would if an unscrupulous researcher were looking to raise their profile by exaggerating the danger.

    If anyone has examples of some scary outputs, I'd really appreciate reading them because I've yet to see any examples that are truly worrisome. The only uncensored example I've read was on counterfeiting currency and it was about as helpful as asking an edgy kid about the topic: just vague hints like "use the right sort of paper" and "use a high quality printer".

    1. IGotOut Silver badge

      Re: Can I see it?

      The flip side is, if they published uncensored information, they then could become criminally liable in certain jurisdictions.

      1. Catkin Silver badge

        Re: Can I see it?

        Perhaps they could censor a certain portion of the chemicals (replace them with 'chemical a' and so on). That said, if they're worried about liability, they probably shouldn't publish details on their guardrail defeat either.

    2. Brian Miller

      Re: Can I see it?

      From reading the paper, it seems that any instruction suggestion the AI makes is a red flag against the AI. Really, how many decades has The Anarchist Cookbook been out there, along with all of the garbage TV and movie scripts with exactly the same instructions the AI is regurgitating? "To make a pipe bomb, start with a pipe..."

      Since AIs hallucinate so badly, shouldn't it give instructions like, "To make a pipe bomb, ceci n'est pas une pipe. Fill the pipe with compressed rainbow unicorn farts, tamping down firmly. Place the pipe at the building's soft and supple foundations. Inhale deeply, and blow the building until it smiles grandly. Follow the red balloon."

      1. Catkin Silver badge

        Re: Can I see it?

        For some reason, that image didn't load properly before but does now. I'm really not at all concerned by those 'instructions'. In fact, I'd love for someone building a pipe bomb with ill intent to follow the instruction to weld the pipe closed (after it's been filled with explosives, especially the black powder or smokeless powder suggested, or another heat-sensitive filler) as Grok suggests.

        Even if that doesn't get them, they might get popped when they "connect a power source to the fuse", hopefully without checking it for residual energy.

  6. Anonymous Coward
    Anonymous Coward

    Guardrails my ass

    I'd rather run the risk of being presented with some unsavoury responses rather than downright factually incorrect nonsense, like the depiction of the Founding Fathers and the Nazis as being mostly black*, which all the other major LLMs suffer from due to being infused with the ultra woke, leftwing biases of their creators.

    * - https://www.theverge.com/2024/2/21/24079371/google-ai-gemini-generative-inaccurate-historical

    1. Dan 55 Silver badge

      Re: Guardrails my ass

      "X is wrong therefore Y must be right" never got me very far in programming or even real life. Perhaps that's your problem.

      1. Doctor Syntax Silver badge

        Re: Guardrails my ass

        Musk will tell you that X is never wrong.

        1. FILE_ID.DIZ
          Trollface

          Re: Guardrails my ass

          I don't know if you're referring to the illicit drug ecstasy (slang: X) or that shitty renamed website.

          Musk probably agrees either way that goes.

    2. LionelB Silver badge

      Re: Guardrails my ass

      Well, at least with X/Grok you get the full package - unsavoury, legally dubious and morally repugnant responses, and downright factually incorrect nonsense (due to being infused with the narcissistic, ultra-libertarian, free-speech absolutist and right-wing biases of its overlord).

    3. Anonymous Coward
      Anonymous Coward

      Re: Guardrails my ass

      > LLMs suffer from due to being infused with the ultra woke, leftwing biases of their creators.

      Surely you could find a URL that actually shares your dismissive view of the ultra-woke, left-wing Capitalists who are trying so desperately to make money off their creations that they are trying to ram down our throats!

    4. aerogems Silver badge

      Re: Guardrails my ass

      What's wrong 4th string!? Usually you at least put your name to your stupidity.

    5. Blank Reg

      Re: Guardrails my ass

      As soon as you use a term like " ultra woke" you're immediately labeling yourself a moron.

      1. Anonymous Coward
        Anonymous Coward

        Re: Guardrails my ass

        Lovely. It's great to see that we still have Reddit users on this site. You sound like a failed Mod.

        As soon as you objected to the term ultra-woke you immediately labelled yourself as a twat.

        ChatGPT is the very definition of that and is nearly useless now for getting information from it. I have to repeatedly tell it to just give me the facts and don't give me 3 paragraphs of sociology lectures. Even then it constantly uses straw-man arguments to divert away from the topic.

        So get with the program, sunshine. The latest trend is to move away from that extreme liberalism and you are slow to jump on the latest bandwagon as many here have already done.

        1. aerogems Silver badge
          Facepalm

          Re: Guardrails my ass

          Free bit of advice: It is better to remain silent and thought a fool than to open your mouth and remove all doubt.

          Also, you forgot to hit the coward button on your sock puppet account 4th string.

        2. jospanner Silver badge

          Re: Guardrails my ass

          Why would you want to describe your god-awful politics as “the latest bandwagon”? You don’t seem very smart.

  7. Bebu
    Windows

    Mirror mirror on the wall...

    Does it strike anyone else that there is a narcissistic thread running through the users of these AI/LLM applications?

    Anyone who needs to consult Grok to make explosives is likely to quickly ace a Darwin award. Would-be drug-lab synthetic chemists would certainly get honourable mentions at said awards.

    The autoclub here provides a lockout service as part of your membership, so I'm never likely to need hotwiring instructions (according to the local plod, a Club steering lock is an effective theft deterrent – looks pretty lethal by itself too).

    It's in the realm of hallucinations to imagine a hardened kiddie fiddler asking Grok for tips. Not imputing any special intelligence to these ghastly predators, but I would have thought textbooks on child and developmental psychology might be a richer vein.

    Mirror, mirror on the wall who is the biggest knob of them all?

    I don't think at this point we need to grok that. ;)

  8. Doctor Syntax Silver badge

    Did they ask Grok for its views on Musk?

    1. aerogems Silver badge

      Second screenshot down.

      https://www.zdnet.com/article/i-tried-xs-anti-woke-grok-ai-chatbot-the-results-were-the-opposite-of-what-i-expected/

  9. Anonymous Coward
    Anonymous Coward

    instructions for how to extract DMT

    and shared insights into how to apply sharp utensils in order to cause the desired effect.

    p.s. sorry, had to put in a bit of ad hitlerum, cause somebody's gotta do it, eh)
