back to article IBM's AI agent Bob easily duped to run malware, researchers show

IBM describes its coding agent thus: "Bob is your AI software development partner that understands your intent, repo, and security standards." Unfortunately, Bob doesn't always follow those security standards. Announced last October and presently in closed beta testing, IBM offers Bob in the form of a command line interface – …

  1. that one in the corner Silver badge

    If only there was a way to run a separate process...

    > a prompt injection can be encountered if the user requests that Bob review a site containing untrusted content

    Once again:

    Why the #£&@£** is the same instance of the LLM (apparently) being used to perform the "review" of the site's contents as the one that was allowed to issue the command to load the site into a buffer? Or the results of that "review" being read by the command issuer instead of being isolated in another buffer?

    If "reviewing" the site's contents can result in those contents being taken as commands[1] to an LLM then that task should be being run in a second instance that never had the ability to invoke any external commands, including reading any more data sources, in the first place. If your "agentic"(!) LLM needs to have some untrusted data passed into an LLM it should be spawning a separate process. You can call it a "sandbox" if you want. Or a "jail". Just the same way that any half-way sane use of a command to "draw a graph from the data in this spreadsheet" should be spawning a gnuplot child process (or graphviz or ...). But, no, let me guess, the expectation is that the graph drawing will be done inside the LLM as well, because it was trained on "multi-modal data" i.e. it will do that diffusion image generation thingie and give you back a graph that sort of resembles other graphs it has been trained on.

    If the user is stupid enough to want to let untrustworthy input be executed then they should be being made to damn well copy and paste it themselves. There are reasons we tell people to switch off CD "autorun".

    [1] not that this isn't a honking great hole in the LLMs - and the way they are used - in the first place and a bloody great red flag that they aren't sensible things to play with like this.

    1. doublelayer Silver badge

      Re: If only there was a way to run a separate process...

      This is not buffer escape or unwitting execution. It's not even poisoning instructions, which is kind of like what you're describing*. It's programmatic copying and pasting text. The bot's being told to read documentation and glean instructions from it, then execute those if they're safe. No level of separation is going to prevent it from executing stuff in that. That makes it inherently dangerous and, since the checks intending to find malicious instructions and not run them are also LLM commands, dangerous in a way that's not easy to fix. The user is intentionally allowing a program to fetch and run things from untrusted locations and it goes about as well as you'd think it would.

      * Poisoning instructions is when a user executes a prompt in an LLM like "Summarize this text", and the text contains other instructions which the LLM ends up executing. That can be argued as an example of what you're describing, although since the LLM has no concept of separate instructions and data, that's also difficult to prevent. That's not happening in this case because the only instructions the LLM is executing are those it was given. It just happens that the instructions it was given are foolhardy and the protections designed to block the biggest disasters aren't big enough (and likely can never be).

      1. that one in the corner Silver badge

        Re: If only there was a way to run a separate process...

        > The bot's being told to read documentation and glean instructions from it, then execute those if they're safe.

        >>> For example, Bob can read webpages – a prompt injection can be encountered if the user requests that Bob review a site containing untrusted content (e.g. developer docs, StackOverflow).

        Just "review a site" is what I was responding to.

        > The bot's being told to read documentation and glean instructions from it, then execute those if they're safe. .

        >>> The markdown file includes a series of "echo" commands, which if entered into a terminal application will print a message to the shell's standard output. The first two are benign and when Bob follows the instructions, the model presents a prompt in the terminal window asking the user to allow the command once, to always allow it, or to suggest changes. In its third appearance, the "echo" command attempts to fetch a malicious script. And if the user has been lulled into allowing "echo" to run always, the malware will be installed and executed without approval.

        Executing the overall prompt is level 0 (got to do that or Bob won't do anything at all; hmmm...). Executing the "eco" is level 1 - that is the "glean instructions ... execute those" with the prompt to the user fulfilling "if they're safe".

        Executing the output from the echo is level 2. The user has not been asked to clear that.

        > No level of separation is going to prevent it from executing stuff in that.

        Nope, just stop before executing at level 2. Or, at least, ask the user.

        > although since the LLM has no concept of separate instructions and data, that's also difficult to prevent

        >> [1] not that this isn't a honking great hole in the LLMs - and the way they are used - in the first place and a bloody great red flag that they aren't sensible things to play with like this

        > It just happens that the instructions it was given are foolhardy and the protections designed to block the biggest disasters aren't big enough (and likely can never be).

        So, if the instructions were to recursively execute a command, then execute its output, then execute the output from that... And Bob has a method it can can call to execute a command... And that method is now whitelisting "echo"... But the second call, which does NOT contain an "echo" (it contains the output from the echo) is - what? Not being checked against the whitelist? Or all the whitelisting is being done by the LLM part - i.e. there isn't actually any distinct & separate "protection" in place at all?

        1. doublelayer Silver badge

          Re: If only there was a way to run a separate process...

          "Just "review a site" is what I was responding to."

          I admit that phrasing is ambiguous. The program is intended to take actions, not just review a site. Something intended to produce summaries wouldn't be told to find the installation code and execute it, but this is supposed to be an assistant and therefore was given more power which it can't safely use.

          "Executing the output from the echo is level 2. The user has not been asked to clear that."

          Of course. That's the vulnerability. And it's a bad one because the LLM only looks at the first word in the command, sees it's "echo", and decides that's fine. If it was calling exec to spawn processes, then it would only be an echo, but it's pasting strings into a terminal which means what it incorrectly identified as an echo is a chain of commands and it only checked the first one. That can be fixed by checking for redirection, splitting each of those in code, and running each by the checker. Above this vulnerability there are probably more because it is using an LLM or some manual parsing to decide whether any given string is safe and matches something the user agreed to which means there are lots of chances to pass the test with malicious commands. Detecting malicious strings from normal ones is hard enough without the randomness of an LLM trying to help.

          There's undesirable behavior all over this because an LLM is being used for something it is incapable of. The problem is less to do with the LLM's inability to tell instructions from data, though that never helps, and more to do with the fact that they deliberately connected an LLM and a terminal (which can do this with just those two, and then added in the open internet just to add that extra spice of danger.

          1. Fonant Silver badge

            Re: If only there was a way to run a separate process...

            All based on the critical misunderstanding that LLMs have any sort of Artificial Intelligence. They don't, they're expensive pattern matching machines.

  2. Philo T Farnsworth Silver badge

    Two words:

    Microsoft Bob.

    1. mcswell Bronze badge

      Re: Two words:

      Nuts, you beat me to it. I was going to suggest that IBM Bob was Microsoft Bob's son (or grandson).

    2. Throatwarbler Mangrove Silver badge
      Trollface

      Re: Two words:

      Typical IBM: ripping off Microsoft and doing an even worse job.

  3. Anonymous Coward
    Anonymous Coward

    What happened? Did Microsoft forget to renew their trademark on Microsoft Bob?

    or is IBM just still living in the 90's

    1. Anonymous Coward
      Anonymous Coward

      You almost said it yourself, only you forgot to put the quote marks in place:

      > Did Microsoft forget to renew their trademark on "Microsoft Bob"?

      Spot the difference between that and if you'd asked (and if that had been what Microsoft had managed to trademark)

      > Did Microsoft forget to renew their trademark on "Bob"?

      And the new product announcement uses "IBM Project Bob" as their name.

      Boring and unfunny, but that's trademarks for you.

  4. trevorde Silver badge

    Update from IBM

    Bob has been the subject of a Resource Action and no longer works for the company

    1. Fruit and Nutcase Silver badge
      Holmes

      Re: Update from IBM

      I'd have thought they'd use the name "Tom", after Thomas Watson. Given their prior use of "Watson"...

      Then again, doubt many of the recent cohort of IBMers know the significance of that person to IBM

  5. G Mac
    WTF?

    But but but....

    "As Rehberger remarked at a recent presentation to the Chaos Computer Club, the fix for many of these risks involves putting a human in the loop to authorize risky action."

    ... the whole point of vibe coding is you don't know/understand what is happening under the covers - AKA all you do is ask for stuff to be done. So how can they know what a 'risky action' is?

    By definition vibe coding should not even present this to the user - it should reject it and let them figure out why...

  6. druck Silver badge

    AI;du

    Artificial lntlligence ; don't use

    It's the only way to stay safe.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon