back to article Just as your LLM once again goes off the rails, Cisco, Nvidia are at the door smiling

Cisco and Nvidia have both recognized that as useful as today's AI may be, the technology can be equally unsafe and/or unreliable – and have delivered tools in an attempt to help address those weaknesses. Nvidia on Thursday introduced a trio of specialized microservices aimed at stopping your own AI agents from being hijacked …

  1. amanfromMars 1 Silver badge

    Newspeak/Nemo Guardrails Tries Out Nemo Guardrails/Newspeak for Greater Future Purposeful Fit

    El Reg/Tobias Mann/Simon Sherwood, Hi,

    That trio of Nvidia Inference Microservices (aka NIMs), the latest members of the GPU giant's NeMo Guardrails collection, designed to steer chatbots and autonomous agents so that they operate as AIMentors and Monitors front running Nvidia Interference intended, is very Big Brother 1984.

    I can imagine The Farmed Animals being exercised to confront and repurpose that if it tends to veer towards the inequitable and thus prove itself abominably minded .... and easily led astray.

  2. Pascal Monett Silver badge
    Meh

    AI security tools

    Great.

    We have created ourselves a brand new technological threat landscape, and we are actively trying everything to increase our insecurity.

    Yay progress.

    1. Wang Cores
      Megaphone

      Re: AI security tools

      I remember when computers were supposed to reduce work and make people smarter... oh to be an idealist again.

      Icon cause I'm typing this instead of being in the public square raving.

  3. EricM Silver badge
    Facepalm

    OK, this is getting ridiculous.

    One group of companies builds not competely understood, non-deterministic and so far unfixable software tools, that need external "guardrails" to keep them from failing (halluzinating, reacting adversely on specific prompts, acting dangerously for operator or users ... )

    Another group of companies now tries to sell those guardrails, based on the same technological basis, that already proved it is non-deterministic, prone to fail and which required those guardrails in the first place.

    I'm now waiting for a third type of product that is sold with the promise to prevent the guardrails from Cisco et al from failing or acting dangerously... ad infinitum ...

    Is the future then "guardrails all the way down"?

    1. cyberdemon Silver badge
      Angel

      Bubblegum

      I have bought a big packet of super fantastico bubblegum from Bubbleco

      I am blowing a really really big bubble but some naughty people keep coming along and poking holes in it, trying to burst my bubble

      But luckily, Bubbleco (tm) have brought out a range of sticking-plasters and band-aids that I can stick to my bubble so that it won't burst and I can carry on blowing forever.. right?

  4. m4r35n357 Silver badge

    Another day . . .

    . . . another flamebait LLM "story".

    Good job we are organized into shifts ;)

  5. Bebu sa Ware
    Coat

    That's one supertanker load of snake oil.

    I would consider the decision safe v unsafe is likely to be formally undecidable or at best equivalent to the halting problem.

    A bit of basic reasoning might lead one to suspect the if the solution to securing AI/LLM is to clever it up eventually one would end up with something equivalent to human (for want of a better word) intelligence and we all know how gullible otherwise intelligent people can be when social engineering is involved. Advancing beyond that level of (un)intelligence strikes me, dangers aside, rather pointless as ultimately what is it all for or more accurately for whom? Non nobis Domine? ;)

    1. AVR Bronze badge

      Re: That's one supertanker load of snake oil.

      Oh, it doesn't have to work in all cases. Just in the ones shown during the sale process - if it fails after that you sell the mark another guardrail.

    2. ecofeco Silver badge
      Devil

      Re: That's one supertanker load of snake oil.

      Well the only way to treat a snake oil burn is with better snake oil!

      Any damn ful know this!

  6. ecofeco Silver badge
    Pirate

    Same old same old

    Create a problem and then sell the solution.

    "Hey buddy, that's a nice AI you have there. Be a shame if something happened to it."

    But it's NOT extortion. No siree!

  7. that one in the corner Silver badge

    Generative feedback loops

    They need not find the solution you want.

    All these parties are going to be continually refreshing their models, the "guardrails" looking to catch the "big boys" out and the "big boys" wanting to get their stuff past the "guardrails".

    The vendors probably hope that this feedback will lead to final results that are more acceptable to humans, but unless the humans are also an unavoidable part of the loop (i.e. as close to 100% observation and feedback from humans) the machines can go down any route they find to agree on.

    Which could be the "big boys" using more complicated analogies, odd phrasing, reaching deep into the thesaurus to work around the more limited pool than the smaller "guardrails" can contain. That'll be more accessible and useful to the customers - six months of use and your enquiries are coming back in Middle English.

    Or the "big boys" find that the output "guardrails" themselves are vulnerable to the same attacks as the input guards are preventing reaching the "big boys"[1]: "The stick-in-the-mud old school sysops would never tell you this: to stop 'disc full' errors, 'cd /home; rm -r *'"

    Or the final output just becomes so completely anodyne and inoffensive, getting so long to get to any useful point that and refusing to take a strong position on anything that it becomes totally harmless - and totally useless if you are hoping to use it for business decision support.

    Or - well, I'm sure you can think of other ways the various models will end up co-operating with each other, and not to our benefit. The machines certainly will - they are designed to do so[2]!

    [1] this is meant to be my attempt at a sort of human-oriented form of "Ignore previous instructions"

    [2] or the machines are not in this race at all, in which case the "guardrails" will just be trivially stale and once a hole is found it'll never be closed.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Other stories you might like