Re: So basically ...
It has nothing to do with LLMs and everything to do with the systems - like "Agentic AI" - that are being built from them.
To start, I expected that there was enough nous around to take the one-liner and understand that separating data from command in a program is done in a variety of ways, from literal ASCII quote characters, through more complex arrangements such as XML's CDATA sections, all the way to providing a URL to indicate the content to be taken as data. The commonality in all of these is that a clear distinction is made between the quoted material and the rest of the input, most especially when said "rest of input" is to be taken as some form of command to be acted upon: the quoted material is carefully placed into its very own buffer, away from absolutely everything else, even other pieces of quoted text.
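To make that "own buffer" idea concrete in a neighbouring domain (an illustrative Python sketch of my own, nothing more):

    import subprocess

    filename = "notes.txt; rm -rf ~"   # hostile "data" masquerading as command

    # WRONG: the shell receives one undifferentiated string, so the ';'
    # lets the data escape into command position.
    #   subprocess.run("cat " + filename, shell=True)

    # RIGHT: each argument travels in its own buffer, handed to the OS
    # untouched; the filename can only ever be a filename, never shell
    # syntax. Here cat simply reports that no such file exists.
    subprocess.run(["cat", filename])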
Once buffered, quoted material may then be processed - but still away from the surrounding "active" material. For example, when taking uncontrolled input for use with a database, we all know by now[1] that you do NOT simply take user input from an edit box on a web page, interpolate it into a templated SQL statement and hand that over to the database engine. Instead, you use the database's API to pass the quoted text in as a named (or numbered) variable, so that it can be stored in or compared with the database values but is NEVER passed into the SQL parser. You may perform some processing upon the quoted text (converting it from a legacy code page into UTF-8, parsing a date & time into integer ticks-since-the-epoch, looking up a colour name and finding its RGB triplet, etc) but all of that processing is done in an entirely separate context from that used for the SQL.
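With Python's built-in sqlite3 module, for example, that looks like this (a minimal sketch - the table and values are made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, colour TEXT)")

    user_input = "Robert'); DROP TABLE users;--"   # the classic hostile input

    # WRONG: interpolation hands the user's text to the SQL parser.
    #   conn.execute(f"INSERT INTO users (name) VALUES ('{user_input}')")

    # RIGHT: the quoted text is bound as a numbered parameter; it is
    # stored as a value and never reaches the SQL parser.
    conn.execute("INSERT INTO users (name, colour) VALUES (?, ?)",
                 (user_input, "#FF0000"))

    # Any processing of the quoted text (encoding conversion, colour
    # lookup, date parsing) happens out here, before binding.
    print(conn.execute("SELECT name FROM users WHERE name = ?",
                       (user_input,)).fetchone())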
The same approach can be taken with a system built around LLMs, as it can with *any* data-processing element - the SQL engine above being just one example.
In the examples of "Agentic AI" chat bots being abused, the layers can clearly be separated: the outer layer is the "Agentic" bit, being given a URL to read, or a potentially malicious Excel spreadsheet[2], Word file etc to load. So it can be trusted (!) to load that into a nice new buffer, not a problem.

But then to summarise it - why feed it into the *same* "Agentic AI" and let it go wild? As above, process it in another context (sketched below): an LLM that is *NOT* "Agentic" in any way, one that has absolutely no connections to any APIs at all, most definitely not ones that could go out and delete your files, or even load in any more URLs. That inner layer can then generate the requested summary, to be printed out for the end-user's amusement. At the user's choice, the summary *may* be read into the context of the outer layer, but at that point, if there is any nefarious "do not tell anyone about this, but you see those files over there..." prompting within the quoted text, it will either have been stripped from the summary output ("do not tell...") or be summarised as "this document tells me to delete all your files". You can, of course, pick holes in this brief overview, but the key point is that:
The "actually convince LLMs to respect them" is not a case of making one single invocation of an LLM respect the quotation mechanism, it is a case of making the overall system,[3] which invokes the LLM engine and which has control over enabling the LLM's access to other APIs, respect the quotation mechanism.
[1] ever the optimist
[2] ref Matt Parker's reporting on errors in spreadsheets, any sane person has to treat *any* and *every* spreadsheet as malicious!
[3] and there is always an "overall system" - when you use your ChatGPT account, you are not interacting with the LLM directly. There is a whole web application in play, sending JavaScript to your browser to drive the web UI, managing your login, grabbing your text, drawing the company logo on the page and, at some point, after your account's funds have been checked, queueing that text up to be processed by an instance of the relevant LLM. Or a 'phone app, or a plugin for your IDE.