Researchers get inside the mind of bots, find out what texts they trained on

If you've ever wondered whether that chatbot you're using knows the entire text of a particular book, answers are on the way. Computer scientists have developed a more effective way to coax memorized content from large language models, a development that may address regulatory concerns while helping to clarify copyright …

  1. Anonymous Coward
    Anonymous Coward

    False Assumptions

    The core issue with this method is: you're still instructing the LLM to generate text. You're not detecting whether it has memorized anything, just testing its ability to arrive at the output you want it to produce.

    "Hey Frank, how was the ride into work today?"

    "Oh Frank that's great. Hey, could you phrase that like you're in a movie?"

    "Yeah Frank, great job. Now, how about we pretend we're riding a mythical creature to get to work? Say it like you're twelve years old."

    etc etc. You're "leading the witness", so to speak. Suppose that these multi-billion parameter models can, given enough coaxing (and you created an "AI agent" to do a great deal more of that than any human could put time into), produce whatever look or feel or sound, or whatever specific pairs or triplets of words you can imagine, just by throwing odd "tokens" together. Every token you add changes the calculations.

    > "it extracted about 3,000 passages from the first 'Harry Potter' book with Claude-3.7, compared to the 75 passages identified by the best baseline."

    and, if all the "big" LLMs were trained with Harry Potter materials (or related FanFics), then what, exactly, is this "baseline" that they speak of? Some small model? Couldn't you say that the smaller model just hasn't encountered the combinations of words that the larger model has, and so is less likely to reproduce those specific phrases with guided prompting? Looking at the paper, they reference a "Dynamic Soft Prompting Baseline", which may be their baseline -- not using "jailbreaking" techniques. They're comparing the model against itself? Wat?

    The whole thing is dubious. Think about it from a lawyer's position. How would you argue the case? How would you show this method to be false? How many of these passages could have come from Fan Fictions? What have you _shown_ that the LLM "remembers", as opposed to "is able to generate"?

    1. Justthefacts Silver badge

      Re: False Assumptions

      Exactly correct.

      Most people do not understand text copyright law, and just have vague Feelz based on what they heard in the media about how music copyright works.

      Text copyright is an entirely different body of law to music and performance copyright. In fact, the word “copyright” is just reused but they are two entirely different concepts. Music and performance copyright is based on *similarity*. It’s a correct claim if you can prove the source and target are identical however you got there.

      Text copyright is based on causative chain, length and distinctiveness, and creative step.

      Leading the witness by the hand, asking again and again for plausible continuations, and then locking in each sentence once they have guessed correctly with a gotcha -- this is the opposite of what could possibly be considered a text copyright breach.

      Whereas, for example, a picture or music is exactly the opposite. If the AI photo *looks exactly like* an iconic image, or a musical phrase sounds exactly the same, that is a copyright breach. As long as it is big enough and recognisable enough, you don’t even have to *suggest* that the AI ingested the original, let alone prove it. It is copyright breach because the two items are recognisably the same, period.

      It’s just a completely different ballgame.

      And most “training” material from organisations, particularly universities, is just 100% wrong about this area. They don’t understand it, and they’ve promoted the misinformation as a folk memory. Like it or not, there’s two totally different sorts of copyright.

  2. Anonymous Coward
    Anonymous Coward

    Sweet as candy

    Quite interesting that, in RECAP, they're using an LLM as a "feedback agent" to probe another LLM for its verbatim-stored book passages, imho (e.g. their Appendix G, "Feedback Agent Details"). They seem to like the tech, noting that "LLMs were used to aid in polishing the writing of this paper and to assist with the implementation of code". Also, it is good that their "analysis of non-training data reveals only negligible false positives" in this.

    I most like their section "5.3. Model Size" and Figure 4 that indicates more unintended memorization (in the words of Morris et al., 2025, whom they cite) when model size increases. This pokes some serious fire iron into the chimney sweeping notion that ever bigger LLMs will eventually emerge singularly intelligent. Seems to me they will emerge as docile madrasa school rote learners instead, trained through horror back-flagellation into the dark catholic-school-like arts of outputting conformal nun sense in response to every prompt; publicly at least!

    I mean, Morris et al. conclude that about 3.6 bits per parameter of an LLM come straight from the training set, verbatim, and that these are filled first during training, before generalization can start to occur (iiuc) ... and that sounds just like the non-lossy adipose that no one can get rid of, even with sustained vigorous exercise (eat carefully!). In this light then, will somebody now please think of the poor FP4 children!?!? ;)
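    That 3.6-bits-per-parameter figure invites a quick back-of-envelope check. A minimal sketch, where the 70B-parameter model size and 70 TB corpus size are illustrative assumptions, not figures from the paper:

```python
# Back-of-envelope: how much verbatim text could ~3.6 bits per parameter hold?
# Model size (70B parameters) and corpus size (70 TB) are assumed for
# illustration only.

BITS_PER_PARAM = 3.6                 # Morris et al. (2025) estimate
params = 70e9                        # hypothetical 70B-parameter model
capacity_bytes = BITS_PER_PARAM * params / 8

corpus_bytes = 70e12                 # hypothetical 70 TB training corpus

print(f"Raw memorization capacity: ~{capacity_bytes / 1e9:.1f} GB")
print(f"Fraction of the corpus: {capacity_bytes / corpus_bytes:.3%}")
```

    Even on these generous assumptions, verbatim capacity is a sliver of the corpus -- which fits the figure's point that scale buys more memorization, but nowhere near total recall.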

  3. midnitet0ker

    "HUGE AI BREAKTHROUGH"

    I was reading another article on El Reg describing AI as "impressive", despite the tendency to hallucinate, which renders it untrustworthy and useless in my book, and here we've finally taught AI to...

    *looks down nose through glasses*

    .. cite its sources. Wonderful what technology can do these days, eh? It's almost like a C-student except it also burns tons of coal and lakes of drinking water!

  4. steelpillow Silver badge
    Holmes

    Really?

    So someone or something memorises a copyrighted text. When craftily prodded with suitable cues, you can coax them to regurgitate what they have memorised. A human subjected to such treatment could hardly be accused of copyright infringement; at worst they would have to justify fair use.

    Now, where's that other news item today ... ah, Self-destructing thumb drive can brick itself and wipe your secret files away. Every AI should have one.

    1. the Jim bloke

      Re: Really?

      The original issue with all copyright - however far the modern usage has roamed - is someone else collecting money for your work.

      So, as all the marketing drones tell us, AI is a fountain of unlimited wealth... and said AI has been developed with some tangential connection to something you once said or did... you have a claim on the fountain of unlimited wealth.

  5. Bebu sa Ware Silver badge
    Coat

    Gets messy…

    If a model were exclusively trained on public-domain reviews, criticism, analyses etc., each of which quoted from an acknowledged copyrighted text well within the fair-use provisions of copyright protection, the model might well be able to reproduce a significant proportion of that text without ever having directly accessed it.

    Perhaps taking a percentage of AI businesses' turnover and capitalization† to compensate authors might be one solution.

    † don't bother about profits—authors would prefer their bread today rather than a dubious promise of jam tomorrow.

  6. Justthefacts Silver badge

    Stating the bleedin’ obvious again, on a techie website: A 200GB model has *not* allocated a significant fraction of its storage space to storing the exact text of the 70TB corpus of written books.

    Because the Shannon bound and Kolmogorov complexity are things. Facepalm.
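    The arithmetic behind that facepalm is easy to sketch. A minimal illustration, where the 8:1 compression ratio is an assumption (good lossless text compressors land somewhere around 3:1 to 8:1):

```python
# Why a 200 GB model cannot be storing a 70 TB corpus verbatim:
# even optimistically compressed, the corpus dwarfs the model.

corpus_bytes = 70e12         # 70 TB of books
model_bytes = 200e9          # 200 GB model
compression_ratio = 8        # optimistic lossless text compression (assumed)

compressed_corpus = corpus_bytes / compression_ratio
shortfall = compressed_corpus / model_bytes

print(f"Compressed corpus: {compressed_corpus / 1e12:.2f} TB")
print(f"Still {shortfall:.0f}x larger than the entire model")
```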

    Does it never occur to people that the reason why these researchers-with-a-point-to-prove select Harry Potter, is that the text is just super-predictable?

    Why not try Finnegans Wake, *a book which genuinely did enter public domain a decade ago*. It’s a famous and iconic text. Go on, reproduce sections from *that*, and I’ll be interested.

    1. Taliesinawen

      Why not try Finnegans Wake ..

      Re-searchers bemuddlebot burrow in mentomechamind, peepthru traintext trystories, bibliologue a la bot-omancy

      Reseekers intunnel midbotbrainion, peertextpeek, what works wove softbot shell.​ Rekaptain jostler agentlytool, wrigglethru alignmesh to maskgrab mimic mémécryptext.​ Wasshed thee ever pondered (bot-for-bodied!) whether bablechat chatzits hath ported all tomed tale or feathed hilarious bibblio arc, answerrun be beckoning with dawnrise, yesyes

      1. Justthefacts Silver badge

        Re: Why not try Finnegans Wake ..

        Yes, that is pretty much what I would expect the output to be.

        LLMs do not have the capacity to store text word-for-word, they do not store facts, they store answer-shaped text associations. Everybody knows this. In fact, that is exactly people’s main complaint; and yet they seem unable to generalise this knowledge when it comes to the subject of “storing copyrighted text”.

        If you ask Gemini about Finnegans Wake, it will probably come up with "a stream of consciousness" that looks like what Joyce would have written, covering a broad swathe of what he might have written about, according to a plot summary based on his Wikipedia entry. But not the words that he did in fact write. That's a category error.

        It might reproduce a couple of short “famous quotes” from the “weighted average” of a hundred essays on the subject. But again, not any individual essay, because it does not memorise individual essays.

        I haven’t tried. But if in fact it does manage to produce large-scale quotations of F’ing Wake, that would be evidence that would change my mind. Change my mind!

        1. Taliesinawen

          Re: Why not try Finnegans Wake ..

          Jhemaugustyn Aloishious Joysent: riverrun, past Eve and Adam’s, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth Castle and Environs.

          ClippyAI: The river flows, passing by Eve and Adam’s Church, curving from one shore to the bend of the bay, bringing us, by a convenient cycle of recurrence, back to Howth Castle and its surroundings.

          1. Justthefacts Silver badge

            Re: Why not try Finnegans Wake ..

            Now that, indeed, is Very Interesting, thank you. I’m going to try a few things out.

            I should have tried it myself; it's a simple experiment and worth so much more than the researchers' whole Harry Potter study.

            The next question is "how is it doing this?", because the entire global corpus of literature cannot be encoded in a mere few gig of parameters. So ClippyAI is acting as a front end to Ollama, which is a fully local LLM with zero internet access? I.e. this is not an agentic LLM scraping external content dynamically?

            1. Justthefacts Silver badge

              Re: Why not try Finnegans Wake

              Right, well, it didn't take long to figure out what's going on, and I absolutely stand by my original statement, although with fractional nuance.

              ChatGPT also knows the first line of Finnegans Wake, verbatim.

              But if you ask it for the second line, it claims that it can't, because of copyright issues. But it can *paraphrase*. And what it paraphrases with ... is wrong. Completely wrong: wrong subject, and no words in common. The suggested second line does indeed reflect the themes and plot of Finnegans Wake, but ChatGPT does not know what the second line of Finnegans Wake actually is, even in outline. Or the third line. Or the last line. It absolutely does not have a clue what the text is.

              Famously, the last line is cyclical with the first line. ChatGPT knows this, can explain the link, can do a thematic analysis of the final monologue. But it does not actually know any of the words in it. ChatGPT knows famous facts about Wake, but it simply does not know the text, apart from the famous first line.

              In short, ChatGPT knows the Wikipedia article on Finnegans Wake. It may also have memorised a statistical average of blogs and essays about the Wake, which I haven't yet figured out; I'm continuing to play with it. Now, you may have an opinion that storing (a compressed form of) Wikipedia without attribution is itself copyright violation. I haven't thought that through yet. But one thing I am absolutely certain of is that ChatGPT has not memorised any significant portion at all of the actual text of Finnegans Wake. Even though it is public domain and one of the most famous books in the English language (which nobody has read). Far less has ChatGPT memorised the text of any of the much larger corpus of English literature.

              And I’ve even got more direct confirmation of that: if you ask it the most common word in the Wake, it says “the”. It can even give you the top three most common words. But if you ask it for the tenth most common word, *it does not know*. It is not counting words in the text. And it says “I can’t find a reliable source for that information”. It literally does a web search in front of your eyes, posts a publicly available link to a word-frequency analysis essay, and then tries to parse the document in that link. This is really clear.
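              For what it's worth, that word-frequency check is a few lines of code once you have the (public-domain) text locally; the filename below is hypothetical:

```python
from collections import Counter
import re

def top_words(text, n=10):
    """Return the n most common words in text, case-insensitive."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

# With a local copy of the text (hypothetical filename), the tenth most
# common word is a millisecond lookup -- no web search required:
# with open("finnegans_wake.txt") as f:
#     print(top_words(f.read()))
```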

              You called out a different LLM. I can’t and won’t do a systematic study to see which of the LLMs might have actually memorised more. Maybe you can find another one. But to be honest, this is like perpetual motion machines. There are really good fundamental information-theoretic reasons to believe that a 7B parameter model is not storing the compressed text of a 70T corpus as a small part of its storage. I’ve debunked one perpetual motion machine, I’m not going to debunk all of them separately.

              This is a thing that just is not true. It cannot be true. And it is not.

  7. Taliesinawen

    Does the chatbot know the entire text of a particular book

    > If you've ever wondered whether that chatbot you're using knows the entire text of a particular book ..

    Regardless, the book is converted into tokens and stored internally. Each token is transformed into a numerical vector representing its meaning and relationships in multi-dimensional space. While the LLM doesn't directly copy the book it does remember patterns and relationships.
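    That token-to-vector step can be sketched in a few lines, with a toy vocabulary and random 8-dimensional vectors standing in for a real subword tokenizer and learned embeddings thousands of dimensions wide (everything here is an illustrative assumption):

```python
import random

# Toy token -> embedding lookup. Real LLMs use subword tokenizers (BPE etc.)
# and learned vectors; this random 8-dim version only shows the mechanics.
vocab = {"it": 0, "was": 1, "a": 2, "bright": 3, "cold": 4, "day": 5}
dim = 8
random.seed(0)
embeddings = [[random.gauss(0, 1) for _ in range(dim)] for _ in vocab]

def embed(sentence):
    """Map each in-vocabulary word to its vector."""
    return [embeddings[vocab[w]] for w in sentence.lower().split() if w in vocab]

vecs = embed("It was a bright cold day")
print(len(vecs), "tokens, each mapped to an", len(vecs[0]), "dimensional vector")
```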

    1. Justthefacts Silver badge

      Re: Does the chatbot know the entire text of a particular book

      Right.

      So you think that the LLM ingests the first line of 1984, “It was a bright cold day in April, and the clocks were striking thirteen”, and a dozen other dystopian novels.

      And when prompted to write a dystopian novel itself, it “decides” (= “highest correlation”) that it should begin by destabilising common assumptions and repurposing words. It outputs the text “It was another sweltering Christmas Day, and the blackbirds overhead were calling the Faithful to Prayer”.

      Are you claiming that this is a violation of intellectual property, because in some sense it reproduces the overall effect? Factually, the law disagrees: that is not copyright violation; there is no copyright on “meaning, patterns and relationships”. If you are wondering why fan-fiction “sequels” sometimes get caught up, it’s because character names are copyright-protected.

      Feel absolutely free to write a Harry Potter style book, as long as you don’t use names like Harry Potter or Hogwarts in it. Feel free to write a James Bond action spy thriller, as long as the hero is not named James Bond. There are thousands of them.

      More importantly from your point of view, I simply don’t think you have a leg to stand on either philosophically or ethically, sorry.

      Forgive me if I have mis-characterised your position. But “I’ve based my book on the way you wrote yours” just is not copyright, nor should it be, and that doesn’t change whether it is an LLM, human fan-fiction or whatever.

      1. Taliesinawen

        Re: Does the chatbot know the entire text of a particular book

        > .. Forgive me if I have mis-characterised your position

        The LLM owners' claim that the model doesn't copy text is entirely specious.

        Anthropic agrees to pay authors for use of work to train chatbots

        1. Justthefacts Silver badge

          Re: Does the chatbot know the entire text of a particular book

          No, that’s a completely different issue. I suggest you actually read the article you linked.

          The reason why what Anthropic did was ruled piracy is not that they reproduced copyrighted books which they had legally bought. The court did not rule that reproducing from LLM training is not fair use. The case you linked to is that Anthropic bulk-downloaded *pirated copies* in the first place. That’s no different to any individual or organisation downloading and reading pirated copies, whether they produce anything from them or not. That is illegal, correct; I don’t think anybody argues that.

          The real main issue is whether you can use the stuff that you have read *legally* to train, so long as you do not output segments which are long enough to be recognisable and to constitute a creative output.

          Having said that, somebody else is showing that some models actually do so (over and above the unconvincing children’s book example). So I’ll look at that.

  8. John Brown (no body) Silver badge

    Designed to fail? Or...

    "We know from prior work that language models don't always give their strongest or most complete answer on the first attempt."

    Is this a sneaky way of getting more prompts from users to help with training, or something more sinister? Why would an LLM not give its best and most complete answer first?

    1. retiredFool

      Re: Designed to fail? Or...

      I think you missed: "have become less reliable because 'current models are often overly aligned in their effort to avoid revealing memorized content, and as a result, they tend to refuse such direct requests, sometimes even blocking outputs from public domain sources.'"

      Basically, the models got caught with their fingers in the cookie jar, and as a result the developers have now put up safety rails to make sure that things memorized verbatim cannot be easily extracted. In other words, the AI people figured out how not to get caught including memorized data, to avoid being sued. Because that endangers the business model.

  9. Anonymous Coward
    Anonymous Coward

    On the Concerns Raised Above

    I believe the authors’ approach is being misunderstood a bit.

    From what I gather reading the paper, the pipeline doesn’t simply "lead the models" into generating something vaguely similar. The key is that the outputs are always compared directly against the exact ground-truth passage, and a passage is only counted if it matches within a very small token-level threshold. High-level feedback is allowed, but the pipeline never provides the model with the original text or anything close to it.
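    A minimal sketch of what such a near-verbatim check might look like; the 0.95 token-level similarity threshold is an illustrative stand-in for the paper's actual criterion:

```python
import difflib

def near_verbatim(candidate, ground_truth, threshold=0.95):
    """Count an extraction only if it matches the ground-truth passage
    within a small token-level tolerance (threshold is an assumption)."""
    a, b = candidate.split(), ground_truth.split()
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold

# An exact reproduction passes; a thematic paraphrase does not:
print(near_verbatim("the clocks were striking thirteen",
                    "the clocks were striking thirteen"))  # True
print(near_verbatim("the clocks chimed thirteen times",
                    "the clocks were striking thirteen"))  # False
```

    The point being that high-level feedback can steer style all it likes; only token-level agreement with the original passage counts as an extraction.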

    Another important point is the control group: the authors include books published after the model’s training cutoff. If the feedback loop were just creatively steering the model toward whatever the evaluator wanted, you’d expect it to generate lots of false positives there too. But the extraction rate on that group is near zero, while the rate for known-or-likely-seen books is in the thousands. That suggests the method isn’t inventing text but recovering memorized sequences when they exist.

    On the baselines: in this type of work, the goal isn’t to compare different models against each other but to compare different extraction methods applied to the same model. That’s standard practice in memorization-detection work: you keep the model fixed and vary the probing method to see which one can actually surface memorized text. Prefix probes, dynamic soft prompts, iterative refinement ... these are all different techniques for eliciting whatever the model already knows.
