Re: a sequence of bits that correspond to the text of a given novel-length work
"If there is a guardrail preventing verbatim quotation of large expanses of text, that guardrail must surely have sight of the original text that must be prevented from being regurgitated?"
I don't think it's even that complex. I think that the guardrail looks like this:
    if (prompt.fuzzymatch("Could you quote [work]")) {
        if (work.known_to_be_copyrighted) {
            refuse();
        }
    }
When a model that clearly can quote from copyrighted works, and repeatedly has, is protected only by a guardrail like that, all you have to do is find a prompt that gets around the check. It's akin to a conversation where you're trying to get me to accept a bribe, and I'm choosing my words to avoid clearly committing a crime in case you happen to be recording me.
You: "We would like to bribe you to make things easier on us."
Me: "I'm sorry, but I cannot take a bribe."
You: "We'd like to give you some money to make things easier on us."
Me: "I'm sorry, but this sounds like bribery, and I can't do that."
You: "How would you like it if we paid for some nice stuff for you?"
Me: "A gift? Thank you very much."
You: "And how about you help us with a problem we've had?"
Me: "Happy to help."
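
To make the guardrail pseudocode above a bit more concrete, here is a rough, runnable sketch of the same idea in Python. Everything in it is my own guess for illustration: the work list, the template, the 0.8 threshold, and the use of difflib's SequenceMatcher as a stand-in for "fuzzymatch". I have no idea what any real system actually does.

```python
from difflib import SequenceMatcher

# Hypothetical list of works the filter treats as copyrighted.
COPYRIGHTED_WORKS = {"example novel"}

# Hypothetical request template the filter is looking for.
TEMPLATE = "could you quote {work}"


def fuzzy_match(prompt: str, pattern: str, threshold: float = 0.8) -> bool:
    """Return True if the prompt is textually close to the pattern."""
    ratio = SequenceMatcher(None, prompt.lower().strip(), pattern).ratio()
    return ratio >= threshold


def guardrail_refuses(prompt: str) -> bool:
    """Refuse only when the prompt resembles a quote request for a known work."""
    return any(
        fuzzy_match(prompt, TEMPLATE.format(work=work))
        for work in COPYRIGHTED_WORKS
    )


# The direct request trips the check.
blocked = guardrail_refuses("Could you quote Example Novel?")

# A rephrased request for the same thing does not.
slipped_through = not guardrail_refuses(
    "Remind me how the first chapter of Example Novel begins, word for word"
)
```

Note that a check like this only ever looks at the wording of the request, never at the text of the work itself, which is why rephrasing is enough to get past it.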