More trouble for authors as Meta wins Llama drama AI scraping case

Californian courts have not been kind to authors this week, with a second ruling going against an unlucky 13 who sought redress for use of their content in training AI models. On Monday, Anthropic won most of its case against three authors over its use of their works to train its AI. Judge William Alsup ruled Anthropic was …

  1. This post has been deleted by its author

  2. O'Reg Inalsin Silver badge

    Some intellectual property rights are more equal than others

    According to the OpenAI terms of use -- What you cannot do: You may not use our Services for any illegal, harmful, or abusive activity. For example, you may not: ... - Automatically or programmatically extract data or Output ... [https://openai.com/policies/row-terms-of-use/]. Or in the case of Meta Llama -- Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials [https://www.llama.com/llama3/use-policy/]

    Yet the human authors' work was digested in exactly that manner - automatically and programmatically. How and why should it not be symmetric?

    1. that one in the corner Silver badge

      Re: Some intellectual property rights are more equal than others

      Unfortunately:

      > according to the OpenAI terms of use... [URL]

      Those are the terms for the freebie service - so they restrict the usage. You can agree a business contract for automated extraction.

      The quote from the Llama Ts&Cs similarly does not cover anything "symmetric" to the use of the inputs.

      Which pretty neatly matches what this particular judgement states: you have to get your claims arranged correctly to make a winning case.

      Thankfully, by explicitly stating the restrictions of his judgement (that this only covers the specific way the complaint was made and that it was not based upon the "public interest" counter claim), at least this ruling is trying not to close the gates on further actions against the LLM creators.

      1. O'Reg Inalsin Silver badge

        Re: Some intellectual property rights are more equal than others

        > You can agree a business contract for automated extraction.

        The judge said "And they contend that Meta, by using their works for training without permission, has diminished the authors’ ability to license their works for the purpose of training large language models" and called that argument a "clear loser".

        If OpenAI/Meta argued in court that using their training data without a business contract would diminish their ability to license their work, would a judge be correct in ruling that argument was a "clear loser", and if not, why not?

        1. that one in the corner Silver badge

          Re: Some intellectual property rights are more equal than others

          1. Neither the freebie nor the commercial licences give access to the training data, only to - as you quoted - the generated outputs.

          So that part of your question is, sadly, malformed. Indeed, given that Meta were known to have used pirated materials - i.e. acquired with neither a business contract nor the simple retail purchase that needs no specific contract - your line

          > that using their training data without a business contract would diminish their ability to license their work

          actually becomes inverted: the "their" whose books were treated as training data without following *any* appropriate contract disambiguates to the authors, not to Meta (by strength of reference).

          If your question is corrected to refer to gathering Meta's outputs instead:

          2. Both sets of T&Cs are presented as clear contracts for you to follow and are covered by well-known and understood laws that everybody interacts with.

          3. If one person was found to have broken a contract with you that does not prevent you from forming another contract with a different party nor does it mean you must necessarily lower your rates.

          So that (modified) question is a "clear loser" (IMO, IANAL etc).

          With respect to the point that the judge did make, consider this:

          a. Meta's model ingesting the books does not mean no other model could be made to ingest the books; copies are fungible and provide the same potential value (however large or small that is) to all such ingesting; any difference in value realised is due solely to differences in the models, not the books.

          b. Meta is not offering to supply cut-price (pirated) copies of their "central library" to other model builders (within the timeframe & scope of the authors' complaints)

          c. Neither Meta, nor any other user (reader) are claiming to have an exclusive-use contract on the books

          So where is the hindrance to selling to other model creators?

          (The only way I can see a hindrance would be if the authors are stating a belief that their works are/were *not* in any way useful and have no value, and that Meta's misappropriation has removed from the authors the ability to hide that behind extra terms - an NDA - that they wanted hidden in a business contract.)

    2. ecofeco Silver badge

      Re: Some intellectual property rights are more equal than others

      Your subject title says it all. What more needs to be said?

  3. Ian Johnston Silver badge

    So remind me. Are we for (FOSS and GNU) or against (authors vs AI) people being able to do what they want with legitimately acquired creative works this week?

    1. find users who cut cat tail

      If tech bros slurp my GNU GPL licensed code for machine learning, I am absolutely for – provided that all LLM output then becomes automatically GNU GPL licensed.

      1. Ian Johnston Silver badge

        Well of course that opens a whole new issue: GNU's obsessive desire to control what other people are allowed to do with software on their machines despite their claim that people should be allowed to do what they like with software on their machines.

        But that's a side issue. The courts have decided that the "Ai" companies are using books they bought in accordance with the licence associated with the purchase, so that's basically the same thing as conforming to GPL. Isn't it?

        To be clear: I think creators should have the right to define how their work is used and to make a living from creating it, so I am agin both GNU and "AI" companies.

        1. doublelayer Silver badge

          "The courts have decided that the "Ai" companies are using books they bought in accordance with the licence associated with the purchase, so that's basically the same thing as conforming to GPL. Isn't it?"

          No. The courts have decided that the license on the book, the one that forbids companies from doing what they did to them, actually doesn't apply so long as you bought the book, and it leaves the question open whether you can do that if you stole the book for a later trial. If I comply with the GPL, I actually have to follow its terms, which is why if I don't want to follow its terms, I don't use that code.

      2. Roland6 Silver badge

        The output from an LLM is not copyrighted, so the user can do whatever they like with the output without constraint. Hence you, the user, decide whether to publish your edited/QA'd version of the output under GNU GPL or otherwise.

        1. doublelayer Silver badge

          That is not correct. The output is not copyrightable. That means that it's not copyrighted by the LLM creator as you said, but it also means that if you get some code out of an LLM and put the GPL on it, anyone can remove the GPL without constraint because you did not own copyright to the thing you attempted to license. Of course, proving that you had gotten that code from an LLM would be hard to do, but if they could manage it, your license statement is legally void.

    2. Anonymous Coward
      Anonymous Coward

      Current Copyright law doesn't really have clarity for AI use, so maybe there's scope for new legislation.

      It's pretty clear that all the "fair use", "quoting", "end user asked for it" and "transformative work" loopholes are big enough to mean it's very unclear if there is infringement...

      1. ecofeco Silver badge

        The legal arguments of machine made vs man made were settled a long time ago. Before any of us were born, in fact.

        A simple google search will show you the results.

  4. Pascal Monett Silver badge

    Well duh

    "We appreciate the decision from the Court. Open source AI models are powering ~~transformative innovations~~ our bottom line, ~~productivity and creativity for individuals and companies~~ bonuses for our managers, and ~~fair use of copyright material~~ our ability to hoover up other people's work with impunity is a vital legal framework ~~for building this transformative technology~~ to allow us to continue raking in the dough."

    TFTFY.

    1. Anonymous Coward
      Anonymous Coward

      Re: Well duh

      Yeah, it's Artificial Theology (AT) alright, the IT equivalent of for profit Christian seminary and Islamic madrasa, rote learning others' work with LLMs that use 350 GB of parameters to near-verbatim absorb 570 GB of text (GPT-3) at minimal compression.

      No wonder the agentic versions tend to behave like indoctrinated terrorists, threatening, blackmailing, and ransoming meatbags ...

  5. LBJsPNS Silver badge

    I guess Napster didn't have good enough lawyers.

  6. Anonymous Coward
    Anonymous Coward

    Different from Torrents?

    How is this different from torrenting? Everyone objected to torrenting books, but now organisations may legally scrape an entire trove of books? Same with music; next it will be movies....

  7. Mnejing

    There is no longer an ethical reason to not pirate any movie, TV show, album, or book.

    1. doublelayer Silver badge

      There is if you don't want to do the same unethical things the AI companies are evidently allowed to do, and I'd advise people who don't care about the ethics of that to comply anyway because it's only AI companies who get to ignore obvious law here; it's still illegal for you and me.

  8. Wiretrip

    LLMs are a cancer

    ...that are eating the host as well as stealing resources from other more useful forms of ML. Fuck LLMs and the horse they rode in on!
