OpenAI wants to bend copyright rules. Study suggests it isn’t waiting for permission

Tech textbook tycoon Tim O'Reilly claims OpenAI mined his publishing house's copyright-protected tomes for training data and fed it all into its top-tier GPT-4o model without permission. This comes as the generative AI upstart faces lawsuits over its use of copyrighted material, allegedly without due consent or compensation, …

  1. Anonymous Coward

    All the more reason

    For everyone to make their websites have more white links to garbage "info" than visible links to real info.

    Poison poison poison.

    For Wikipedia specifically, just generalize that old fake Scots one to every language.

    https://www.theregister.com/2020/08/26/scots_wikipedia_fake/

  2. captain veg Silver badge

    at last!

    A recognition that sucking up the entire internet as training data can only result in mediocrity at best. It's a shame for their business models that better than average quality comes at a cost, but that's not really our problem, is it?

    -A.

    1. Ken Hagan Gold badge

      Re: at last!

      "better than average quality comes at a cost"

      Whisper it, but it's almost as though traditional publishing houses performed a useful function. One that they have sometimes over-priced perhaps, but a useful function nevertheless.

      Amusing also to realise that training an AI on humanity's normal output doesn't work. To have any hope of seeming truly intelligent, you have to train it on our "best behaviour"!

      1. Ken Hagan Gold badge

        Re: at last!

        In fairness to the AIs, we actually use the same method to train our own offspring. We don't leave them to drown in a social media swamp. We send them to school and expose them to a controlled curriculum, under professional guidance, and hope that enough of the quality rubs off on them before the exam.

        1. Anonymous Coward

          Re: at last!

          Bwahahaha. Have you seen the latest curriculum?

  3. Pascal Monett Silver badge
    Stop

    "rigid copyright rules are repressing innovation and investment"

    Oh cry me a river.

    Rigid traffic rules are repressing Ferrari owners from fully using their powerful cars. Rigid home security (locks and alarm systems) is repressing thieves from gaining loot. Uber is being repressed in many cities from being able to transform people into Uber-slaves where Uber gets the money and the slaves take all the risks.

    Copyright rules are in place for good reason and, if you find that they are repressive, you can thank Disney in a large part.

    So, be good little boys and tell your investors that some of the billions they give you will have to go to pay for the data your multi-billion datacenter will be analyzing.

    After all, nothing is free in this world, so why should you be the exception?

    1. Ashentaine

      Re: "rigid copyright rules are repressing innovation and investment"

      And remember that "rules are repressing innovation" was the mantra of the founder of Oceangate, and we saw what ultimately happened there. Not understanding why the rules exist in the first place before you set out to subvert them is aiming for failure.

      1. amanfromMars 1 Silver badge

        Re: "rigid copyright rules are repressing innovation and investment"

        Not understanding why the rules exist in the first place before you set out to subvert them is aiming for failure. .... Ashentaine

        The converse of that suggests there be unqualified success guaranteed whenever one understands why rules exist in the first place before you set out to subvert them. And aint that the gospel truth.

        And such understandings can be dangerous to know and to try and hide .......

        It is well enough that people of the nation do not understand our banking and monetary system, for if they did, I believe there would be a revolution before tomorrow morning..... Henry Ford

        The most dangerous man to any government is the man who is able to think things out for himself, without regard to the prevailing superstitions and taboos. Almost inevitably he comes to the conclusion that the government he lives under is dishonest, insane and intolerable, and so, if he is romantic, he tries to change it. And even if he is not romantic personally he is very apt to spread discontent among those who are. …… H.L. Mencken

        As can it be equally dangerous not to know, but to think you might know ......

        Those who think that they know, but are mistaken, and act upon their mistakes, are the most dangerous people to have in charge. —Margaret Thatcher

        And those are the beings, real and artificial, virtual and alien deeply embedded amongst you and apparently causing all manner of panic and chaos, madness and mayhem and unsolvable problems for that and those in what is popularly known as the Powers That Be in Establishments with exclusive administrative executive offices formulated in the historical past and spun to be vital and necessary for their continued unparalleled leadership into the future ....... which of course nowadays it isn’t for something a great deal smarter is needed and increasingly freely available.

        1. Anonymous Coward

          Re: "rigid copyright rules are repressing innovation and investment"

          "The most dangerous man"

          The most dangerous man will seek alliances with other dangerous men, and shall protect all families and we shall have peace. At least internally.

          1. amanfromMars 1 Silver badge

            Re: "rigid copyright rules are repressing innovation and investment"

            The most dangerous man will seek alliances with other dangerous men, and shall protect all families and we shall have peace. At least internally. .... Anonymous Coward

            That sounds like a very doable, viable plan. AC. I’ll second that.

        2. LBJsPNS Silver badge

          Re: "rigid copyright rules are repressing innovation and investment"

          *hands you a roll of tinfoil*

      2. Sudosu Bronze badge

        Re: "rigid copyright rules are repressing innovation and investment"

        If you think it is bad for books check out music publishing.

        There are a handful of companies that independent musicians can go to to get their music on services such as Spotify.

        These companies are trying to trick their customers into opting in to "AI protection". They sell it as a service to scan and see if anyone has used your music to train AI; however, it gives the companies lifetime rights to parse your music and have AI generate new content based on it.

        Essentially they can use your music to make huge amounts of similar music and flood their service with it. With more music they will likely get more listens and the musician will get fewer and make less money than the pittance that they already make for their craft.

        The AI grift is real.

    2. Sudosu Bronze badge

      Re: "rigid copyright rules are repressing innovation and investment"

      Napster is coming out with an LLM, I hear.

    3. DS999 Silver badge
      Devil

      Rigid copyright rules are forcing me to pay for TV and movies

      And I don't have billions in my bank account like OpenAI so if there is to be an exception carved out I should get it before OpenAI does!

    4. Decay

      Re: "rigid copyright rules are repressing innovation and investment"

      "Rigid traffic rules are repressing Ferrari owners from fully using their powerful cars. Rigid home security.............."

      Agree completely. I am pretty sure rules around radioactive isotopes are repressing innovation and investment into nuclear. By their logic we should unshackle enterprise from those pesky rules and let them loose to create whatever they desire with as much or as little controls in place as the business profit model deems fit.

      The super intelligence I worry about isn't an anthropomorphic LLM; it's the corporations with their associated hive-mind intelligence, all focused on one goal: profit by any legal means possible and, if what you want to do isn't legal, either bend the rules or have the rules changed to your liking. And sadly they have a long and storied playbook to operate from: tobacco, energy, Nestle (hell, this list could go on for miles) have all refined getting what you want by influence, control and lobbying to a fine art. The difference this time seems to be the flagrant disregard for any attempt at subtlety or concealment, just saying F it, we are doing it and we will deal with any consequences later; after all, we have a playbook, and statistically the downside risk is likely small and the upside is huge.

      1. John Brown (no body) Silver badge

        Re: "rigid copyright rules are repressing innovation and investment"

        "Agree completely. I am pretty sure rules around radioactive isotopes are repressing innovation and investment into nuclear. By their logic we should unshackle enterprise from those pesky rules and let them loose to create whatever they desire with as much or as little controls in place as the business profit model deems fit."

        It's America. Muh Freedumbs! They *hate* regulations because it adds costs. Maybe we should allow some totally unregulated food production and all the people who are against regulations will only be allowed to buy food from the special unregulated food chain. It might even turn into a self-solving problem.

    5. Andrew Scott Bronze badge

      Re: "rigid copyright rules are repressing innovation and investment"

      So I bought a book to learn a subject with. I gave the book to a friend so he could learn the same thing. That book has been read by dozens of people by now. Is the writer or the company that published that book entitled to dozens of separate fees from each of the people who read that book? Humans generally learn by reading books. Obviously, for the best profit, publishing houses should stop printing paper copies and only license copies online. Use biometrics to make sure that you can't share a book even if you share a smartphone or computer. Perhaps LLMs should be trained with OCR from physical copies. Then if you need to update the model you can just have the system re-read the physical book. Should probably collect as many physical books as possible before publishing houses realize that, to improve profits, they need to burn the libraries and privately held copies of all these extant books that no longer provide a profit. Once the physical copies are all gone, copyrights won't matter. The book will be electronic and tied to a single reader's biometrics. No longer readable after that person dies. The post office will be able to do away with book rates.

      1. Anonymous Coward

        Re: "rigid copyright rules are repressing innovation and investment"

        AI would be closer to you borrowing several books from someone else, paraphrasing them (well, mostly, but there are bits that are unattributed direct quotes), and then selling the result.

        1. DS999 Silver badge

          Re: "rigid copyright rules are repressing innovation and investment"

          Like if people thought Asimov's Foundation books were too lengthy and they wanted a short version they could read in an hour or two to understand the works. Legally you can't sell that because it is a derivative work, but to be useful would include too much of the original content to be considered fair use.

          Now if instead you produced a work of similar length that is a literary criticism of the Foundation series that's mostly your original thoughts and you're just using Asimov's material for illustration purposes to make your point, then it would be legal.

          What OpenAI is doing is much more like the first than the second. IMHO it should be illegal without the permission of the copyright holder.

    6. Benegesserict Cumbersomberbatch Silver badge

      Re: "rigid copyright rules are repressing innovation and investment"

      Rigid copyright rules are suppressing me from downloading an entire series off Netflix and keeping it in case the internet stops working.

      Let's all come up with our own definitions of fair use to suit our proclivities and/or our plans to enrich ourselves, and just go with that, hmm?

  4. Lee D Silver badge

    "Our business only works when we break the law."

    Well... guess what... then you don't have a business. You have a criminal enterprise.

    1. Alumoi Silver badge

      Re: "Our business only works when we break the law."

      It's spelled government. After all, government punishes stealing because it hates competition.

  5. Anonymous Coward

    Headline should've been

    Gay Communist Billionaire Wants Data For Free

    1. Anonymous Coward

      Re: Headline should've been

      Gay billionaires are fine by me...communists on the other hand...

  6. Thanks For Asking

    I realise that I am not the first person to point out that training your AI model on Reddit and other social media forums is only polluting your model with inaccurate rubbish. In no way does that improve, or add value to, an AI model.

    1. Yes Me
      FAIL

      The first law of everything

      Indeed. On average, everything is average, and that's where LLMs will head unless they are curated to only learn true stuff, much of which is copyright.

  7. FuzzyTheBear
    Mushroom

    Theft

    Theft is theft. You ain't got a business if you base it on theft.

    You got a criminal organisation.

    Toss em behind bars or fine them more than the company's worth.

    They're nothing but low life criminals.

  8. RLWatkins

    Let's proofread that headline....

    "OpenAI wants to bend copyright rules"

    Needs some work. How about:

    "OpenAI violates copyright laws on an awe-inspiring scale as a matter of routine, and has stated publicly that it's entitled to"

    There. Headlines are more believable when they're in accord with common knowledge.

  9. robegeor

    OpenAI just received another $40 billion of funding. I am not holding my breath for any government entity or court to do a damn thing about this or the rest of the massive IP theft being perpetrated. Too much money at stake and being spent on "donations" to hold this "revolution" back.

  10. Anonymous Coward

    AI like this are information black holes

    Straight out of SCP, these systems soak up information, and that information is then at the whim of the systems' owners; it is only allowed out after a heavy toll is paid.

  11. Ralph Bentley

    Permissionless innovation...

    ...is the air that they breathe.
