"That principle does not fall away simply because a company adorns its infringement with the words 'AI.'"
Almost poetry in its own right.
A trio of music publishers has sued AI outfit Anthropic for slurping up song lyrics without asking for permission as it trains the Claude chatbot. In a lawsuit [PDF] filed in the US District Court for the Middle District of Tennessee - home to music hotspot Nashville - publishers Universal Music, Concord and ABKCO accused …
However, learning from it and then giving your own song in a similar style with better (maybe) or worse (maybe) lyrics is absolutely fine. Learning from copyrighted material is what we all do, all the time. To make it copyright compliant, they just need to tweak the algorithms to make sure they never return the learning material itself (unless it is public domain, of course, in which case you can do literally anything you like with it, including claiming it as your own if you want).
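For what it's worth, here is a very rough sketch of what "never return the learning material itself" could look like as an output filter. This is only an illustration under my own assumptions, not how any vendor actually does it: the function names (`word_ngrams`, `build_index`, `reproduces_protected`), the 6-word window and the placeholder text are all made up, and a real system would need something far more robust than exact word matching.

```python
import re

def word_ngrams(text, n):
    """Return the set of n-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def build_index(protected_texts, n=6):
    """Collect every n-word sequence that appears in the protected material."""
    index = set()
    for text in protected_texts:
        index |= word_ngrams(text, n)
    return index

def reproduces_protected(output, index, n=6):
    """True if the output shares at least one n-word sequence with the index."""
    return not word_ngrams(output, n).isdisjoint(index)

# Placeholder text standing in for protected material (hypothetical, not real lyrics).
PROTECTED = ["the quick brown fox jumps over the lazy dog near the river bank"]
INDEX = build_index(PROTECTED)

verbatim = "He sang: the quick brown fox jumps over the lazy dog, and left."
in_style = "A fox leapt swiftly past a sleepy hound beside the water."

print(reproduces_protected(verbatim, INDEX))
print(reproduces_protected(in_style, INDEX))
```

The first check flags the verbatim copy and the second lets the "similar style" text through. The window size is the judgment call: too short and ordinary phrases get blocked, too long and lightly edited copies slip past.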
If that wasn't the case, then almost every love song today could be deemed a breach of someone else's copyright.
Unless they bought a copy of all the lyrics in the first place, properly licensed for their use (e.g. not "home use"), then they "stole" them.
Reading (scraping) them from a site that had licensed them for the public to see is most certainly not a license to then use them for any commercial purpose whatsoever.
Robots.txt says nothing of the sort. The Robots Exclusion Protocol says what you can't do, not what you can do. Bots can access a URI if there is no rule in the Robots.txt to disallow it, or if there is no Robots.txt at all. So the Robots.txt does not say you can; it's actually saying nothing.
Robots.txt does not even tell bots what to do; it asks them not to access certain URIs, and most scrapers just ignore it.
But none of the above matters, as Robots.txt is in no way, shape or form a legal document. It is not a contract giving you permission to copy and use copyrighted material in any way you like. The owner of copyrighted material has not surrendered their rights of ownership just because that material is on the internet and therefore accessible to the public.
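To make the "saying nothing" point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the bot name `ExampleBot`, the rules and the example.com URLs are all made up for illustration. It only shows how a well-behaved crawler would read the file; nothing in it says anything about copyright or licensing.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, agent, url):
    """Ask the standard-library parser whether agent may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Hypothetical robots.txt with a single exclusion rule.
RULES = """\
User-agent: *
Disallow: /private/
"""

# No rule mentions /lyrics/, so access is simply not excluded.
print(is_allowed(RULES, "ExampleBot", "https://example.com/lyrics/song"))
# /private/ is the only thing the file asks bots to stay out of.
print(is_allowed(RULES, "ExampleBot", "https://example.com/private/page"))
# An empty robots.txt excludes nothing at all.
print(is_allowed("", "ExampleBot", "https://example.com/lyrics/song"))
```

The parser answers "not excluded" for anything the file doesn't mention, and that answer is only advice to the bot. Honouring it is voluntary, and it grants no rights over the content either way.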
Public libraries are accessible to the public but copying a book from the library that is still in copyright would be copyright theft.
And pay for the licenses to use the copyrighted material in their training model or remove it all.
The problem is, theoretically, if you listen to music, you have paid for a license to listen to it - either via a streaming service or buying a license/CD/LP etc. Obviously, there are the black sheep out there, who download the music illegally, but that is the same argument as the AI using unlicensed music for training.
Hmm, without wishing to detract from your point, exceptions to that are if you are in a shop, or Music On Hold on telephone services where the shopkeeper or business owner will (should?) have a licence with the Performing Right Society (PRS), unless it is copyright free or in the public domain.
So maybe AI is sucking all this content in whilst at the hairdressers.
Giving back lyrics with proper attribution might be fine
No. Absolutely not. Not ever.
You might as well say it would be okay to play back a feature film, or output a full newly released book. Absolutely not allowed.
It's usually fine to show a small snippet, condense/summarize, list facts about it, or offer your imprecise recollection of it, but otherwise, just no, not allowed and not fine. Certainly not in a for-profit commercial setting with no way to claim it's for news, educational, or commentary purposes.
...you hit the nail on the head.
If you ask an AI to reproduce a song, that's on you. No different than if you ask Google to search for it. They probably tried thousands of queries to find one that "sort-of" came back as a pre-existing song. So the people infringing copyright are the people looking for "existing works", not the AI.
If you use the output of an AI, it's on you to check the copyright before you publish it. Just like if you Google it, or look it up in a library.
Otherwise, this is a very small part of what the AI model may output. Small enough that it's going to be a quote. Like if I were to type "Rage against the machine". It is a quote, in context and therefore not copyright infringement. I don't attribute it - I don't even know where it comes from.
Having largely vanished in a tsunami of public hatred for suing kids over music downloads, copyright lawyers can now climb back upon their unicorns and charge into battle against AI, the ultimate legitimate target. After all, governments and scientists have both declared AI to be a threat to civilisation and humanity. Who will save us? Copyright lawyers.
I hope some of you will be naming your children after them. With permission, of course.
Sometimes when searching for lyrics I encounter 'lyrics meanings' which look very suspiciously like AI generated them.
Have a search in particular for "The Meaning Behind The Song" "audrey key" and let me know what you think, 'she' seems to churn out a lot of them...
I suppose that's one way to avoid a lawsuit as alluding to the lyrics doesn't step on anyone's toes.
(What started me on that path was someone donated a compilation album by unknown artists with a bluesy tinge into Oxfam Music last week. One track sounded really familiar. Spent a few days with it in my mind and then it suddenly clicked: it was a rip-off of Broken Land by The Adventures).
> even if forced to pay [the $75 million] in full Anthropic could probably afford it - the company was valued at around $5 billion
Yes, but that'd just be the fine for the original copyright violations, not a license for them to continue doing so. That would no doubt lead to them being sued again, and again, and again, and likely having to pay greater and more punitive damages.
However much they have in the bank, it's not going to last long in the face of that (though, of course, investors wouldn't tolerate it going that far in the first place).
And if that $5 billion valuation had been based on the assumption that they'd get away with their current business model and practices, then it raises the question of how much the company is worth if they can't.
Continuing reliance upon rentier economics is generating bodies of statute and case law adding to the already ramshackle structure prior to the digital era. Law concerning so-called 'intellectual property' (IP) always increases in complexity and in its reach, this impacting upon the workings of nations, companies, and individuals. Long since has this corpus of law become impenetrable to everyone other than those profiting from its enforcement. Almost all other civil and criminal law can be grasped sufficiently by the ordinary citizen. Bear in mind, despite appearance to the contrary, law is meant to guide citizens regarding acceptable conduct.
Complexity getting out of hand is indicative of increasing lack of clarity about purpose and definitions. It also reflects escalating difficulty of enforcement, this particularly upon introduction of technologies not envisaged when the specious notion of ideas being property in a sense similar to that referred to in the 'Ten Commandments' was introduced.
Ill-conceived law, that which is overly difficult to comprehend, that which is regularly flouted by many who deem it silly and too restrictive, and that which is steadily becoming impossible to enforce, is by definition 'bad law': it is there to be circumvented, ignored, and ridiculed. The many allegedly 'creative' people and the corporations profiting from their efforts are ignoring harsh realities which will shatter their overweening senses of entitlement. Meanwhile, the truly innovative shall be exploring other means for gaining recognition and income. Ironically, there shall be a return to the cottage industries in the days of the original Luddites. Huge, often conglomerate and transnational, rentier pseudo-enterprises shall go the way of the dinosaurs; a host of middlemen shall have to find more constructive uses for their energies.
The failure to supply references when appropriate, or even worse, making references up, is an abominable defect in all of the current AI models. References are also a solid connection to the "real world", something AI sorely needs to be grounded in. Hopefully ChatGPT will grasp the silver lining of this pocketbook pain (even if OpenAI wins, lawyers are expensive) and learn, or be taught, how to do it. "Here's a twist on Don McLean's lyrics to his song 'American Pie'." That's all it takes! It adds value: the reader has learned something interesting if they weren't aware of that song, and they may even investigate it further.
Yes, it requires more structure than the AI models are imbued with now, and some extra cost. However, the programmers at OpenAI et al. are already laying in layers of tweaks to avoid certain topics etc., without addressing fundamental problems like the inability to provide references.