Reddit doesn't care about user-generated content being used for "AI" training. It only cares that it isn't being paid for it.
Reddit sues Anthropic for scraping content into the maw of its eternally ravenous AI
Reddit, the popular internet discussion forum, sued Anthropic on Wednesday, alleging that the AI biz scraped content generated by its users in violation of contractual terms and technical barriers. The complaint [PDF], filed in San Francisco Superior Court on Wednesday, claims Anthropic's use of scraper bots to collect Reddit …
COMMENTS
-
-
-
Thursday 5th June 2025 11:46 GMT Scotech
Flip a coin...
I don't really care which side wins - either would be a net positive for the web in my opinion. In one direction, big LLMs get a free-for-all on low-quality training data, decreasing the utility of their outputs and making it easier to filter the worst of it out, while in the other, they have a potential threat to their business model, raising prices and so helping to choke off the supply of low-effort slop. Legislators seem hell-bent on carving out some sort of IP law exceptions for AI anyway (albeit not to the extent of totally nuking it like Sam Altman apparently wants), so any supposed ToS violation from accessing the pages would likely eventually get handwaved away after a change in the law, unless they could demonstrate the data in question wasn't just hoovered up in a drive-by scan of data accessible via the open public Web (as opposed to data that can only be accessed via some form of auth process).
The actual IP fight isn't around swill-buckets like Reddit or Twitter, it's for high-quality curated content from genuine professionals in the arts and professions.
-
-
-
Thursday 5th June 2025 12:42 GMT mark l 2
Because these AI bots are just hoovering up all the data on the open web without permission, I can see that, unless something is done to compensate the websites for their data being sucked into these LLMs, more and more websites will start putting their content behind a paywall, or at least requiring a login to access it, and explicitly stating that using an AI scraper bot is a violation of their T&Cs.
Which will make using the internet a worse experience for the end users.
-
Thursday 5th June 2025 14:20 GMT ChrisElvidge
compensate the websites for their data being sucked into these LLMs
As usual, poor websites! They forget, it's not their data, it's ours!
There was a nice story on NewsThump the other day:
https://newsthump.com/2025/06/02/man-defends-his-use-of-copyright-football-streams-by-insisting-he-needs-them-to-train-his-ai-model/
-
-
Thursday 5th June 2025 16:19 GMT MikeLivingstone
Take Anthropic and OpenAI to the cleaners
These AI language models have no real value as their outputs are unreliable and reduce productivity.
I believe Anthropic should be found liable to Reddit for at least $10bn, and OpenAI should owe the New York Times at least $50bn for content theft.
The CEOs of LLM firms should also face Bernie Madoff-style prison sentences.