Perplexity AI decries News Corp's 'simply false' data scraping claims

Artificial intelligence startup Perplexity AI has hit back at a lawsuit claiming that it's unfairly harvesting data from Dow Jones & Co and the New York Post to feed its AI engine, as well as stealing and mangling content. At the start of the week, News Corp took legal action on behalf of its publications, claiming in court …

  1. MOH

    If I rewrote this and every other Reg article with mild paraphrasing along the lines of:

    "AI newcomers Perplexity have responded to a legal claim that they're stealing content from..."

    and then based my business on publishing that content, I'd (rightly) get sued.

    Mind you they should be sued just for the sickening tone of that blog post alone

  2. IGotOut Silver badge

    WTF

    "where publicly reported facts are owned by corporations'"

    Well, it's the corporations that made those facts public for you to lift.

    Next argument.

    "Publicly visible art is owned by the artists", so we'll steal that content as well.

    Now for a shower after defending News Corp.

    1. Snowy Silver badge
      Pint

      Re: WTF

      Have some of these ---->

      It may help you forget you defended them.

    2. Pete 2 Silver badge

      Re: WTF

      > Well, it's the corporations that made those facts public for you to lift.

      So should all those corporations sue Wikipedia (among others) for lifting that same information, rephrasing it, and putting it on their own websites?

      The difference between the two is that one automates the process used by the other. However, the result is the same: content originated on one site appears elsewhere without a sharing agreement in place.

      1. DJO Silver badge

        Re: WTF

        Look at the bottom of a Wikipedia article: there will be a list of linked citations. They always (where possible) cite sources.

    3. graemep

      Re: WTF

      "Well, it's the corporations that made those facts public for you to lift."

      Does not matter in terms of copyright law. Facts are not covered by copyright.

      Can you imagine what news coverage would be like if the first to report something could prevent others from reporting on it?

      It is also often the case that media corporations disseminate the facts, but they rarely make them public. If you find out about an announcement by your government through News Corp, it is still the government that made the information public, not News Corp.

    4. Version 1.0 Silver badge

      Re: WTF

      Just because you do not take an interest in AI doesn't mean AI won't take an interest in you.

    5. Zippy´s Sausage Factory

      Re: WTF

      We do not defend NewsCorp because we like them, we defend NewsCorp because we must. Even if it gives us the ick.

  3. 45RPM Silver badge

    In building an AI, one of the biggest concerns has to be the cleanliness of the initial dataset. I’d argue that this alone makes scraping the broader internet a very foolish thing to do if the object is to build a tool which is useful and expert in its field. That said, at least the internet as a whole has a broad range of views that one can draw on - so at least it will be equally awful on a wide range of views and opinions.

    Training any dataset on a Murdoch organ like the New York Post though? I mean, fine if you want to build a uniquely fascist AI with objectionably racist views. Like building an AI out of the Daily Fail or the Maily Telegraph. I wouldn’t even want to build an AI out of the Guardian - although I suspect in that case the worst thing that could be said about it is that it had a perplexing predilection for home made muesli.

    1. PB90210 Bronze badge
      Trollface

      The problem with training AI on the Grauniad is the daily corrections column... although that could explain some of the hallucinations!

  4. xyz123 Silver badge

    News Corp are the scum (via the News of the World) who left fake voicemails on a murdered teenager's voicemail to trick the parents into thinking she was still alive.

    So they could basically run shitty evil monstrous front page "she called home" type stories.

    Everyone at News Corp should be banned from pretending to be a journalist for life.

    1. DarkwavePunk

      Unfortunately that doesn't preclude them from having a point in this particular case. I'm going to join the above commentards in a beer shower...

      1. veti Silver badge

        It doesn't, but they don't have a point anyway, and it certainly should preclude them from getting any sympathy.

        Copyright gives you the right to forbid - a very specific set of actions. Unless News Corp can demonstrate to the court that one of those actions - reproduction, translation, adaptation, performance, distribution - is happening, they've got nothing.

        1. doublelayer Silver badge

          So their original point, that their articles were being reproduced wholesale until the hallucinations kicked in and ascribed completely invented conclusions to them, was something you didn't notice? They're alleging violations based on that, not covering the same facts they covered. Perplexity has tried to hide this by suggesting that they were claiming to own the facts, which they weren't, not the articles, which they do.

          And all of this is independent of whether I like the creators of the articles in question. In fact, if I hate the articles, I have another reason not to want this to happen. In addition to being a copyright violation, it means the AI will be parroting the content of the reprehensible articles.

  5. mark l 2 Silver badge

    "The Perplexity crew points out that it already has content-sharing deals in place with Time, Fortune, and Der Spiegel, and would have been happy to work with News Corp on a similar deal. "

    How many millions of other people's content have they harvested without having a content-sharing deal in place though? Have El Reg checked to see if Perplexity will regurgitate The Register articles with the right prompts?

    1. veti Silver badge

      If anyone can get it to regurgitate articles without crediting the source, they'd have a slam dunk case.

      But they can't. If they could, they'd have done it already.

      News Corp is in the position of complaining that this... thing read its content, and now it has the audacity to talk about it. If it were a human doing the exact same thing, they'd be tickled pink. Seriously, "getting people to talk about your content" is worth major kudos in journalism. That's mostly why these comment pages are here at all.

      But because it's an AI, they smell... Danger? Money? Both, I'm guessing.

      Neither copyright nor any other legal principle gives a publisher the right to say that its content can't be read, and commented on, by anyone or anything that can legally access it in the first place. And News Corp of all people should not be allowed to create such a right.

      1. Andy Tunnah

        Because the AI will read it once and then make it part of THEIR ecosystem. 1 "subscriber" vs many.

  6. cookiecutter

    Stunningly shite but expected from AI MORONS

    It's amazing that people on the AI ponzi scheme refuse to accept that people had to WORK to create the content they're stealing.

    News Corp, as much as I hate them, had to pay journalists, buy computer equipment, etc etc to produce those "publicly available facts" on their website.

    An artist has to practice for a lifetime, pay for school, courses, paints, canvases, laptop etc to create the art that ends up on their website.

    A writer has to do the same. A photographer the same.

    This is the unfortunate end game of tech stealing people's work, whether it's content or underpaying deliveroo riders or even scraping academic papers by Google.

    1. ecofeco Silver badge

      Re: Stunningly shite but expected from AI MORONS

      Funny how they are using AI to steal our leisure and not to free us from drudgery.

  7. ecofeco Silver badge

    Sums up our modern world

    'They prefer to live in a world where publicly reported facts are owned by corporations'

  8. Anonymous Coward

    You can always rely on two things in a News Corp publication

    The date and the time

    1. PB90210 Bronze badge

      Re: You can always rely on two things in a News Corp publication

      But it's always good to double check before accepting anything they say

  9. hh121

    It's difficult to imagine how I could care less about News Corp, but the book writers, journalists, musicians, actors, painters, photographers etc that I do want to see survive with jobs that can sustain them probably won't, and that's just for starters. This isn't some Luddite rebellion in specific roles, this is existential for vast swathes of the service industry where they're using our content (even this post) to teach something how to replace us. Fun times.

  10. anonymous boring coward Silver badge

    News Corp "reporting facts"? Really?

    Anyway, Perplexity seems to have never heard of copyright?
