Here's a list of thousands of artists Midjourney's AI is ripping off, creatives claim

A spreadsheet submitted as evidence in a copyright lawsuit against Midjourney allegedly lists thousands of artists whose images the startup's AI picture generator "can successfully mimic or imitate." The spreadsheet is part of an ongoing case that argues Midjourney unlawfully profits from creators' intellectual property by …

  1. Pete 2 Silver badge

    Piles of styles

    I wonder how many artists could define their own style? How would they differentiate it from another artist's close but separate style?

    Looking at a Jackson Pollock for example - or even a few of them - is there any aspect in his work that is common across his portfolio, or unique, so could be called a style?

    Plus, when you look at something like a Rothko there may be a style to his work, but is it trivial or is it genius (and is the genius getting $60million for it?)

    Which brings us on to the issue of influence. All artists are influenced by others - they readily admit it. Where does that stand in relation to AIs being "influenced" or trained.

    1. Felonmarmer

      Re: Piles of styles

      Artists certainly develop their own style over time, but I doubt there are many that start that way - they learn by copying (being influenced by) others.

      The copyright argument seems to hinge on one aspect, that what AI training does is different from when people do it in terms of legality. It's something that doesn't really get covered in current legislation, but that argument has already been widened in cases to include showing people where to download copyrighted material (Pirate Bay) and the use of particular software that uploads as it downloads (torrents), so it could potentially be widened to include AI training.

      I think something like the way musicians get paid for radio broadcasts of their works might be the way it goes for AI use of copyright material. It's that or leave AI to countries where they don't care about copyright.

      1. agurney

        Re: Piles of styles

        NO, artists don't learn by copying (at least in my experience as a '70s Art school graduate).

        They may be "influenced" by their tutors, but usually in an "up yours" type of response rather than replication :)

    2. A Non e-mouse Silver badge
      Meh

      Re: Piles of styles

      The ChatGPT type AIs that can regurgitate large pieces of existing works are clearly guilty of copyright infringement.

      But creating something "in the style of": That's a much greyer area.

      I predict the only winners will, unfortunately, be the lawyers.

      1. Shuki26

        Re: Piles of styles

        Certainly, music artists get royalties for even short samplings of their music so perhaps AI companies should also pay artists when someone wants a reproduction in some specific style.

        1. Doctor Syntax Silver badge

          Re: Piles of styles

          But-but-but --- that would be money

      2. mpi

        Re: Piles of styles

        > But creating something "in the style of": That's a much greyer area.

        What about this is a "grey area" if I may ask? There is no such thing as a copyright on styles.

        1. MrDamage

          Re: Piles of styles

          >> "There is no such thing as a copyright on styles."

          Rounded corners?

          1. Catkin Silver badge

            Re: Piles of styles

            That's a (grossly shaky) patent.

    3. Doctor Syntax Silver badge

      Re: Piles of styles

      "I wonder how many artists could define their own style?"

      Why should they be able to? The answer would be along the lines of Louis Armstrong's definition of jazz. I doubt that even the most successful human forgers would actually define in much detail the style of those they imitate, they just paint like them.

      1. Pete 2 Silver badge

        Re: Piles of styles

        > "I wonder how many artists could define their own style?"

        > Why should they be able to?

        ISTM that if artists are going to argue "this AI produced something in my style, so I want to be paid" then the defence lawyers will pounce on that and require them to define what their style is/was. This is a different situation from music copyright / sampling which takes part of an original work and inserts it into another piece. I don't think there are any rules that say a musician cannot sing in the style of (e.g.) Taylor Swift. Just that if they use songs she sang, then the copyright holders will want their slice of pie.

      2. katrinab Silver badge
        Boffin

        Re: Piles of styles

        Someone who is an expert in art, ie not me, would definitely be able to explain in great detail what an artist's "style" is, and the techniques they use to achieve it.

        1. LybsterRoy Silver badge

          Re: Piles of styles

          Ah yes. I remember (badly) a few instances when "experts" rhapsodised about a piece of artwork until it was discovered to be a fake by some scientific technique or confession by the forger.

          1. Dinanziame Silver badge
            Devil

            Re: Piles of styles

            I think the point is that it is possible to recognize a style, whether it is real or a good copy, and that users apparently request specific styles. Apparently style cannot be copyrighted, but it is an interesting point that an artist can create innovative works in a style that nobody has seen before, and similar works in the same style can flood the market immediately. At this point, there soon won't be a lot of artists who can make any money at all.

    4. HuBo Silver badge
      Thumb Up

      Re: Piles of styles

      The court documents [PDF] (linked in the article) do an outstanding job of documenting the specific styles of several of the artists involved in the lawsuit, in Exhibit A (p.96+), and Exhibit B (p.219+). This is followed by the Copyright registrations of the artists in Exhibit C (p.236+). After that, Exhibits G (p.301+), H (p.323+), and I (p.341+) show examples of the artists' work, and images produced from them by Stability, Runway, and Midjourney, respectively. These Exhibits demonstrate that the "AI"-produced images are definite mimics of the specific styles of each specific artist.

      The court documents highlight two further aspects of interest IMHO. COUNT THREE (p.59) alleges DMCA violations whereby the CMI of training images were removed or altered, and hence not present in the software's output products, which renders attribution impossible. And, as hinted by Chris in the recent Kettle, there is a question of whether what is referred-to as "training" in this type of genAI is actually anything other than lossy compression (followed by somewhat randomized recall). The court documents make this point quite well on page 31 by citing recent work by Yu et al. (2023): White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?. Their conclusion (cited also in the court docs) is sobering: "for these existing large AI models, however magical and mysterious they might appear to be: Compression is all there is".

      The evidence presented is extensive, and it looks to me like an open-and-shut case, that these artists should be compensated for their work, without which the AIs couldn't do much at all, diddly, squat.

      1. theOtherJT Silver badge

        Re: Piles of styles

        > the "AI"-produced images are definite mimics of the specific styles of each specific artist.

        I don't think that fact is in question. The question is "Is that illegal"? because it's far from clear to me that it is.

        My gut reaction is that it feels fair to me that the artists should be compensated in some way for their work being used like this, but I'm not convinced that a copyright lawsuit is going to get them that because this doesn't look like a copyright case. This is something new that we just don't have law for yet.

    5. LybsterRoy Silver badge

      Re: Piles of styles

      It may just be personal taste but since when did "how can we rip the public off with this pile of crap" become a style?

    6. Michael Wojcik Silver badge

      Re: Piles of styles

      There's a few thousand years' worth of European and European-derived speculation in a little philosophical field called "aesthetics" on this question, you know. And similar but different speculation in other cultures. Possibly too much to summarize in a post.

  2. cosymart
    Facepalm

    Don't put it on the Internet

    If you don't want your work copied then don't post it on the internet or allow anyone else to do so. Simple :-(

    1. Anonymous Coward
      Anonymous Coward

      Re: Don't put it on the Internet

      That's not how Copyright works.

    2. Doctor Syntax Silver badge

      Re: Don't put it on the Internet

      You mean just like authors shouldn't publish books because somebody might copy them?

    3. katrinab Silver badge
      Megaphone

      Re: Don't put it on the Internet

      They mostly put them on stock image libraries, with the expectation that they would get royalty income every time someone used it.

      Not that an AI would scrape the sample thumbnail and use that in its training set.

    4. Long John Silver
      Pirate

      Re: Don't put it on the Internet

      Indeed, but realisation that everything representable in digital format can be stored, and be distributed at negligible expense, is slow to penetrate the skulls of people inured by 'rentier economics'. They are rendered incapable of recognising the digital era in which we now live as offering immense opportunities, these far outweighing perceived threats to anachronistic ways of thinking.

      This era leads to, and ultimately enforces, recognition that ideas, their representation, and their uses, cannot be corralled within confines dictated by artificial 'rights' based upon a concept of ownership applicable solely to physical entities, and enshrined within an ersatz monopoly defined by laws. This requires the so-called 'creative' to engage with recipients of their materials in a differing manner to that at present.

      The language of 'rights' implies those seeking to view, copy, share, use, or derive from, a copyrighted work (similarly for patents) are in the position of supplicants seeking favour from their 'master'. 'Favour' is dispensed for a sum of money: the magnitude of the sum is wholly determined by the 'master'; the matter is almost entirely detached from conventional market-economics, competition, and 'price discovery'.

      For example, a book is published. The contents of books represent their authors' desires to inform, to offer insights, or to amuse. An author may feel internally 'driven' to write. Preparation of a work may entail direct expense and/or opportunity cost in terms of time expended. None of the foregoing determine cultural worth for the book; that is a matter for readers, individually and collectively, to determine. Supplicants must pay an upfront fee in order to view the work. These readers incur their own opportunity cost from time spent perusing the book. If a reader deems a work unworthy of the cost of purchasing entitlement to access it and of the time spent reading it, there will be no refund of money paid. Purchase involves an act of faith by the buyer. Faith is strengthened should the reader have read previous works by an author, but that offers no guarantee.

      The asymmetrical relationship between author and reader, this enforced by copyright law, bolsters an arrogance in authors founded upon the belief that anything they publish has cultural worth and some, defined by them, concomitant monetary worth. In fact, the proper relationship should be of an author as supplicant to potential readers. That dependency vitiates the notion of 'rights', and the ensuing rentier economics.

      Generalising across the board to all would-be cultural contributions leads to 'rights' based funding's replacement by voluntary contributions from patrons, i.e. from individuals, collections of individuals as when crowdfunding the next work, from charitable funds, and from government support of the arts, sciences, and of speculative ambitions. The effect is profound. Need for a huge raft of middlemen ceases. Contributors to the technicalities of production charge a fixed fee, or take a share of donations to the creative individual (or group). As a result, a greater portion of individuals' disposable incomes is available for funding activities by others. The truly creative can draw a good income, from which may be made provision for old age: no longer would royalty rental figure in the minds of creators and those of genuinely helpful middlemen. Cultural renaissance would follow from a free-for-all attitude towards 'derivation'.

      Creators do need some protections. These are twofold. First a move towards entitlement to attribution, this already strongly evident in academic works. Second, explicit protection in civil and criminal law against people misappropriating the reputation of another in order to filch income from patronage.

  3. A Non e-mouse Silver badge

    This whole AI/Copyright saga reminds me of the early days of Napster.

  4. theOtherJT Silver badge

    I don't think copyright law can handle this...

    ...it's not really a copyright issue.

    If the AI has been trained on a huge dataset of publicly available images - even if those images are under copyright - and then it produces an image in the style of, but not the same as, any one of those existing images I don't see how anyone can claim that copyright has been breached.

    I'm not a great artist to be sure, but if I were to sit and practice painting in the style of some particular artist for months and years, to the point where I could reliably mimic their style I would be totally within my rights to do so unless at the end of it I were to copy one of their specific works and try and pass it off as actually being by them. Basically, committing forgery. The argument seems to be that what AI training does is fundamentally different to that process, but I really don't see how it is.

    If they're trying to argue that the very act of "downloading" the images so that the AI could be trained on them violates the copyright, that's troublesome. We have to perform the exact same "download" so that we can display the images in our browsers. If someone wants to put in "For the purpose of..." type language somewhere and explicitly exclude AI training from the list of reasons you're permitted to download an image that's not copyright law either, that's wandering into digital licensing territory, and I'm far from convinced that such a licence would be enforceable.

    Copyright might come into play if you're dealing with specific details, for example the much publicised case of Mickey Mouse. (Now out of copyright, at least in some forms.) I could release a cartoon of some character that's in copyright and potentially get sued, but I've seen nothing in current law that says a style can be subject to copyright - although of course I'm not a lawyer, so I'm open to correction on that if anyone knows differently.

    1. katrinab Silver badge
      Megaphone

      Re: I don't think copyright law can handle this...

      The first thing you need to know about Artificial Intelligence is that it doesn't exist.

      The next thing you need to know is that as these things are not intelligent, "training" doesn't mean what it means when you train an intelligent being.

      What actually happens is that they take a vast amount of data, perform some statistical analysis on it, and produce an output based on these statistics.

      The training data is the source code. The training model is the binary. The training algorithm is the compiler.

      If you take the source code of Adobe Photoshop and compile it without their permission, that is illegal whether you use the same compiler that Adobe uses, or a different one. And the resulting binary is illegal to distribute even if it looks completely different to the one Adobe sells.

      Same applies if the source code is photos you copied from the Getty Image Library without their permission.

      1. Catkin Silver badge

        Re: I don't think copyright law can handle this...

        >The next thing you need to know is that as these things are not intelligent, "training" doesn't mean what it means when you train an intelligent being.

        What process do you believe is taking place when an intelligent being is trained?

      2. theOtherJT Silver badge

        Re: I don't think copyright law can handle this...

        I really don't think that's an accurate comparison.

        If you had - legally - access to the source code (because they do have, legally, access to the images) and then you re-compiled that code as-was you'd rightly get sued for copyright infringement if you tried to distribute it.

        But if you had access to the source code, and the source code of thousands of other products, and then produced a new product using snippets of code pulled from each of those sets of source you'd have... well, basically what every programmer has been doing sort of forever.

        It's worth pointing out that people have tried to sue for "That specific function looks exactly like this specific function in our code" in the past, and that has been far from universally successful. Sometimes it's worked out based on "No, really tho, even the typos and the comments are the same, you totally lifted that from us" but even in those cases that's been because the party in question wasn't supposed to have access to the source in the first place.

        Once the source is out there to be viewed by anyone we're not talking copyright any more. We're talking about the enforcement of some sort of digital licence like the GPL or others that say what you may or may not do with that code, and that's a totally different issue - and this is my point.

        This isn't a question of copyright, and applying copyright law to it I just don't think is going to work.

        1. Doctor Syntax Silver badge

          Re: I don't think copyright law can handle this...

          "because they do have, legally, access to the images"

          What T&Cs apply to that access? You may be legally allowed to view the image and nothing else. You may not be legally allowed to copy and paste into some other work. You may not be legally allowed to scrape it into a ML training set.

          1. Anonymous Coward
            Anonymous Coward

            Re: I don't think copyright law can handle this...

            Force me to read your T&C and click to accept them BEFORE allowing me access to your site and you may have something. Otherwise....

          2. theOtherJT Silver badge

            Re: I don't think copyright law can handle this...

            A valid question. But again, as I keep saying, this makes it a matter of licensing law. Not copyright law. I don't recall seeing anything that says "You may not use any of the content herein for the training of LLM or similar systems." on any website I've visited recently - and frankly even if it did say that it's still not really been determined by any court - as far as I know - that sticking a notice up with a list of restrictions and a "click here if you don't accept" is actually legally binding.

            1. klh

              Re: I don't think copyright law can handle this...

              I know the comment is old. But licenses are based on copyright. And the default is you have no right to do anything, then the license grants you some more or less limited rights.

              If you are not allowed to resell or give away the work or derivative works it doesn't have to specify exactly that you can't train an LLM using it.

  5. mpi

    Disclaimer:

    I am not a lawyer. So the following is just my opinion.

    "sued by artists claiming these machine-learning houses lifted copyrighted images to train models, and made those models available so netizens can produce infringing works on demand, without permission and without recompense. The creatives allege their rights were trampled, that the software can be used to flood the market with knock-off work to their detriment, and they want damages from and other measures levied against the startups."

    Alright, let's go through this:

    - "lifted copyrighted images" ... what exactly does accessing images available on the open internet have to do with "lifting", or "copyright" for that matter?

    - "to train models!" ... Question, what does the copyright status of these images have to do with training models on them? Is training copyright infringement? Which court or law says so? Very relevant link on the topic.

    - "can produce infringing works on demand" ... What's an infringing work in that context if I may ask? One that emulates the style of an artist? Styles are not copyrightable for good reason. One that mimics a specific work of an artist? Photocopiers can do that as well. As can cameras. And pencils, for that matter.

    - "allege their rights were trampled" ... what rights specifically?

    - "that the software can be used to flood the market with knock-off work to their detriment," ... copyright law doesn't grant protection from market forces or technological innovation, so what exactly is the complaint here?

    1. katrinab Silver badge

      Re: Disclaimer:

      > what exactly does accessing images available on the open internet have to do with "lifting", or "copyright" for that matter?

      Getting the image from some other internet server to your server involves making a copy.

      1. mpi

        Re: Disclaimer:

        I think we are long past the question whether making a copy for the purposes of processing it in a computer system violates copyright or not.

        1. Michael Wojcik Silver badge

          Re: Disclaimer:

          And indeed that is in no way the issue in this case, and no one involved in the case is claiming otherwise.

          The plaintiffs contend that training included copyrighted work, not that making copies of that copyrighted work for purposes of processing in itself violated copyright.

    2. Mark C 2

      Re: Disclaimer:

      Yep, you are not a lawyer and you also do not understand the very basics of Copyright. Try searching the internet for a definition and then you can ask informed questions. Here is an example:

      "the right which the law affords for protecting the produce of man's intellectual industry from being made use of by others without adequate recompense to him"

      1. mpi

        Re: Disclaimer:

        And what specifically about this example refutes what I write, or answers any of my questions?

    3. Doctor Syntax Silver badge

      Re: Disclaimer:

      I looked at your "very relevant link".

      It's basically argumentative, not authoritative, in that it's almost the sort of thing that a defence lawyer might argue before a judge. Almost, because it would normally be supported by citations from cases that provide precedent. It would also be subject to arguments in rebuttal by the other side. In the end it would only carry weight if a judge agreed with it - and that would include falling for the notion of equating a record player with an ML training set. And outside the US it would fail at the words "First amendment".

  6. breakfast

    If it were me, I'd be furious

    If I found my work had been used to train a statistical model by people whose specific intent -and let us not pretend they have any other- is to cause me to lose income, I think it would be quite fair that I be angry about that.

    The problem of Copyright is in part that it's not exactly designed for this, but how could it be? This type of tool hasn't existed for long enough to undergo serious legal evaluation and even the people training the models don't seem to have a clear understanding of how it works, so we end up in something of a square peg round hole scenario.

    In this kind of situation, I think a fair law would lean towards protecting the people creating art. It is one of the most human activities, one of the few things we have records of going back to the very dawn of our existence. The idea that this should be dispensed with, replaced by statistical averages ... isn't there something wrong with that? Isn't it a little sickening that companies like this believe that hurting artists is a business model worthy of pursuit, let alone the investors and sycophants who support them?

    The other thing I don't really get is where the profit comes from. I know a good number of artists and I wouldn't say any of them are exactly rolling in it. Mostly they're living hand to mouth and often having to take up other work because the art doesn't pay the rent. If Midjourney or whoever else manages to take every bit of income that they have, they're still not going to make very much money compared with most other industries.

  7. Matthew "The Worst Writer on the Internet" Saroff

    AI Skeptic Here

    I think that this crap is just an overengineered ELIZA program.

    That being said, if machine learning is learning, and to be clear I do NOT think that it is learning, then teaching an AI should be an acceptable use of the material.

    Much like wannabe standup comics will teach themselves by reading the joke books of current comics, and watching their performances on YouTube, the AI is teaching itself, which is the purpose of these works.

    How is "teaching an AI" a violation of copyright when "teaching a person" is not?

    Again, if, as I believe, this is not intelligence in any form, and just a mixmaster of gibberish, then the issue is not IP infringement, but rather that the AI companies are defrauding people.

  8. 96percentchimp

    Copyright protects the right to derive a return on the intentional work expended to produce art

    In copyright law, units of artistic production are known as "works" because that's precisely what is required to produce them: the intentional work of one or more conscious entities, including time, physical and mental effort, and resources, both physical and financial. What copyright protects is the right to derive a return on the work expended to produce art, and to assert ownership of that work because it represents your effort and intent.

    Works of art also cannot be unintentional, which is why a monkey cannot own the copyright in a photograph that it produced without knowing the outcome of its actions, but the photographer can own the copyright because they engineered the conditions in which the monkey was able to take the photograph. This is not to say that a non-human primate (or a conscious AI) couldn't produce an intentional artwork, but that's an argument for another day.

    Jackson Pollock's works are art because he is not merely spilling or flicking paint at random. He has spent time, effort and resources in developing a technique which creates an effect, and rejects works which don't reflect his intention. It's irrelevant whether you or I like them.

    AI advocates often point to inspiration as a form of copying, and argue that statistical AI tools are no different, but the concept of work negates this argument. When another artist is inspired to create a work of art in the style of Pollock, they must still expend time, physical effort and resources to learn similar techniques and achieve their desired effect, even if their work is less original.

    Midjourney, ChatGPT et al are not conscious and cannot intentionally produce art, no matter how much work they expend. What they represent is a shortcut to the work by those who have the intent but lack either the time or resources to create it. Works created through the significant use of AI tools to imitate the style of other artists should not benefit from the protection of copyright, and the creators of those tools should compensate those artists from whose intentional work they are deriving a benefit, financial or otherwise.

    There might be an argument that AI tools expend effort in the form of processing power, which is passed on to the users of the tools. However, this cost is (i) not paid by the person using the tool because it’s free; (ii) negligible, because the cost of AI tools is insignificant in comparison to the aggregate effort of the original artists on which the AI tools draw; and (iii) divorced from the conscious intent of the person using the tool.

    1. Flamflam28

      Re: Copyright protects the right to derive a return on the intentional work expended to produce art

      This is pretty much the top and bottom of it.

  9. Flamflam28

    The fall of generative AI in the arts is approaching crypto and NFT levels of farce. Think there's only one way momentum takes this now. How soon will folk start abandoning ship en masse?

  10. Cezme

    Banksy deserves half of any settlement

    Because he has earned so much more than the common artist has. This is called scarcity-based thinking which advanced AI can end, the meme: https://www.genolve.com/design/socialmedia/memes/writers-artists-want-compensation-for-training-the-LLM
