GitHub, Microsoft, OpenAI fail to wriggle out of Copilot copyright lawsuit

The judge overseeing the lawsuit challenging the legality of GitHub Copilot, and its underlying OpenAI Codex model, "borrowing" people's code samples has refused to dismiss two claims in the case and sent most of the other allegations back for revision. The order [PDF] issued by US District Judge Jon Tigar in Northern …

  1. Will Godfrey Silver badge
    Unhappy

    More popcorn please

    This will run and run. My only worry is that Microsoft will probably try to keep the case alive in order to run their opponents out of money.

    1. ITMA Silver badge
      Devil

      Re: More popcorn please

      IBM's favourite tactic, especially with US government antitrust cases.

  2. JoeCool Bronze badge

    Seems like the rulings are competent and well reasoned.

    This in particular: "GitHub and Microsoft do not explain why the rise of internet trolls renders Plaintiffs’ fears of harm unreasonable."

    1. Zippy´s Sausage Factory

      Re: Seems like the rulings are competent and well reasoned.

      I couldn't work out that defence myself. Threatening behaviour is threatening behaviour, and just because there's more of it doesn't render it socially or legally acceptable. If anything, it makes me wonder about the competency of their legal team (or the lack thereof, I should say).

      1. Doctor Syntax Silver badge

        Re: Seems like the rulings are competent and well reasoned.

        "it makes me wonder about the competency of their legal team"

        It shows their legal team is on the ball. Defence will try any possible avenue, hoping to get lucky with some of their attempts, however implausible.

  3. Pascal Monett Silver badge
    Trollface

    "We’ve been committed to innovating responsibly with Copilot from the start"

    Yes indeed. Blatantly stealing code from your own repository is quite the innovation.

  4. Bryan W

    Deja Vu

    This feels a tiny bit like that one time some company had a team of engineers analyse and record the technical features of a software product, hand those notes over to another team and then have that team create a copy of the product and claim that it wasn't a copy.

    1. jilocasin
      Boffin

      Re: Deja Vu

      That would be a clean room re-implementation and is commonly done in software circles. It's how a competitor would make an interoperable product without documentation.

      This is more like one team copied the source code and then handed it to the other team verbatim. The second team then incorporated the original source code, as is, into their product and is claiming that it is now a completely original work.

      Those two situations aren't even close.

    2. JoeCool Bronze badge

      Not at all

      This is a developer reading the source from a screen on her left, and typing it into a screen on her right. But without the copyright notices, and while ignoring the licensing terms.

      1. matjaggard

        Re: Not at all

        It's not that simple, as I hope you're aware. This is somewhere between blatant copying and someone using open source to learn programming.

        1. JoeCool Bronze badge

          Re: Not at all

          My statement captures the crux of the issue, as related to the post I'm responding to, and as I understand the article.

          If you think there is a significant subtlety that I have not captured, please elucidate.

          If I follow your thinking, can you explain WHAT it is that Copilot is "learning"?

          Copyright is not about the algorithm or concept, it is about the expression, the physical representation; the code itself.

          And wholesale copying is illegal. More so if the copyright notice is also removed.

          1. John69

            Re: Not at all

            What Copilot is learning, and this is the same for all LLMs, is what the most likely next word is given the preceding words. How exactly the output relates to the input in a legal sense is something the courts will decide. How similar that process is to human learning is something we shall all have to figure out.

  5. Michael Wojcik Silver badge

    Happier developers?

    You know what would make me happy? If the jackasses running GitHub would stop trying to tell me what I want.

  6. martinusher Silver badge

    Most code is copied, anyway

    It's about time we all stopped pretending that we are in the business of creating original code, of crafting from the very fabric of the universe our totally original creations etc. etc.

    If we were truly honest we'd know that the only reason why we claim code is totally original is either because of hubris or ignorance. The vast majority of code is assembled from building blocks made from existing code. Buried in the code there might be an original algorithm or two or something else truly original but regrettably the vast bulk is just recycled. (...and, yes, as a matter of fact I have been programming for a very long time, long enough to have quite likely made some original code, but you'll never find me going around claiming that my code had been 'stolen'.)

    This rivals in sordidness the whole software patent fiasco, where countless programmers claimed that they'd invented 'the wheel' -- and lots of non-technical types encouraged them, sensing a payout.

    1. Nifty Silver badge

      Re: Most code is copied, anyway

      The English Language ©

    2. Anonymous Coward

      Re: Most code is copied, anyway

      By that logic, we should do away with software licensing entirely - if the code is sufficiently unoriginal to not be protected by copyright, why worry about paying for a license?

      If code is copyrightable then, by definition, developers should have the right to object to this sort of use, and all the more so when it occasionally spits stuff out verbatim.

    3. matjaggard

      Re: Most code is copied, anyway

      This is nonsense. Most code is original; even the move-data type projects have quite a bit of original data mapping and the like. The most commonly used data structures are provided by the languages we use. If you're finding yourself rewriting code that already exists then maybe you should be using a library.

    4. doublelayer Silver badge

      Re: Most code is copied, anyway

      There's a very big difference between patents, where you have to demonstrate that what the code is doing hasn't been done before and is substantially new*, and copyright, where you just have to show that the code was written by you and wasn't copied. A lot of the constructs in code that we write aren't completely unimaginable, and thus cannot be patented, but the code is still written by us and copyright applies. For the same reason, there are a lot of books written these days and in any other days you'd care to mention that show little or no imagination from the author. Even for those where there is some new stuff, pieces of plots, settings, and characters will be similar to things that have already been produced. Those books are still copyrightable. It doesn't matter that they're built from the same set of English words, just as it doesn't matter that most code is built from the same language constructs as everything else. The originality is in the order and structure of those statements, not that every component has been invented from scratch.

      * You have to demonstrate inventiveness, or at least you're supposed to and somebody has the job of verifying that you did. They're not always great at doing that job correctly, but the legal requirement is still there. I think software patents should still be possible, but we likely agree that most of the ones that exist should never have been granted.
