Stack Overflow to charge LLM developers for access to its coding content

Stack Overflow has launched an API that will require all AI models trained on its coding question-and-answer content to attribute sources linking back to its posts. And it will cost money to use the site's content. "All products based on models that consume public Stack Overflow data are required to provide attribution back to …

  1. Pascal Monett Silver badge

    "traffic to Stack Overflow has steadily dropped over time"

    And that will continue in inverse relation with just how much Stack Overflow considers itself a money-making tool versus a public utility.

    The more Stack orients itself towards making money and locking down its content, the less people will go to it.

    Tek Tips is a site that has never changed its objective: being useful to the public. It has competent people in every one of its forums, and I have never had a bad experience on that site.

    Stack, on the other hand, has form in restricting its content unless you pay, thereby declaring its basic intent. Stack is not my first choice destination to solve a problem I might have.

    1. Yet Another Anonymous coward Silver badge

      Re: "traffic to Stack Overflow has steadily dropped over time"

      So stackoverflow starts to find ways to hide publicly generated content behind a paywall. The only question is how you make a suggestively named URL from stack overflow

    2. captain veg Silver badge

      Re: "traffic to Stack Overflow has steadily dropped over time"

      > The more Stack orients itself towards making money and locking down its content, the less people will go to it.

      I miss usenet.

      -A.

      1. druck Silver badge

        Re: "traffic to Stack Overflow has steadily dropped over time"

        Usenet misses you - it is still there.

        1. captain veg Silver badge

          Re: "traffic to Stack Overflow has steadily dropped over time"

          Yes. I expected that response.

          I'm ashamed to say that I don't know how to access it. In days of yore I would simply connect to my ISP's news server. They don't have one any more. What to do?

          -A.

    3. DS999 Silver badge

      Re: "traffic to Stack Overflow has steadily dropped over time"

      I've never heard of Tek Tips, I suspect because stackoverflow is paying to show up in searches and Tek Tips is not.

    4. JLV

      Re: "traffic to Stack Overflow has steadily dropped over time"

      Just had a look at Tek Tips:

      - low volume of questions

      - 90s look and feel

      - minimal formatting, no markdown

      - and...(drumroll)

      replete with very intrusive ads (I especially liked the undismissable modal that covered all the contents of a question).

      Good try at promoting it.

  2. captain veg Silver badge

    user generated content

    "Stack Overflow, however, believes it should be compensated for its content"

    Whose content?

    -A.

    1. Anonymous Coward

      Came here to say the same thing

      Their site was already monetized, and I'm not hearing of an opt-in or copyright checks for the actual people who contributed that work. This is also about more than code, as they have already been mined for tech support chat bots, not just code completion. So it seems a lesser crime than OpenAI or Meta scraping it for free, but the money and the control are still in the wrong hands.

      But this is what will always happen when you entrust what was handled as a public community space to a private company where the public has no stake. If it's the town hall or some new "digital commons" it matters not. If the public doesn't own it, it will be taken from them and sold as soon as it is seen to have value.

      The Stack X community exists as it does due to the community of users; the company is just a steward. If they screw up enough the community will move along, or build something better (which they are uniquely positioned to do), but a better outcome is to give them a fair seat at the table.

    2. Charlie Clark Silver badge

      Re: user generated content

      Like most "network" platforms, the T&C's assign copyright to the platform and you agree to this when you sign up. Personally, given the nature of the content, which rarely contains any real IP, I'm happy with this as I've definitely profited from the help other people have given via the platform Stack provided. The IP is probably a lot less than people provide in their open source projects.

      I know there's a lot of crap on it, but to my mind it's the best example of a crowd-sourced expert system out there. Not only is it content-driven, it's content-focussed and it doesn't doom-scroll or try and offer me stuff I've no interest in.

    3. Lockwood

      Re: user generated content

      Question closed as duplicate

  3. T. F. M. Reader

    Discriminant?

    Does anyone know how Stack Overflow plans to distinguish an AI engine (e.g., Google's) slurping the content into its training set from a search engine (e.g., Google) crawling the same content to index it? Is there a technical way to discriminate between those? Or will it be an honour system?

    Absolutely serious question.

    1. This post has been deleted by its author

    2. Charlie Clark Silver badge

      Re: Discriminant?

      There are now pretty good techniques for discovering whether a model has been trained on content and is thus likely to be breaching copyright. However, I think having a proper API will be sufficient appeal for many as it means they don't need to write or manage scrapers.
