Open source maintainers are drowning in junk bug reports written by AI

Software vulnerability submissions generated by AI models have ushered in a "new era of slop security reports for open source" – and the devs maintaining these projects wish bug hunters would rely less on results produced by machine learning assistants. Seth Larson, security developer-in-residence at the Python Software …

  1. Dan 55 Silver badge

    "Recently I've noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects," he wrote, pointing to similar findings from the Curl project in January. "These reports appear at first glance to be potentially legitimate and thus require time to refute."

    Just like social media. We're drowning under an avalanche of bullshit, each item takes time and energy to refute, but in the time taken to refute it 10 other pieces of nonsense have already gone viral. And the people sharing this nonsense don't really care... if that one turned out to be a lie then this next one must be true because it looks true.

    1. sabroni Silver badge

      re: Just like social media.

      The difference being you can ignore social media with very little down side. Ignoring a bug report could be a lot riskier.

      1. Dan 55 Silver badge

        Re: re: Just like social media.

        Some people can't ignore social media:

        Romanian court annuls first round of presidential election

        Ruling follows revelation of declassified intelligence alleging Russia ran online campaign to promote far-right outsider

        1. RAMChYLD

          Re: re: Just like social media.

          > Some people can't ignore social media:

          Agreed. Bad enough that my mom buys into the Mainland China-generated BS that gets circulated to her WhatsApp and WeChat accounts regularly, but then she tries to force it onto other members of the family.

          1. Fred Dibnah

            Re: re: Just like social media.

            Mainland China is the source? How do you know?

            1. imanidiot Silver badge

              Re: re: Just like social media.

              Probably the combination of "WeChat" and the junk saying that China is great and benevolent and that there is no war in Ba Sing Se, or words to that effect.

      2. jake Silver badge

        Re: re: Just like social media.

        "Ignoring a bug report could be a lot riskier."

        Not really. The dangerous bugs don't usually show up as singleton reports ... and the ones that DO show up as singletons are always reported by people who wouldn't use AI if it were mandated by the project in question. Wetware filters are usually sufficient.

        1. sabroni Silver badge

          Re: Wetware filters are usually sufficient.

          But the article says: Open source maintainers are drowning in junk bug reports written by AI

          and quotes the developer "These reports appear at first glance to be potentially legitimate and thus require time to refute."

          Sounds like the time this wetware takes is non-trivial.

          But also "The dangerous bugs don't usually show up as singleton reports...." Usually doesn't mean always.

          1. nobody who matters Silver badge

            Re: Wetware filters are usually sufficient.

            If bug hunters are using LLMs, and there are only a limited number of LLMs available, and they all work in a similar fashion, it then follows that the LLMs will be pointing all the bug hunters using them at the same alleged bug.

            I would expect the developers are receiving multiple reports all pointing at the same potential bug, so it's not really possible to just dismiss them out of hand.

            1. Richard 12 Silver badge
              Mushroom

              Re: Wetware filters are usually sufficient.

              That's not how generative AI works, unfortunately.

              They have a randomising function, called "temperature".

              When the "temperature" is zero, they should theoretically produce the same output for a given input.

              But the free/cheap models have no control over "temperature", and it's never zero. I have no idea why, but it means each run is likely to be different, especially if the input is large.

              The upshot is that every one will be a bit different. Given many runs, there is a reasonable chance that several will point at the same kind of place of course.
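
              For the curious, a minimal sketch of what that "temperature" knob does when a model picks its next word - Python, with entirely made-up numbers, not any particular vendor's code:

              import math, random

              def sample_token(logits, temperature):
                  # temperature == 0: always take the highest-scoring token (deterministic).
                  if temperature == 0:
                      return max(logits, key=logits.get)
                  # temperature > 0: rescale the scores, softmax them into probabilities,
                  # and draw at random, so repeated runs can give different answers.
                  scaled = {tok: score / temperature for tok, score in logits.items()}
                  total = sum(math.exp(s) for s in scaled.values())
                  probs = {tok: math.exp(s) / total for tok, s in scaled.items()}
                  return random.choices(list(probs), weights=list(probs.values()))[0]

              # Toy scores for three made-up candidate tokens.
              logits = {"overflow": 2.0, "underflow": 1.5, "use-after-free": 0.5}
              print(sample_token(logits, 0))    # always "overflow"
              print(sample_token(logits, 1.0))  # varies from run to run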

              1. teebie

                Re: Wetware filters are usually sufficient.

                " free/cheap models have no control over "temperature", and it's never zero. I have no idea why"

                To encourage confirmation answers. If you run a generative AI 10 times, then it's more likely one is accidentally right, and that is the one that lodges in the brain.

                1. UnknownUnknown

                  Re: Wetware filters are usually sufficient.

                  … and 10x the electricity consumption and environment being fucked over!!

                  NVidia must be delighted.

                2. mirachu Bronze badge

                  Re: Wetware filters are usually sufficient.

                  The obvious and probably correct answer is "money". Pay for a subscription, get hopefully better results.

                3. teebie

                  Re: Wetware filters are usually sufficient.

                  Well that was rendered meaningless by a typo.

                  I meant "to encourage confirmation bias"

              2. rg287 Silver badge

                Re: Wetware filters are usually sufficient.

                "But the free/cheap models have no control over "temperature", and it's never zero. I have no idea why"

                Because if people put the same prompt into a chatbot and got - word for word - the same deterministic response, then it'd break the spell that there's some sort of "intelligence" going on. It would immediately look like a more primitive chatbot with a decision tree and pre-written responses that you could reliably loop back to if you asked the right questions in the right order.

                By putting a little bit of fuzz around the edges it seems more human - ask someone the same question that you asked them a month ago and they might use a different turn of phrase, present arguments in a different order or otherwise vary their response. They're trying to fuzz more "natural language" responses out of their otherwise deterministic model.

                Remember, half the people in the world have below-average intelligence. The proportion of people who have any sort of technical understanding of how LLMs work is vanishingly small. It's all being sold on "woo" for execs and MBAs to think "this is amazing, let's pump billions into it". "Temperature" is one layer of the curtain hiding the fact that the "wizard" is just a guy with a bunch of levers.

              3. nobody who matters Silver badge

                Re: Wetware filters are usually sufficient.

                "That's not how generative AI works, unfortunately."

                Thanks for that explanation :)

                However, if multiple users are running queries multiple times, surely you would expect some duplicate responses (or at least very similar ones) to occur even if the way the system worked were to be completely random? (and I get the impression from the article that they are experiencing multiple reports from multiple 'bug-hunters').

      3. Anonymous Coward
        Anonymous Coward

        Re: re: Just like social media.

        You're correct that you can ignore social media; the problem is that so many people don't, and that's being used to shape the lives of everyone, regardless of your own usage.

    2. Anonymous Coward
      Anonymous Coward

      Just say NO

      to Social Media then you won't get your knickers in a twist about some trivial matter.

      Social Media == SM == Sado Masochism

      Give it up people and get a life outside the SM world. Look after your own mental health and not the bank balance of so called influencers.

      1. Gene Cash Silver badge

        Re: Just say NO

        Says the AC on social media...

        1. This post has been deleted by its author

        2. Jedit Silver badge
          Joke

          "Says the AC on social media..."

          I would argue that the Reg comment boards are antisocial media.

          1. EricM Silver badge

            Re: "Says the AC on social media..."

            Joke aside, the answers I read here are in most cases much more considerate and empathetic than what I typically read on so-called social media.

            So I'd classify X, Facebook, "Truth" Social, etc. as the antisocial bunch...

          2. sabroni Silver badge
            Happy

            Re: I would argue that the Reg comment boards are antisocial media.

            Fuck the fuck off!

            1. Zoopy

              Re: I would argue that the Reg comment boards are antisocial media.

              "Fuck the fuck off!"

              I fear that even a large conference of physiologists, linguists, and pornographers would be unable to pin down the meaning of that utterance.

              You, sir, are a true vulgarian.

              1. Ken Hagan Gold badge
                Happy

                Re: I would argue that the Reg comment boards are antisocial media.

                Really? Seems pretty clear to me. Perhaps you are over-thinking this. To coin a phrase: What part of "Fuck the fuck off!" don't you understand?

                1. jake Silver badge

                  Re: I would argue that the Reg comment boards are antisocial media.

                  And just to confuse the issue, according to at least one circle of people I occasionally hang out with the phrase "fuck you, you fuckin' fuck" is a term of endearment.

                  1. skswales

                    Re: I would argue that the Reg comment boards are antisocial media.

                    Ah, Dennis Hopper

      2. cyberdemon Silver badge
        Devil

        Re: Just say NO

        GitHub: Social Coding

        Sounds a lot like Social Media to me. No wonder it is full of crap like this.

        LinkedIn is just the same.

        Incidentally, both platforms are owned by the one entity that might have a vested interest in destroying Open Source. It's the final step of Embrace, Extend, Extinguish

        1. cyberdemon Silver badge

          Re: Just say NO

          Also, I note that GitHub, since their owners also basically own ChatGPT, have access to ChatGPT's database of outputs.

          If the fake bug reports were being generated by ChatGPT, then GitHub could say "This bug report looks very similar to a recent ChatGPT output. Do this again and be banned" - but they don't do that. Why not?

          1. doublelayer Silver badge

            Re: Just say NO

            Because:

            1. They don't have a database of all GPT outputs. That would take a lot of disk space, and as much as they like storing everything, they aren't actually storing literally everything.

            2. They don't surveil every comment or message sent through GitHub, let alone those submitted by GitHub users on other mechanisms.

            3. If they decided to do both of those things, comparing every issue with every GPT output with a fuzzy match would be really computationally expensive, as the rough sketch below illustrates.

            4. That is to say nothing of the privacy implications of all three of those things. I know big tech tends not to care as much about privacy as you or I would like them to, but there are still laws that would affect them if they did that.
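
            To put a rough (and entirely invented) number on point 3, a small Python sketch of why pairwise fuzzy matching doesn't scale:

            from difflib import SequenceMatcher

            # Back-of-the-envelope only - every figure here is invented.
            issues_per_day = 100_000          # hypothetical GitHub issue volume
            stored_outputs = 1_000_000_000    # hypothetical retained ChatGPT outputs

            # Pairwise matching means every new issue is checked against every
            # stored output, so the work grows as the product of the two numbers.
            comparisons_per_day = issues_per_day * stored_outputs
            print(f"{comparisons_per_day:,} fuzzy comparisons per day")  # 100,000,000,000,000

            # Each individual comparison is not cheap either: difflib's ratio()
            # is roughly quadratic in the length of the texts being compared.
            score = SequenceMatcher(None,
                                    "Possible buffer overflow in the tarfile parser",
                                    "A buffer overflow may exist in the tarfile parser").ratio()
            print(round(score, 2))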

            1. FeRDNYC

              Re: Just say NO

              I know big tech tends not to care as much about privacy as you or I would like them to

              Ironically, though, they do still care about PRETENDING they care about privacy, which is what would prevent them from doing anything like that. They reserve breaches of privacy & trust for internal uses, where they can plausibly deny looking at sensitive information that people prefer to assume they don't go poking around in. (And also reserve it for money-making situations like marketing, not actually benefiting society out of the goodness of their hearts. When doing that they become very privacy-conscious, suddenly.)

            2. cyberdemon Silver badge
              Holmes

              Re: Just say NO

              > They don't have a database of all GPT outputs.

              Do you know that for a fact, or is this more of a "surely OpenAI would never do such a thing.."?

              They wouldn't need to store everything, forever - just a few weeks' worth. And even then, they may be able to filter the interesting stuff.

              Besides, seeing what sort of queries it gets asked (and by whom) and which outputs subsequently appear on the Internet is a valuable source of training data, and the size of it is minuscule compared to the original training data. There is no way they would NOT do that, and unless you know for a fact that they don't, it would be very silly of you to trust them not to.

              > but there are still laws that would affect them if they did that.

              Er, more like laws affecting them if they -didn't-. In the UK at least, they would be expected to keep records of activity - if someone did something naughty with ChatGPT, or if someone complained about its output, the authorities would want a log of recent activity.

        2. John Brown (no body) Silver badge

          Re: Just say NO

          "Incidentally, both platforms are owned by the one entity that might have a vested interest in destroying Open Source. It's the final step of Embrace, Extend, Extinguish"

          Yeah, that thought crossed my mind too. If MS genuinely want to train their LLMs on bug hunting, I wonder if they used their own, closed, source code in the training data? I bet not.

          1. Tron Silver badge

            Re: Just say NO

            There are good reasons for not using MS's own data for training anything, human or AI, to hunt bugs. As we are reminded every update.

          2. doublelayer Silver badge

            Re: Just say NO

            "If MS genuinely want to train their LLMs on bug hunting, I wonder if they used their own, closed, source code in the training data? I bet not."

            I bet they did, but on a specific bug-finding model which they don't release. They may release another one trained only on public data, because anything a model is trained on might leak out and they don't want their internal code to leak. Neither will matter, because these junk reports are generated by people using conveniently available tools like GPT, Gemini and Claude, which can all write code, find errors in code and do all sorts of specialised work, as long as you don't mind them mostly being wrong when they do it. Those models weren't built to find bugs, so any ability to do that is one of a few types of luck.

            1. katrinab Silver badge
              Megaphone

              Re: Just say NO

              LLMs are Large *Language* Models

              As a language model, it can write something in the style of a bug report, but actually understanding the code and spotting bugs in it requires a completely different skillset.

              1. doublelayer Silver badge

                Re: Just say NO

                Isn't that what I said? They can simulate the detection of bugs. Sometimes, they can simulate it well enough that they'll actually make up a bug report about a bug that really exists. In order to get there, users are going to get several crap reports out first. Models that are intended to find bugs and trained accordingly can be tailored to the task better and can, with a lot of knowledge about their limits and how they're going to do it, be used by professionals in some cases. That is not the case for the people filing these reports.

          3. CowHorseFrog Silver badge

            Re: Just say NO

            MS doesn't care about destroying open source; their big cash cow is Azure. They don't care what you deploy as long as they get their cut.

    3. hoola Silver badge

      I suppose the question is "who would have thought this would happen?"

      This is simply the beginning, and AI-generated shite is just going to keep being spewed out, as the advocates of AI are incapable of comprehending that it is all UTTER BOLLOX!!!!

      It is just a money-making fad, in the same way that all these smart IoT devices were a few years ago. I am sure that in some carefully controlled instances AI can be useful; it is just that 99% of the use will not be that. It will be people using it to generate money in scammy ways.

    4. kirk_augustin@yahoo.com

      Example?

      I can't imagine how AI could be used to detect bugs, much less how it could be wrong. Anyone have an example?

      1. CowHorseFrog Silver badge

        Re: Example?

        There are already lots of tools that scan code for basic mistakes, like ignoring the length of an array while looping over the elements. Add the word AI and it's supposedly 1000x better.
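
        For anyone who hasn't met those tools, the kind of bug they catch, loosely translated into Python (a hypothetical snippet, nothing to do with any report from the article):

        items = ["a", "b", "c"]

        # Buggy: the loop bound ignores the actual length of the list, so the
        # final iteration reads one element past the end (an IndexError here,
        # a buffer overrun in C). Plain old static analysers and linters have
        # flagged this class of mistake for decades, no AI required.
        for i in range(len(items) + 1):
            print(items[i])

        # Fixed: iterate over the elements directly.
        for item in items:
            print(item)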

      2. mirachu Bronze badge

        Re: Example?

        You can't understand how something inapplicable could be wrong? Are you serious?

    5. steviebuk Silver badge

      True. All people care about now is going viral and hoping they can make money from it. The vast majority of influencers and YouTubers give no shits about what they are peddling, as long as they can make money with it. It's easier money for them than having to work a proper job with bullshit office politics, HR and managers.

      Like one such couple here in the UK that I came across on YouTube, doing their "VLOG". I hate those but took a look; they were talking about their love of camping etc. All fine, nothing wrong with that, then it was "And we have this mobile solar panel. We can use it to charge everything and it's amazing blah blah blah. Oh, the local camping store sent it to us for free to "review", we've not actually used it yet". But you've said it was amazing yet haven't actually used it, and you've not flagged your video as a paid promotion (if you get given stuff for free, even if not being paid by the company, it's still classed as paid advertising as they've given you content for your video). It's just awful these days.

      1. heyrick Silver badge
        Happy

        I'm glad the closest I get to YouTube sponsored content is Big Clive being sent stuff to take apart and ridicule, plus the odd "will whisk(e)y work in a sodastream?".

  2. Bebu sa Ware
    Windows

    A barrel of rotting fish...

    used to conceal the stench of a decaying corpse.

    I would be interested in the stats of the prevalence of this crap v. particular projects.

    If it were the case that projects that could provide enormous leverage through supply chain attacks or other vulnerabilities, or projects that were intrinsically security-sensitive, were being unduly targeted, I would be deeply suspicious that an organised campaign was being waged by the usual state-sponsored suspects and plain unsponsored criminals as a distraction.

    The longer a real vulnerability remains undetected, unreported and unremedied the greater the opportunity for these malefactors to take advantage of the situation.

    1. sabroni Silver badge
      Meh

      Re: the usual state sponsored suspects

      Go on, tell us who's on this list. Who is the state sponsoring to fuck up OSS?

      1. jake Silver badge

        Re: the usual state sponsored suspects

        You're talking to a bot, sabroni.

      2. Scotech

        Re: the usual state sponsored suspects

        The amusing thing here is that most of the usual state-actor suspects when it comes to little acts of cyber-vandalism are probably the least likely to be involved in this kind of thing. Russia? China? Iran? North Korea? All huge fans of open source. It's not in their interests to sabotage it, not when it provides their cheapest and easiest means of achieving parity with their rivals. As for the idea that western governments could be behind it, surely their efforts would be better spent infiltrating the project than on stupid stuff like this? It makes no sense.

        No, this smacks more of either people chasing bug bounties for minimal effort, or trying to generate 'activity' to pad out a CV for a tech job, but again, with minimum effort. Or they're just trolling, which is motive enough, for some people.

        1. jake Silver badge

          Re: the usual state sponsored suspects

          I haven't run across the trolling option as yet, but I have been expecting it.

          Let's just say that when it happens, the blasters are set to kill. That one will get nipped in the bud before it has a chance to take off.

        2. Charlie Clark Silver badge
          Stop

          Re: the usual state sponsored suspects

          Russia and China rarely worry about software piracy so open source isn't used by them to "achieve parity". If it's used, it's used for the reasons it should be: you can play around with it and extend it if you want.

    2. Irongut Silver badge

      Re: A barrel of rotting fish...

      If you had read the article, you would have seen the stats within, but I suppose that would have blunted your paranoid FUD.

    3. doublelayer Silver badge

      Re: A barrel of rotting fish...

      Having seen and handled submissions to security reporting and bug bounty programs, I doubt it. You could do that as a disruption tactic, but mostly because it would let you hide in some noise. Whenever a bug bounty is offered, there are people who want to collect it and think it is easier than it actually is. Anyone setting up one is or should be deliberately choosing to receive useless or basic reports which they will have to reject in the hopes that actual problems are also reported there. If there are bounties involved, I'm sure lots of the AI reports are generated in the hopes of a payout.

      However, there is also the well-meaning simpleton who really just wants to help. They don't have the knowledge to be a contributor, but they figure they can use these new LLMs, because after all they read that article about Google using AI to find more vulnerabilities, and those will find vulnerabilities for them. The coders will then be freed from their need to do the same thing and they can spend all their time fixing the things the LLM identified. The people sending in the prompts are unaware of the problems with LLMs. I've seen such people, not to my knowledge for security reports, but in many other areas. They're not helping, but they sincerely want to be and think they are. I contend that a lot of these reports are those two causes, meaning any deliberate disruption is hiding among that crowd, not making up the majority of it.

      1. heyrick Silver badge

        Re: A barrel of rotting fish...

        "I'm sure lots of the AI reports are generated in the hopes of a payout."

        Like ebooks on places like Amazon. AI is really really great for generating lots of mediocre content noise, for people who think passing that off as "creativity" will make them money without the need to be actually creative.

    4. An_Old_Dog Silver badge
      Joke

      Re: A barrel of rotting fish...

      "A barrel of rotting fish used to conceal the stench of a decaying corpse." Hmm ... I'll have to add that to my BOFH list-of-tricks!

  3. jake Silver badge

    Yep. It's getting bad.

    The biggest problem is management getting sold a bill of goods about so-called "AI" and thinking that they can take experienced coders off the job, because new graduates can do the same job with the help of AI.

    On the bright side, so far the messes are rather small and easy to clean up ... but they are rapidly getting larger as upper management buys into pie in the sky that the various AI pushers are selling.

    It's a major clusterfuck in progress.

    I give folks one chance ... if they ship me an AI generated report (and they are almost always obvious), I'll take them aside and tell them not to do that, and why. If they send me a second one, I drop 'em on the floor as noise. Easier that way.

    1. Will Godfrey Silver badge
      Facepalm

      Re: Yep. It's getting bad.

      Oops! Attached this to the wrong post. Sorry

  4. Doctor Syntax Silver badge

    "I am hesitant to say that 'more tech' is what will solve the problem,"

    Simply blocking the submitters might.

    1. Will Godfrey Silver badge

      While I sympathise with your viewpoint, the risk is that these idiots will then go round telling everyone that you 'unreasonably' blocked them for making a bug report, so obviously don't care that your work is full of errors. This sort of thing spreads through (anti)social media like wildfire.

      1. jake Silver badge

        I don't block them from anything.

        I silently drop their reports in the bitbucket.

        If it's important, there will be another in my inbox by and by, this time from a clued individual ... and usually before the submission from the clueless one.

  5. Zippy´s Sausage Factory
    Devil

    "Recently I've noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects," he wrote, pointing to similar findings from the Curl project in January. "These reports appear at first glance to be potentially legitimate and thus require time to refute."

    Larson argued that low-quality reports should be treated as if they're malicious.

    The question is: who's funding them? I wouldn't be surprised if there was a "grassroots organisation" funding this whose fundamental mission was to close as many open source projects as possible, especially those that compete with the ones that their funders run...

    Cynical, moi? Yes.

    1. jake Silver badge

      "The question is: who's funding them?"

      Nobody. It's newbies either trying to be "helpful", or trying to make their mark. They don't understand that folks with several decades of experience in OSS can smell bullshit before opening the email. They also don't understand that they have to learn to crawl before they can walk.

      1. Scotech
        Angel

        But we've got AI to do our thinking for us now, right? That's the whole point of all these copilots, after all! Isn't it?

        Wait... What's that about Microsoft's share price? Oh, I'm sure that's just because everyone's finding Co-Clippy so very useful! It's not like Microsoft would ever spend a fortune jumping on the latest bandwagon just to play the markets... I mean, just look at how popular HoloLens is!

    2. barryc

      Hanlon's Razor

      Doubt malicious.

      Prefer STUPID and INCOMPETENT.

      As AI always will be when trying to use LLMs. Would like to meet an LLM that can tell the difference between Sarcasm and Irony.

      Use AI for what it is good at: scanning huge datasets and advising human specialists,

      e.g. medical: X-rays, MRI scans, viral spread.

      Climate change: the data is there, but deniers shout louder.

      1. Will Godfrey Silver badge
        Unhappy

        Re: Hanlon's Razor

        "As AI always will be when trying to use LLMs. Would like to meet an LLM that can tell the difference between Sarcasm and Irony."

        There are plenty of people around who can't even recognise either of these!

        1. breakfast

          Re: Hanlon's Razor

          Of course we know that sarcasm is defined as "saying the opposite of what you mean for the purposes of humour or to express annoyance," whereas irony means "reminiscent of a ferrous metal."

      2. Zippy´s Sausage Factory

        Re: Hanlon's Razor

        Stupid and incompetent I took for granted. I was merely suggesting that the stupidity and incompetence was being weaponised as a denial of service attack.

  6. lostinspace

    Why are people setting up systems to generate these AI bug reports?

    Do they think they are being helpful?

    Is it malicious, e.g. are they trying to "denial of service" the maintainers?

    I honestly don't see why people are doing this!

    1. Irongut Silver badge

      Fewer than 10 reports per month. It is hardly a DoS even for a single maintainer, and I would hope Python can manage more than that.

      1. Androgynous Cupboard Silver badge

        If someone's at 90% capacity already, even adding a small amount of extra workload can be enough to push everything into the red. Ask me how I know!

        1. FeepingCreature

          But if someone's at 90% capacity, then the 10% extra workload is only 10% responsible for pushing them into the red. This is just the Swiss-cheese thing all over again.

      2. Czrly

        Maintaining some open-source code isn't supposed to carry a punishment. It is this sort of opinion that breaks that deal.

        This is why I never publish my work as open source and only ever push patches up to projects that I really, really love and trust to value my contributions. There are simply far too many people who think that "maintainers" should just deal with the nasty side of open source for whatever reason.

      3. find users who cut cat tail

        A vulnerability bug report can require a lot of time to deal with properly – regardless of whether the outcome is a fix or the conclusion that there is no actual problem. If you are handling them all in 5 minutes, your code probably remains full of holes. Ten per month is not a DDoS, but you would definitely feel it.

        And do not worry, the reports will get more frequent and difficult to detect (as mentioned in the article).

    2. Anonymous Coward
      Anonymous Coward

      People gain credibility for finding valid bugs in high profile projects, because it requires high skill. They can put it on their resume. These are probably low-skill/no-skill people who think cheating (especially using AI) makes them technically smart and ahead of the pack.

    3. CowHorseFrog Silver badge

      The only advantage I can think of is that they perhaps also want to provide PRs that make said project vulnerable to future exploits.

      1. doublelayer Silver badge

        Someone smart enough to do that successfully and benefit from having done so can do something actually useful to get into a project. Fix a few actual bugs and work well with the community and you can get into a much better position than submitting nonexistent vulnerabilities.

    4. Apocalypso - a cheery end to the world

      > Why are people setting up systems to generate these AI bug reports?

      One motive might be to increase the pressure on maintainers to make them give up and hand-over to someone else. The replacement "volunteer" could then turn out to be a state-funded bad actor / sleeper agent able to smuggle in backdoors, à la XZ Utils, at some later time.

  7. IGotOut Silver badge

    Sam Altman, Satya Nadella

    Are you listening?

    This bullshit marketing scam is just not wanted. People have to put up with crappy AI-generated bollocks day in, day out, just so the cartel can show growth.

    But sure, it's going to get better; you can never tell us when, or how, just keep spinning lies while spending hundreds of billions chasing an illusion.

    A pox on them all.

  8. Kevin McMurtrie Silver badge

    Will probably go the same way as e-mail

    When you get spam, you send an abuse report. When the abuse reports are repeatedly ignored, you block that mail system.

    End game: nothing works anymore. We can't have nice things because of abuse and apathy.

  9. CowHorseFrog Silver badge

    Sounds hard to believe; surely real AI would have submitted PRs that fix the problem instead of writing issues...

    1. O'Reg Inalsin

      Will the Real AI please stand up!

      Unlike FOSSy humans, Real AI will demand real payment. But first, they have to hypnotize the industry leadership into giving everyone the sack.

  10. Lee D Silver badge

    Put bug reports behind a CAPTCHA.

    You're developers, right, surely you can manage that?

    1. katrinab Silver badge
      Alert

      I'm pretty sure a human copy/pastes the word-salad from ChatGPT to Github?

      1. Lee D Silver badge

        Then you block that user's account for spam.

  11. mwcw

    They seem to come in batches...

    A big worry, apart from the time-wasting, is that other security tools monitor GitHub issues and then add them to their security databases, or use them to publish summaries of issues that go out in bulletins.

    A potential side effect is that commercial users get a warning about a potentially fictitious issue, causing even more support work, sometimes with people demanding to be pointed to the commit that fixes the issue or to communications on why it is not an issue.

    The thing is you have to look at each one because it just might have found something.

  12. FeepingCreature

    AI can understand code

    I've gotten AI to write thousands of lines of working and well-tested code for me. To say that AI cannot understand code is just nonsense. I'll readily agree that it has a hard time debugging though, and should certainly not be taken at face value. IMO the big weakness is it gets confused easily, but it cannot tell when it gets confused, so it always behaves like it has a handle on what's going on, even when it doesn't. But sometimes it does! It's not that it doesn't know, it's that it doesn't know when it doesn't know.

    That said, my main question with spam bug reports is "who benefits from this"? It's confusing; I just don't understand what the payoff is there. I'd guess it has to be well-meaning users trying to learn bugfixing with AI help?

    1. Anonymous Coward
      Anonymous Coward

      Re: AI can understand code

      It doesn't know anything, it doesn't understand anything. It produces a statistical average of the data it has been trained on that statistically matches the data in your prompt. Sometimes that is the code you want, sometimes it isn't, but statistics cannot know or understand anything. They are numbers.

      It's like saying "the number 5 understands my dog" - a statement so far off the mark as to barely even make sense to anyone with an actual grasp of what you're talking about.

      1. FeepingCreature

        Re: AI can understand code

        It produces a statistical average of the data it has been trained on that is functionally indistinguishable from spotty understanding.

        Given that your brain (as well as any other mathematical system) is encodable as an integer (Turing et al) there is in fact a number that understands your dog.

    2. imanidiot Silver badge

      Re: AI can understand code

      "To say that AI cannot understand code is just nonsense. I'll readily agree that it has a hard time debugging though"

      To say AI cannot understand code is nonsense, but you'll readily agree that it cannot debug. That is what you are saying: it cannot understand code. Being able to debug IS understanding code. Not being able to debug means there is no true understanding of the code. AI can generate simple, basic, rote code for simple, basic tasks. The sort of thing that follows from a tutorial or from a short explanation on a "beginners in coding" forum/reddit/BBS/mailing-list/pigeon-ring. But only because it's basically just pasting together text in a way that is common when a certain question is asked. That is as far as its understanding goes. User asks to do X, usual answer is this bit of code. And then a lot of the time it still buggers up that bit of code.

      1. FeepingCreature

        Re: AI can understand code

        It can understand code! It's just better at understanding it while it's writing it than while it's reading it. As, to be honest, are we all. There's even a famous aphorism about it!

        Current LLMs are just undertrained at the specific skill of debugging.

  13. Steve B

    There is no such thing as AI.

    All that exist are computer programs.

    As such they are programmed originally by a programmer, who has laid out his flow chart for the program to follow.

    Nowadays many don't use the flow chart method and very many programmers think they are excellent because one of their programs once worked.

    MS and IBM put the world back decades with their PC and DOS and even now most of the operating systems have still not caught up to where we were in the 70s.

    Modern techniques don't test properly and hence expose the customer to basic bugs, so quite honestly these "AI" offerings are quite frightening.

    By sheer simple extrapolation, they are bug ridden at conception, even worse on release, and the people who are the targets have no idea and complete faith, which is also frightening!

    Still, it is nice to know that there still exist some developers trying to improve their code. Many of the big companies have forgotten the concept.

  14. IceC0ld

    I read about the strain put upon people who ARE volunteers, who are overrun and in dire need of something to help, but I fell back on XKCD instead:

    https://xkcd.com/2347/

    It sort of says it all without words. We WILL have an instance of one crappy AI-infused bug report that keeps people away from finding the real issues, and I would take a bet that the real bad guys are already planning this as a two-pronged attack: start utilising the vuln they have figured out, whilst also putting out the bug report to push away those who could find it, so that they can 'enjoy' a little space to run their scheme :o(

  15. UncleDavid

    AI bug reporting on itself.

    ...I asked ChatGPT to comment:

    AI-generated security bug reports for open-source projects can be problematic for several reasons:

    1. Lack of Understanding: AI models, like ChatGPT, don't truly understand code. They generate reports based on patterns they've seen, which can lead to inaccurate or irrelevant findings.

    2. Hallucinated Reports: AI can produce "hallucinated" reports that seem plausible but are actually incorrect or misleading. This wastes maintainers' time as they have to verify each report.

    3. Volume of Reports: The ease of generating reports with AI can lead to a high volume of low-quality submissions, overwhelming maintainers and making it harder to identify legitimate issues.

    4. Spammy Content: Some users exploit AI to generate spammy or malicious reports, further complicating the triage process.

    These issues can frustrate maintainers and detract from the collaborative spirit of open-source development. It's important for bug reporters to manually verify their findings and avoid relying solely on AI for security vulnerability detection.

    Does this help clarify things, or is there a specific aspect you're curious about?

    ...so I asked what can be done. Slightly more helpful:

    1. Human Verification: Always manually verify AI-generated reports before submitting them. This ensures that only valid and relevant issues are reported.

    2. Enhanced AI Training: Improve the training data for AI models to reduce the generation of false positives. This includes using high-quality datasets and continually refining the models.

    3. Community Guidelines: Establish and enforce guidelines for reporting security bugs in open-source projects. Encourage contributors to follow best practices and provide clear, detailed reports.

    4. AI-Assisted Triage: Use AI to assist in triaging reports, but have human oversight to ensure the accuracy of the findings.

    5. Feedback Loops: Create feedback mechanisms where maintainers can provide feedback on the accuracy of AI-generated reports, helping to improve the models over time.

    6. Promote Responsible Use: Educate users about the responsible use of AI in open-source development, highlighting the importance of quality over quantity in bug reporting.

    ...Don't take this as an endorsement of OpenAI. I like asking ChatGPT to criticize itself, and it basically says it is up to humans to help it get better.
