AI code helpers just can't stop inventing package names

AI models just can't seem to stop making things up. As two recent studies point out, that proclivity underscores prior warnings not to rely on AI advice for anything that really matters. One thing AI makes up quite often is the names of software packages. As we noted earlier this year, Lasso Security found that large language …

  1. CowHorseFrog Silver badge

    Someone should tell the AI that generated this story to include the wacky names its brothers generated; that would be an interesting read.

    1. Irongut Silver badge

      All 205,474 of them? Maybe in an external link but no-one wants that in the body of an article.

      1. CowHorseFrog Silver badge

        No, 205,473 should be enough.

    2. Blazde Silver badge

      Seeing as they're just package names that seem plausible to the AI model, they are presumably no more interesting or wacky than a list of real package names.

    3. katrinab Silver badge
      Alert

      They are not obviously wacky names. They are the sort of package names that could exist, and if you are not intimately familiar with what's available, they sound very plausible.

  2. abend0c4 Silver badge

    If security is dependent simply on the name of a piece of software...

    ... then a random name generator is a symptom, not a cause, of the failure.

    1. AndrueC Silver badge
      Boffin

      Re: If security is dependent simply on the name of a piece of software...

      This is a problem I was aware of decades ago, back when I was writing software in Borland C++ and explicitly loading my own DLLs (the safest and least troublesome way). We had a standard module that all applications/DLLs included. This handled locating, verifying and loading DLLs, and performed a handshake test during API initialisation.

      Now this was written a long time ago and the two verification processes (MD5 of DLL stream, exchange of MD5 between application and DLL) could probably be faked but then I was not writing critical software. I was just writing data recovery software and had an innate suspicion of anything 'not invented here'. The idea of just loading a stream of bytes into process memory space and calling into it worried me.

      I guess I was always just a suspicious git.

      Explicit DLL loading also meant we preferred to have a few API functions that took command IDs rather than dozens of exported functions. This made tracing easier and also meant only a few places to put validation checking. It also made version changes less likely to break things and even allowed us to produce an RPC-like module for remote execution.
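
      A minimal sketch of the verify-before-load idea described above, in Python rather than Borland C++. It swaps the MD5 mentioned in the comment for SHA-256, and the names `file_digest` and `verify_before_load` are hypothetical, chosen only for illustration. It covers the "hash the module's byte stream" step, not the application/DLL handshake.

```python
import hashlib
import hmac
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_load(path: Path, expected_hex: str) -> bytes:
    """Return the module bytes only if the digest matches the pinned value.

    Assumption: the expected digest was recorded at build time and is
    stored somewhere the module's author cannot silently change.
    """
    actual = file_digest(path)
    # Constant-time comparison; both arguments are ASCII hex strings.
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"digest mismatch for {path.name}; refusing to load")
    return path.read_bytes()
```

      The same pattern applies to third-party packages: pinning an expected hash means a package that merely has the right name is not enough to get loaded.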

  3. Pete 2 Silver badge

    Imaginative theft

    If LLMs are making up non-existent package names, how does that sit with accusations they "steal" other coders' work?

    1. Anonymous Coward

      Re: Imaginative theft

      ..That they are both stealing other coders' work AND making things up? It is possible for two accusations to be true.

      1. Pete 2 Silver badge

        Re: Imaginative theft

        Presumably the "stolen" code does not contain made-up library names. So the code that does contain them must be the sole work of the AI. A mark of originality <grin>

        1. Anonymous Coward

          Re: Imaginative theft

          Maybe code helpers (and LLMs) pass their outputs through auto-thesauruses so they "differ" from verbatim-memorized training inputs (while otherwise being copies of them). Applying this to your 1st sentence as an example:

          Seemingly the "purloined" edict fails to encompass concocted archive monikers.

          That could explain (cheaply) why library names end up hallucinated (eg. enriched by thesaurus-equivalence reality augmentation) ...

  4. cyberdemon Silver badge
    Mushroom

    In other news

    Malware written by LLMs makes up the same package names, and auto-publishes to PyPI et al.

    "Don't use AI for anything important" - the last part of that advice is redundant. Is your job important? Is the security of your home computer important?

  5. Bebu
    Headmaster

    I wouldn't trust AI to correctly...

    implement a bubble sort let alone any algorithm more complex.

    If AI is slurping its coding skills from the unwashed of the internet then among the correct implementations of any non-trivial algorithm there are also oodles more not-so-correct implementations to poison the well.

    I would be extremely impressed if I were to give an AI assistant a (formal) specification and it were to return the algorithm(s) used and an implementation with a (formal) proof of correctness with respect to the specification.

    I would dearly love to be extremely impressed in such a fashion (and in this life.)

    1. elsergiovolador Silver badge

      Re: I wouldn't trust AI to correctly...

      LLM is still much more capable than most junior developers.

      1. O'Reg Inalsin

        Re: I wouldn't trust AI to correctly...

        DNA is a program too, and a pretty amazing one at that. Earth, evolution, variety, curiosity, creativity, continuous learning - just to throw out a few keywords. Even the humble tardigrade has more future potential for long term growth than OpenAI, or even the human species.

      2. katrinab Silver badge
        Megaphone

        Re: I wouldn't trust AI to correctly...

        Nope. Strongly disagree.

        LLM is about as capable as the most idiotic loud-mouthed politician you can think of.

        1. Anonymous Coward

          Re: LLM is still much more capable than most junior developers.

          "...the most idiotic loud-mouthed politician"

          TheDonald, bless his grandpa soul, came top on DDG image for that phrase of yourn.

          In the playground, @katrina walks up to @else and says their Mum is a wh0re. And @else says not just a wh0re but a pimp. and your sister and mum are her busiest tricks.

          Translation: Most junior developers are returds. Not their fault - like students. the best grow out of it. So, ChatGPT with that new focus model is much better than 80pc of them. That is good enough.

          The focus model from ChatGPT (on the app called Investigate or something like that) uses an IQ model to pitch the prompt responses to the retard's level. We did some of it on you lot.

          1. Anonymous Coward

            Re: LLM is still much more capable than most junior developers.

            You will never know a hybrid LLM. It is part of that person. Different personas have been used throughout. This is the bosses.

            Coding is a dying art, I'm afraid. Like guitar playing, we can't really do much more. All that comes is a variation of Jimi Hendrix. All that is code is variations on the themes that we 1st/2nd-gen internet made. Look at the p1ss poor #web3 and Web2.0 once the snake-oil salesmen got their hands on it - without a doubt PWAs were whored to the extent that Firefox doesn't BLOODY do them anymore without installing an MS .NET framework and nasty plugins that look risky at best.

          2. Anonymous Coward

            Re: LLM is still much more capable than most junior developers.

            we did it on you.

            And good news is that The Reg has a very valuable resource in you Returds. Better than the swarm of just above average on Reddit (80-100 are the most dangerous).

            /pol/ came out higher IQ on ave. but it is unreliable over 125-30 and not usable unless you like galactic-level ADHD.

            What The Reg can do is IQ profile the Commentards as advertising USP nonsense. The Wired US is closer.

            As an aside, a ChatGPT plugin is near release that can 'IQ level' those who comm with you. Works on phone too. Makes life a lot easier to see a number next to the name. Bet it gets banned.

            1. jospanner Silver badge

              Re: LLM is still much more capable than most junior developers.

              Well we know you believe that IQ is a useful measurement so that’s helpful for anyone reading to classify you.

              1. Elongated Muskrat Silver badge

                Re: LLM is still much more capable than most junior developers.

                I'm not sure that this particular commenter isn't posting the output of some LLM, presumably trained on Twitter. Let's not even start with apparently defining an IQ of 80-100 as "just above average", when most definitions of a quantitative "IQ" specify it as a normal distribution with the average at 100 (actual intelligence is not, of course, a normal distribution, and is more akin to the sort of long-tail distribution you'd get as the product of a power distribution with a normal one).

                Still, these posts bear all the hallmarks of the sort of word salad waffle that "AI" produces.

              2. Anonymous Coward

                Re: LLM is still much more capable than most junior developers.

                IQ is a helpful measure, but only a small number of people are formally tested. However, simple intelligence doesn't imply fitness for any stated purpose.

            2. Anonymous Coward

              Re: LLM is still much more capable than most junior developers.

              You seem like a particularly bad LLM and reply to yourself. Your posts aren't even coherent.

    2. DanielsLateToTheParty

      "I would be extremely impressed if..."

      If you want to converse with a computer in a formal language then maybe give Prolog a go. Personally, I never felt comfortable with it.

      1. desht

        Re: "I would be extremely impressed if..."

        > maybe give Prolog a go

        No.

    3. captain veg Silver badge

      Re: I wouldn't trust AI to correctly...

      Indeed. By definition anything produced by inference from the entirety of the internet is going to be of mediocre quality at best.

      -A.

      1. Anonymous Coward

        Re: I wouldn't trust AI to correctly...

        "...mediocre quality at best."

        Which points to the obvious fact that most people are mediocre. Based on that, AI does a great job. It is what 'we' are. We laugh at its faults, not realising we are mocking ourselves too.

        95% of people are, on the whole, mediocre with flashes of brilliance in something that isn't of great monetary value but still valued.

        1. Jimmy2Cows Silver badge

          Re: I wouldn't trust AI to correctly...

          I wouldn't say that. It mainly points to the fact that most high quality code is not posted freely on the internet. Mostly it's a mix of dubious SO questions and their varied replies, and stuff on some random's public GitHub. Hardly representative of the best coding humanity has to offer.

    4. O'Reg Inalsin

      Re: I wouldn't trust AI to correctly...

      Various AIs have now reached the level where a query about an algorithm will yield summaries/descriptions of various flavors of the algorithm, pseudocode, and some actual references, although it might take a few iterations of phrasing the question to get that information. Used in conjunction with traditional web search, and chasing up references in the references, it is helpful - more helpful than simply writing code with no explanation. It's not what you were asking for, but it's what exists now.

      My concern is what happens when the AI bubble bursts - how much of it is economically viable? It is apparently a gas guzzler. IMO, if AI (in the broader sense) development were being shaped more by economic realities, there would be more emphasis on practical and profitable usage, and that would result in better and more sustainable growth.

      1. Anonymous Coward

        " It is apparently a gas guzzler. "

        Makes my old man's Mark 3 Jag V12 seem like a Nissan Leaf.

      2. PB90210 Silver badge

        Re: I wouldn't trust AI to correctly...

        Caught an ad on the TV the other night where Amazon boasted they were the biggest commercial buyer of renewable energy...

        (unfortunately missed the critical bit that clarified where/when)

  6. elsergiovolador Silver badge

    Opportunity

    If LLM hallucinates a package name, then ask it to describe its API in full and then create an implementation for it.

    Put it on GitHub and reap profits.

    1. Jonathan Richards 1 Silver badge

      Re: Opportunity

      In what way, do tell? This is just like the

      3. ...

      4. PROFIT!

      meme. Wasn't funny for very long back then, either.

      1. elsergiovolador Silver badge

        Re: Opportunity

        Did someone say "fun sponge"?

      2. stiine Silver badge

        Re: Opportunity

        Wrong, it's still funny to this day. Especially the more obtuse 1. is.

  7. Anonymous Coward

    I am getting very bored of the never ending pitch for LLM's !!!

    The fact that LLM's 'invent' things is not news anymore !!!

    The problem is very well known !!!

    Why would anyone use an LLM to do *anything* critical !!!???

    The amount of 'meatsack based checking' that is required to *allow* the use of 'output' from an LLM makes the so called 'value' of LLM's decrease daily !!!

    The promise of LLM's has *NOT* been delivered in any way ... the bubble needs to burst now before something *Critical* is broken.

    Come back when the 'hallucination' problem has been solved 100% ... 50% is not good enough and we are not even that close to solving the problem !!!

    :)

    1. Anonymous Coward

      Re: I am getting very bored of the never ending pitch for LLM's !!!

      "!!!"

      Say no more. Really. I cd insult you and say you are amanfromArsesholes who has updated to a Speccy 48k LLM but I think you are real.

    2. Tim 11

      Re: I am getting very bored of the never ending pitch for LLM's !!!

      LLMs can be used in critical situations but only where it's possible for a human to evaluate the output.

      Several times I have used LLMs to generate a small snippet of code where it was easier than wading through a load of StackOverflow posts or API documentation (e.g. date formatting in Java!!!)

  8. Howard Sway Silver badge

    AI code helpers : the Eric Morecambe of programming

    "I'm using all the right package names. Just not necessarily in the right program"

    1. Eclectic Man Silver badge
      Joke

      Re: AI code helpers : the Eric Morecambe of programming

      The classic (full) sketch:

      https://www.facebook.com/legendarymusicians2020/videos/morecambe-and-wise-andre-previn-the-full-sketch/575707570019833/

    2. Anonymous Coward

      Re: AI code helpers : the Eric Morecambe of programming

      better not then little Ernie.

  9. theOtherJT Silver badge

    I really wish "hallucination" hadn't stuck.

    It's an excellent bit of marketing-speak audience manipulation, but it's still bullshit. It's like "Right-sizing" when what you mean is "Layoffs" - a much friendlier term that's meant to make us feel better about what's actually going on but is really there to obfuscate the truth. "Hallucination" makes it sound like something we should empathize with. It also implies that it's not the system's fault. Hallucinations in humans are caused by illnesses or the interactions of drugs - things from the outside world affecting the poor human's brain causing them to perceive the world in ways other than it actually is.

    That is not what is happening here. The correct phrase is "Making things up" which LLMs do by design and isn't at all because some external influence is messing up their otherwise good intentions. "Hallucinations" in LLMs aren't some disturbing external force. They're part of the design. They're a feature.

    1. Bitsminer Silver badge

      Re: I really wish "hallucination" hadn't stuck.

      Well, "Liar Liar Pants on Fire" is a bit too childish for everyday use, even if accurate.

      Although "Seriously dude?" tends to work informally.

    2. kskropf

      Re: I really wish "hallucination" hadn't stuck.

      Full treatment of the misdirection in 'hallucination'.

      https://www.researchgate.net/publication/381278855_ChatGPT_is_bullshit

      Thanks Harry Frankfurt, sorely missed.

    3. Anonymous Coward

      "They're part of the design. They're a feature."

      there is a spot for you in my marketing team anytime, sunshine.

    4. Elongated Muskrat Silver badge

      Re: I really wish "hallucination" hadn't stuck.

      To put it in plain language, the bullshit generator is working a little too well.

      Given that humans have serious difficulties in agreeing, writing down, and verifying requirements, and it is the human intelligence of human developers that can unravel the mess and (sometimes) produce software that does what a client wants, which usually involves some degree of domain knowledge, and common sense, the odds of any sort of LLM that has been "trained" to produce human-like output, but lacks the fundamental essence of actual humanity, actually producing such useful work, beyond the task of producing boilerplate code (which can usually be automated anyway, if it's a tedious enough of a job to make it worth doing so), approaches zero.

      (tl; dr; - AI has no useful use case)

      It was the great Larry Wall who coined the three tenets of programming - laziness, impatience, and hubris:

      Laziness: If we can't be bothered to do those repetitive tasks (such as producing boilerplate code), then we write a tool to do it for us. Asking "AI" to do it leaves us with the job of checking and verifying it, which is even duller a job than writing the boilerplate in the first place, so the correct solution (not "AI") is to specify and write a tool that does it deterministically.

      Impatience: If the code takes too long to run, then the programmer will find a way of making it run faster. Sure, you could get AI to do it for you, or to optimise your algorithms, but again, you then need to verify the results. You'd be better off reading Knuth, and doing it properly. It's not like efficient algorithms have only just been discovered, and ones that AI invents are likely to either not be faster, or to cut corners, and produce incorrect or approximate results. I'd be astounded if "AI" can ever come up with any new categories of algorithm that are improvements on existing ones, based upon the inputs of existing algorithms, because that would imply a category of mathematical proofs that humans have somehow missed. Doubtful. Again, if, as a developer, you are unable to optimise your code, you need to go and learn how to. "Get gud, scrub".

      Hubris: Excessive pride, achieved by writing well organised, readable, maintainable, well-commented code. Yeah, "AI" ain't doing that. "ChatGPT, write me some gibberish to comment how and why this code does what it does and how it solves the domain problem". Yarp, narp.

  10. This post has been deleted by its author

  11. This post has been deleted by its author

  12. justanotherguynamedtony
    Go

    The mark of true intelligence

    Having never (knowingly) used an AI system, have any of them ever answered "Gee, I don't know the answer to that."?

    1. Richard 12 Silver badge

      Re: The mark of true intelligence

      Nope.

      Earlier today I was watching a junior use Copilot assistant to "help" write their code.

      Every invocation produced code that was either overly complicated, or flat wrong - fortunately, it was mostly inventing function calls that simply don't exist.

      It also generated ridiculously complex code that repeatedly called functions that don't exist.

      He spent more time deleting the suggestions than actually writing anything.

      1. Anonymous Coward

        Re: The mark of true intelligence

        It also misinterprets what you asked for and writes code it thinks you asked for (filled with bad ideas like obvious SQL injections).

        And when you tell it it's wrong, it re-writes more garbage. Sometimes it might even say sorry and repeat the wrong code again; at that point you might be in a dead-end loop.
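
        To illustrate the kind of "obvious SQL injection" mentioned above, here is a small sketch using Python's standard `sqlite3` module. The `users` table and the function names are hypothetical; the first function shows the vulnerable string-splicing pattern and the second the parameterised form that avoids it.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the caller's input becomes part of the SQL text itself.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterised: the value is bound separately and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

        Passing the string `x' OR '1'='1` to the unsafe version returns every row in the table, while the safe version treats it as an ordinary (non-matching) name.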

      2. 0laf Silver badge

        Re: The mark of true intelligence

        Wow an automated bloatware creator, the hardware slingers will be delighted

    2. doublelayer Silver badge

      Re: The mark of true intelligence

      Sometimes it does, but not enough, and only if your prompt takes a few forms. If you ask it a question, it might say it doesn't know, then tack on some suggestions, often not so useful, for things you could try. It's still a lot more common that it will try to answer your question with incorrect or irrelevant information. Sometimes, it even gets it correct. It isn't reliably wrong, just reliably unreliable. With substantial effort, you can get it to do some things frequently, but having seen how easy it is to get it to completely mess up, I would always be worried about the quality of anything it produced.

    3. Anonymous Coward

      have any of them ever answered "Gee, I don't know the answer to that."?

      "have any of them ever answered "Gee, I don't know the answer to that."?"

      Yes, but not in the words of a Redneck though. Ask it ineffable questions. There was a thread about this a few months back.

    4. Dom 3

      Re: The mark of true intelligence

      More to the point I have yet to see one ask for clarification - the canonical example being "tell me about Washington".

      A meatsack will ask - state, city, or person? (Or something else entirely).

  13. Locomotion69 Bronze badge

    LLMs are helpful, but don't use them for anything important

    The way this goes, the phrase needs adjustment:

    LLMs look helpful, but don't use them for anything.

    1. Anonymous Coward

      Re: LLMs are helpful, but don't use them for anything important

      upped but wrong.

      Whether AI is worth the energy burn is more the topic.

      I would say if it runs locally on a PC or powerful phone... yes. 16GB with a £200 NVidia card and an i7 or better will do it nicely for Ollama. Then you can load the models into it. Ollama does let you use other models, like face recog and stick-framing.

      Win11 works well but Mint is much faster. Much. And Docker on Mint doesn't consume memory like it does on Win11.

      RAG it with your persona. Your email, contacts, your phone. Everything you have ever done digitally. Including pictures. This will create your persona. Leave out stuff from RAG that is super personal until SI to SI comes in, in about 2 years.

      luvs and XX

  14. HxBro
    Facepalm

    Expensive random word generator

    …generates random words. Who’d have thought it.

    I spent a good part of my spare time annotating LLM responses to prompts to help train them. Quite often you’d see packages that didn’t exist but looked like perfectly sensible names; I’d then have to go and see whether each one really existed and, if one was found, make sure it actually was real.

    I’m sure it wouldn’t take much to check the list to validate them before they were returned in a response but it would slow down the output a little.
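
    A rough sketch of what such a validation pass might look like for Python packages. It assumes the names are meant to be PyPI packages (which, as a reply points out, the model would first have to know) and uses PyPI's public JSON endpoint, which answers 404 for a project that does not exist. The function names are hypothetical, and existence alone says nothing about whether a package is trustworthy.

```python
import re
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List, Tuple

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def normalize(name: str) -> str:
    """Normalise a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def exists_on_pypi(name: str, timeout: float = 5.0) -> bool:
    """Return True if PyPI knows the project, False on a 404."""
    url = PYPI_JSON_URL.format(name=urllib.parse.quote(normalize(name)))
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other failures are not evidence either way


def split_by_existence(
    names: Iterable[str],
    exists: Callable[[str], bool] = exists_on_pypi,
) -> Tuple[List[str], List[str]]:
    """Partition suggested package names into (found, missing)."""
    found: List[str] = []
    missing: List[str] = []
    for name in names:
        (found if exists(name) else missing).append(name)
    return found, missing
```

    The `exists` parameter is injectable so the check can be run against a local allowlist or a cached index instead of a network call per name, which is where the slowdown mentioned above would come from.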

    1. katrinab Silver badge

      Re: Expensive random word generator

      First it would need to understand that it was talking about an npm / pypi / whatever package

      1. stiine Silver badge

        Re: Expensive random word generator

        Why? Did you specify a particular language/version/platform?

  15. Zippy´s Sausage Factory

    Of course all this LLM comes with no warranty, express or implied, etc etc. So if you use it as a chatbot and it hallucinates deals to your customer that don't exist, you have to honour them. And the "AI" vendor gets off scot free. I just wonder how many similar lawsuits are brewing now that Air Canada lost...

  16. EricB123 Silver badge

    Campbell's Soup Mentality

    "Hallucinations are outputs produced by LLMs that are factually incorrect, nonsensical, or completely unrelated to the input task,"

    Wasn't one of Andy Warhol's more famous quotes "Machines make less mistakes than humans. I wish I was a machine".

    I just have to change with the times, I guess.

  17. trevorde Silver badge

    Easy way to check if a package is valid

    https://npmdrinkinggame.party

    1. stiine Silver badge

      Re: Easy way to check if a package is valid

      Does that list exclude the real but totally malicious packages?

  18. Byz

    LLMs can't be trusted.

    Hallucinations = Lies

  19. This post has been deleted by its author

  20. Anonymous Coward

    Or perhaps someone somewhere who's been unfortunate/careless enough to expose private repos has that package and the LLM has "leaked" it...
