Google teases AlphaCode 2 – a code-generating AI revamped with Gemini

Google's latest code-generating model – AlphaCode 2, powered by its Gemini Pro system and making its public debut on Wednesday – reportedly scored above the 99.5th percentile of participants competing in programming contests online. Researchers from Google DeepMind fine-tuned Gemini Pro on a dataset to beef up its problem- …

  1. Johannesburgel12

    Yep, the numbers are horribly bad

    It generates a million random answers that are then cut down to a single one, and that one is wrong 57% of the time even in an artificial programming contest situation. Repeated ten times over it might cough up something correct, but who chooses which of those ten attempts is correct? Humans.

    This proves Gemini does NOT have the reasoning skills people are trying to attribute to it. It still sucks extremely hard at generating answers, there's no reasoning or understanding there, just randomness. The improvement is in the step that filters out the one answer that's most likely to fool a human into believing it is correct. And even then the human has to do the actual work in the end by running the model several times and filtering out the one answer that is, by sheer luck and the law of large numbers, actually correct.

  2. Anonymous Coward

    Was the Coding Training set

    attributed to an infinite number of Code Monkeys' efforts from Stack Overflow?

  3. HuBo

    Viva la coding gibberish!

    AlphaCode 2 seems to be generating code in the exact opposite way to the methods we have been taught to use when writing computer programs. I'm not sure that I would ever want to advertise, never mind release, a product like that.

    1. Mast1

      Re: Viva la coding gibberish!

      And does it leave the code well documented so that in 40 years' time the same functionality/algorithm can be ported to a new platform or programming language, and verified?

      It would save having to keep COBOL programmers in suspended animation.......

      1. Anonymous Coward

        Re: Viva la coding gibberish!

        Not sure about AlphaCode 2, but pretty much every LLM I've used for coding adds comments to the code, and there are plugins for VS Code that let you use AI to generate documentation from code. Whilst not perfect, they are pretty good and take almost all of the pain out of documenting stuff.

        Through my own testing (and subjective opinions on such things) I have found Mistral Instruct to be the best balance between functionality and readability of code. Code Llama is pretty damned good as well, if a bit schizo at times (every now and then it'll start having a conversation with itself about booking flights or something and it will go on forever arguing with itself; Mistral doesn't do this).

        1. HuBo

          Re: Viva la coding gibberish!

          "conversation with itself about booking flights"

          Scary stuff ... oddly reminiscent of 737 Max ... obliquely ...

          1. Pseudonymous Clown Art

            Re: Viva la coding gibberish!

            Indeed. It is a bit creepy... but it doesn't happen with every LLM server daemon I've tried... pretty much just GPT4All, which is a bit rough and ready anyway.

  4. tiggity

    Interesting choice of C++, given the push for more memory safe languages such as Rust.

    (e.g. also from today

    Even other old, commonly used languages, such as Java, C# etc. would have been a better choice than C++ in "safety" terms

    I'm guessing that choice is because it's far easier to slurp lots of C++ code samples (and lots of "code challenges" have C++ as a language that can / should be used)

  5. FeepingCreature

    If you can do it once in a million, you can train on it.

    As long as probability of success is finitely greater than zero, it can usually be engineered to approach one.
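
As a rough illustration of the claim above: if each attempt succeeded independently with probability p, the chance of at least one success in n attempts would be 1 - (1 - p)^n, which approaches one as n grows. The sketch below assumes that independence (real model samples are correlated, so treat the numbers as optimistic) and assumes some checker exists that can recognise a correct answer. The function names are made up for illustration and are not taken from AlphaCode 2.

```python
import math


def at_least_one_success(p: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumption: every try succeeds independently with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest number of independent tries whose combined success chance
    reaches `target`, under the same independence assumption."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be greater than 0 and at most 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if p == 1.0:
        return 1
    # Solve 1 - (1 - p)^n >= target for n; log1p keeps precision for tiny p.
    return math.ceil(math.log1p(-target) / math.log1p(-p))


if __name__ == "__main__":
    # A one-in-a-million success rate still needs millions of tries
    # (and a reliable way to spot the winner) to reach 99%.
    print(attempts_needed(1e-6, 0.99))
    print(at_least_one_success(0.5, 2))
```

The arithmetic only covers how often a correct answer appears somewhere in the pile; picking it out still depends on whatever filter or test suite does the checking.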

  6. abend0c4

    A lot more remains to be done

    I imagine they're still working on how to make the code detect it's running in a Google environment and therefore limit its operational life to 18 months.

  7. shazapont

    The brute-force Monkey Method…

    “AlphaCode 2 also operates very differently from biological programmers. Given a problem, it generates about a million different code samples, which are then filtered down.”

    …so that’ll be the brute-force monkey method?

    That isn’t to say it’s a bad idea, but clarity of approach and results shouldn’t be made at the expense of marketing and competitive business requirements, surely. Wool, eyes, pulling over, etc…

    I prefer to know what they’re selling.

    — Shazelle DuPont —
