If you've mastered Python 101, you're probably better at programming than OpenAI's prototype Codex

OpenAI warned that its Codex neural network, like the one that powers GitHub’s code-completion tool Copilot, is likely to generate source that looks plausible but is incorrect, and its performance will decrease as it grows in size. The artificial intelligence lab revealed the shortcomings and limitations of non-production …

  1. Greybearded old scrote Silver badge

    Yeah

    We all know how that goes.

    1. nautica Silver badge
      Happy

      Re: Yeah

      I found only a few missed-steaks in the referenced document.

      The most egregious miss stake was using the proper "it's" instead of "its".

      And not using "mai" instead of "my". And...

      Overhaul, a very good piece of work; I've saved it for future reference.

      It should carry with it a caveat, however: if you're a millennial, or member of the Gen-X/Y/Z class of blockheads, you won't find one thing wrong.

      [the spell-checker here flags "mai" as a spelling error. It is absolutely wrong, Register]

      1. Flocke Kroes Silver badge

        Re: Yeah

        "Owed to a Spell Checker" and variations have been all over the web and emailed back and forth a very long time. The oldest I found with a quick search was 1997 and even that references previous versions from long since vanished cobwebs. The only reason gen X/why?/Z... would not be aware of it is if they do not spend much time on the internet.

        I am not convinced theregister has a spell checker (my browser settings would block it if there is one). You are almost certainly seeing the spell checker from your browser. You may be able to change the dictionary that it uses but as mai is not in british-english-insane that will not help you. (Do knot yews -insane four spell cheque king unless ewe wan tar Pullet Surprise).

        Wiktionary has entries for mai in many languages but it looks like the closest it comes to English is in Otaku dialect.

  2. Pascal Monett Silver badge
    Facepalm

    12 billion parameters, 159GB of Python source code scraped from [..] 50 million public repositories

    So, you created a Frankenstein monster of cobbled-together code requiring an AI to configure it, and you're surprised that it's not good?

    12 billion parameters. That in itself amply demonstrates that there isn't a company in the world that has enough resources to configure this thing.

    Erase it, start over. Use humans to evaluate the code. Yes, it's more expensive, but it works better.

    We are not at the point where computers can code software for computers.

    1. This post has been deleted by its author

    2. Anonymous Coward
      Terminator

      Re: 12 billion parameters, 159GB of Python source code

      >We are not at the point where computers can code software for computers.

      Computer programs that can write software effectively have been around for many years, but you have to use the right algorithm.

      A neural network isn't the right algorithm, at least on its own, because all it really does is pattern matching.

      There are plenty of very effective techniques, such as genetic programming, although they are much harder to design and use and are very problem specific.

      1. Il'Geller

        Re: 12 billion parameters, 159GB of Python source code

        The idea is not to write code, but to find the needed piece of code based on its textual description. This is what OpenAI does — textual search.

        1. Anonymous Coward
          Trollface

          Re: 12 billion parameters, 159GB of Python source code

          The paper claims that Codex writes code. e.g. The first line of the abstract:

          We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.

          1. Il'Geller

            Re: 12 billion parameters, 159GB of Python source code

            Playing with words: OpenAI is not really able to write a program; it only finds the right pieces of code based on their textual descriptions and combines them in order. Indeed, how can a company famous for writing meaningless texts suddenly write reasonable code? Where did the ability to write a logically accurate program suddenly come from? Mystery...

            OpenAI searches for textual descriptions.

    3. Jimmy2Cows Silver badge
      Facepalm

      Re: public repositories mis-parse

      I initially scanned that sentence as public lavatories. Apt, given the output quality.

  3. katrinab Silver badge
    Paris Hilton

    I could write something that searches StackExchange for the answer, and pastes in the first link it finds.

    That would probably have a better success rate than this thing.

    1. Someone Else Silver badge

      Pretty low bar there...

  4. 502 bad gateway
    Terminator

    Your P45 is ready at reception, byeeee

    Funnily, they swerved the ultimate goal for this tech: reducing the number of actual programmers required by unspecified organisations. That goal appears way off, but that doesn’t mean it’ll never happen.

    It’s curious that they’re training with a high-level abstraction like Python (created for humans). Wouldn’t it be better to train with compiled machine code? There wouldn’t be obscure syntactical elements to get in the way. I’m probably missing something.

    1. Flocke Kroes Silver badge

      Re: Your P45 is ready at reception, byeeee

      There is a really old solution to reducing the number of programmers required to produce good software.

    2. elaar

      Re: Your P45 is ready at reception, byeeee

      I agree. Doesn't it seem counter-intuitive to get a machine to read a problem stated in a higher-level language (designed for humans, and to handle memory management issues created by humans) and solve it in a higher-level language, which is then compiled to machine code?

      I know this particular AI isn't designed to do that, but surely the best AI here would take a problem and create raw assembly or at least C?

  5. James Loughner
    Holmes

    Garbage in, garbage out

    Enough said

  6. Jimmy2Cows Silver badge
    Boffin

    So it's useless, then.

  7. Anonymous Coward
    Anonymous Coward

    > To prevent harm in the real world, GitHub Copilot comes with filters that automatically block offensive words so it can’t spit out toxic outputs.

    Can we not imagine code doing anything more harmful than this? What about the millions of systems out there that would actually cause real harm if they had bugs?

  8. redpawn

    Shhh. Don't Tell!

    Microsoft is already using this in production.

  9. martinusher Silver badge

    A small snag.....

    My wife of many years has a bit of a love/hate relationship with computers (mostly hate, actually). We've had them at home since about 1980 and the novelty quickly wore off when she realized that she had to adapt her way of working to the machine rather than the other way around. Because she, as a teacher, knows about "computers" she got tapped to go on innumerable courses to learn all sorts of computery things, whatever people thought was what you needed to teach kids so they would be ready for the Bright New Age that was dawning. Despite all this immersion, though, she confided to me that while she knew the standard programming moves -- statements and control functions in the language du jour -- she never quite figured out how to program, how to marshal a problem so it could be implemented on a computer. Since this is something I've been doing forever it wasn't easy to explain, either -- it's been described as a "knack", it's one of those things that is either blindingly obvious or completely opaque. (It doesn't help that I find a lot of program code confusing and opaque myself -- sure, it's syntactically correct, at least the compiler thought so, but the logic's convoluted, the data's disorganized and there are too many wrappers on wrappers.)

    So there you have it. Any experienced programmer will tell you that the coding bit is easy. It's knowing what to code that will get you. (A lot of programming may be copy and paste but it will only get you so far.)

    1. elaar

      Re: A small snag.....

      I think cut and paste gets you a long way these days. There's a library for everything and very little has to be done, and you have almost unlimited amounts of memory to waste.

      I find the only area where decent programming is still required is embedded stuff: Arm/PIC/STM etc., where you actually need to know what you're doing.

      1. sev.monster Silver badge
        Boffin

        Re: A small snag.....

        Absolutely not. One look at the disgusting state of IoT devices coded by malignant monkeys to the tune of 10,000 typewriters and you will see that the choice of arena has nothing to do with it. I, a hardware buffoon, could go out and buy any popular SoC and code it to talk on GPIO and do whatever I need it to, and the code required to do so can be in whatever language or paradigm I like. Sure, but what about more deeply embedded projects that don't use off-the-shelf SoCs? In my experience, shitty industrial products come to mind: the hardware costs you tens of thousands of dollars, yet the software still ends up sucking massive donkey foot, innit? Stepping away from the embedded world exclusively, what about development team size and resources? Surely the more devs, the better the product? No, because now look at how many millions in funding are poured into various government projects and see how many of them turn out well. Remember the healthcare.gov scandal where the site didn't even work properly for weeks after release?

        Just because you CAN download ten Node.js libraries and you CAN easily staple together a disgustingly hacky, bulky single-page webapp that takes 10 years and 500MB to load on a smartphone doesn't mean you SHOULD. Just because you CAN outsource the development of your code to a cheap software house that has the collective knowledge of a three-year-old and a piece of moldy toast doesn't mean you SHOULD.

        Unfortunately, so much of modern society, both consumers and developers, has been inundated with and desensitized to this way of thinking.
