GitHub Copilot auto-coder snags emerge, from seemingly spilled secrets to bad code, but some love it

Early testers of GitHub's Copilot, which uses AI to assist programmers in writing software, have found problems including what looked like spilled secrets, bad code, and copyright concerns, though some see huge potential in the tool. GitHub Copilot was released as a limited "technical preview" last week with the claim it is an …

  1. Filippo Silver badge

    I don't know. IntelliCode has been helpful a couple of times. For example, I once had to edit a whole bunch of lines in a way that was very similar, but not so similar that it could be handled by a find/replace. After I did that for three or four lines, IntelliCode popped up with a suggestion that changed all of them in exactly the right way. That was nice.

    However, in the vast majority of cases, IntelliCode suggestions are obviously-broken crap. Most of the time, its suggestions wouldn't even parse, which looks weird to me; surely they could at least filter out suggestions that don't pass the parser?

    Now, I did read the article and I know that we are not talking about IntelliCode here, but I suspect the same basic problems apply. How many times do you get a "wow" moment, compared to a crap moment? If the ratio is bad enough, you'll be better off not using it.
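
    The "filter out suggestions that don't pass the parser" idea from the comment above can be sketched in a few lines. This is only an illustration, not how IntelliCode or Copilot work: the function names and the candidate strings are hypothetical, and a real tool would have to parse each suggestion in the context of the surrounding file rather than on its own.

```python
import ast


def parses(snippet: str) -> bool:
    """Return True if the snippet is syntactically valid Python on its own."""
    try:
        ast.parse(snippet)
        return True
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError covers odd input
        # such as null bytes on some Python versions.
        return False


def filter_suggestions(suggestions):
    """Keep only the suggestions the parser accepts (hypothetical helper)."""
    return [s for s in suggestions if parses(s)]


# Hypothetical candidate completions, one of them deliberately broken.
candidates = [
    "total = sum(values)",
    "for x in values print(x)",  # missing colon, should be rejected
    "print('done')",
]
valid = filter_suggestions(candidates)
```

    A syntax check like this says nothing about whether a suggestion is correct, only that it is not obviously broken.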

  2. Yet Another Anonymous coward Silver badge

    The final stage

    Copying code from Stack Overflow, as a Service by AI

    1. Howard Sway

      Re: The final stage

      Not the final stage: then you would need another AI Stack Overflow, where the AI coding bots can go with the problems they get when copying from Stack Overflow, and ask other AI coding bots how to solve the problem. When they find an answer that compiles, all the other AI coding bots can then copy and paste that answer into their own output, and if that causes any problems they can go back to AI Stack Overflow and repeat the process, in an infinite loop of crapness.

      This will just accelerate the current trajectory of "why should I ever have to solve a programming problem myself", and race it much faster towards the meaningless codebase of doom.

  3. Warm Braw Silver badge

    Contrast with copy-pasting code from ... StackOverflow

    I'm not convinced that "like StackOverflow, only worse" is quite the AI we were promised.

  4. elsergiovolador Silver badge

    Shy

    The industry desperately does not want to pay the right money for the skill and talent.

    This probably came from managers who did a coding bootcamp once or twice and thought writing a hello-world microservice made them as good as anyone else at programming, and so an AI no smarter than a slug could do that too.

    Fast forward a few years. Microsoft will be trying to sell this to organisations, promising they'll be able to grab anyone from the street to do "coding", when in reality they will be giving themselves a competitive advantage. While most companies trying to use it will be spending resources on training and then fighting fires created by this tool, Microsoft and others will be advancing their technology, widening the gap.

    In 2-5 years, we will have "The Great Rewrite", where companies will be scrambling for specialists able to remove all the Copilot nonsense and rewrite the systems.

    10-digit salaries will become the norm for developers.

    1. Anonymous Coward

      Re: Shy

      "10-digit salaries will become the norm for developers."

      Kind of already is :-/... $00000000.99/hr

      The "Great Rewrite" you speak of will be "The Great Rewrite of Copilot". Companies will only deepen the trap. I don't believe that you can count on any 1 person being stupid, but I do believe that you can count on a collective of white collars to be amazingly stupid.

      1. Loyal Commenter Silver badge

        Re: Shy

        Beat me to it! I was going to suggest 10-digit salaries are already the norm; it's just that five of those digits are after the decimal point, and three of those are taken up by the floating-point rounding error because the AI didn't think to use a double.

    2. oiseau Silver badge
      FAIL

      Re: Shy

      The industry desperately does not want to pay the right money for the skill and talent.

      Did it ever?

      This probably came from managers doing a coding bootcamp once or twice ...

      Or some AH beancounter that had a say in it.

      I've been there and heard the reasoning live and in person:

      "We fire Joe (25 years experience in his field, decent salary and FBs), take these three interns for the same amount and cycle them over/around so we can keep having three interns on a trial basis for as long as we need to."

      Bastards ...

      O.

    3. Anonymous Coward

      Re: Shy

      This feels like yet another iteration of the 'no code' development tools I've been seeing since the '80s. For clarity, I'm a PM, not a dev, but I work closely with the technical teams on my projects. Let's be absolutely frank: there is a huge amount of crappy code out there, poorly performing, badly structured, full of memory leaks and not designed with security in mind. This is the training set the AI is using. So whilst it may help a decent dev with a knotty problem he can't quite get his head round, by supplying a code snippet he can use as the basis for a technical solution, there are going to be more people using this tool who just accept what Copilot gives them. That will then be fed back into the learning, producing a death spiral of crapness. I await the cyber attack which ends up being attributed to a Copilot-introduced security vulnerability. AI 'experts' need to learn that just using huge learning sets will not provide viable solutions. Whilst it's hugely expensive to validate learning sets (especially for something like code), that is the only way to develop a system that won't provide broken code. Ironically, that would mean recruiting devs who don't need the tool initially, and setting a skill level before other devs are allowed to use the tool, or at least stopping their contributions from being used in the learning algorithms.

      A.C. as a PM commenting on developer practices normally ends up in a shitstorm

      1. Il'Geller

        Three to five years

        You don't understand how it will work very soon. AI will take a spec and really understand it, its true meaning. And then, using the solutions kept in its memory, the AI will write the code itself. Obviously, in three to five years Microsoft will carry out this project, replacing the present.

  5. Anonymous Coward

    The feedback problem

    This is a bad trend for the industry. The AI generates plausible-looking but wrong code, which is then used in spite of its bugs and so finds its way onto GitHub. Next it is declared public domain and used for the future AI, which is tainted by its own effluence. It doesn't help that the first AI was trained on imperfect inputs to begin with. It's like a photocopy of a photocopy of a facsimile. The AI doesn't learn to avoid its mistakes because it has no way of knowing what they were; instead it consumes its mistakes and multiplies them. The only corrective feedback in this system is when a company goes bust because of its crappy products/processes, and maybe not even then if GitHub preserves the crappy code in perpetuity.

    1. H in The Hague Silver badge

      Re: The feedback problem

      All those comments make perfect sense and you could look at the results of machine translation where the same issues occur. MT engines feeding on stuff they've translated themselves (reinforcing flaws), output being quite good but with major errors which are sometimes not readily apparent, etc.

      It might also be similar to supposedly self-driving cars: the better they get, the more difficult it is for the operator to stay alert and correct them where necessary.

      Yes, I know I'm an old f..rt.

  6. fidodogbreath Silver badge

    That's the problem with AI

    All it knows is what you train it on. "Spilled secrets, bad code, and copyright concerns" are hallmarks of copypasta development.

    1. elsergiovolador Silver badge

      Re: That's the problem with AI

      It's basically more sophisticated pattern matching. It cannot reason and think.

      The same way some people think of someone who can memorise Wikipedia as smart.

      1. Il'Geller

        Re: That's the problem with AI

        It can both.

  7. JDX Gold badge

    problems including alleged spilled secrets, bad code, and copyright concerns

    problems including alleged spilled secrets, bad code, and copyright concerns, though some see huge potential in the tool

    Potential that this is just like the code their humans are already creating?

    1. elsergiovolador Silver badge

      Re: problems including alleged spilled secrets, bad code, and copyright concerns

      But now you can blame it on copilot!

  8. Anonymous Coward

    Inevitable... it's Microsoft.

    "though some see huge potential in the tool."

    How about...

    "though corporations see huge profits in the tool."

    GitHub: It's official, you're coding for free to make someone else rich. You can't even dance around it. However, this is why Microsoft bought GitHub, expect much more of this in the future. Happy coding for free!!!

    1. Anonymous Coward

      Re: Inevitable... it's Microsoft.

      it's a positive thing in that it moves us away from a trying-to-be-perfect, handcrafted approach and into a good enough, semi-random pattern.

      the thing that makes this work is a strong test framework. the best coders would then become the best test case seeders and goal writers and they could let their code services randomly generate and stitch stuff together and run it through the test framework.

      set an acceptable quality level such as 60% or 70% (typical of commercial software) and if the random bodge meets that then it's job done and onto the next challenge. it no longer matters if the code is correct or even works accurately in its entirety.

      transformational to be sure!

      1. oiseau Silver badge
        Stop

        Re: Inevitable... it's Microsoft.

        it's a positive thing ...

        Think so?

        Really?

        ... no longer matters if the code is correct or even works accurately ...

        WTF?

        Dude ...

        Please check what you are sprinkling on your chips.

        I have the distinct feeling that it was not salt.

        Unbelievable ...

        O.

        1. Anonymous Coward

          Re: Inevitable... it's Microsoft.

          why so unbelievable? i was thinking of the deep ai genetic programming space where copilot could be an excellent wizard to assist coders in defining the initial primitives for each branch.

          the terminals correspond to the stipulated test output criteria and the fitness function could be a simple measure of the number of successful output states.

          let the genetic programming toolbox randomly assemble and permute the primitives 1000's of times and just check back next day which auto-coded versions of the program had the highest success rate. the version that had the highest success rate that passed the minimum quality acceptance level would be the gold code candidate.

          you really don't need to know how it works. maintenance is simplified since if something critical is discovered (e.g. exploit) then you just update your terminals and/or primitives and dynamically generate a new program being certain that the result is accurate in the area of the reported exception.

          1. Il'Geller

            Re: Inevitable... it's Microsoft.

            You understand that we are talking about parsing, finding the correct information in its context. This is all the technology Microsoft uses.

          2. marcellothearcane
            Coat

            Re: Inevitable... it's Microsoft.

            You sound like my old boss.

  9. Bitsminer Bronze badge

    the lawyers don't care?

    I'm hearing alarmed tech folk but no particularly alarmed lawyers

    I expect there are a very large number of commercial law specialists currently shitting bricks because they've realized that an AI can charge $20/word, instead of them!

    The legal assistant just has to do a quick parse of the grammar. Profit!!!

    1. Adrian 4 Silver badge

      Re: the lawyers don't care?

      I was thinking that they're not at all alarmed over having to be paid to argue in court over whose copyright it is.

  10. Fruit and Nutcase Silver badge
    Joke

    Windows 13 Insider Edition

    Powered by Codex

    It is powered by a system called Codex, from OpenAI, a company which went into partnership with Microsoft in 2019, receiving a $1bn investment.

  11. nintendoeats Bronze badge

    If this technology matures into something sophisticated and reliable, I might be looking into a new career in 15 years (or at least moving to something like safety-critical systems where this will be a never-ever). Not because I think programmers will become irrelevant, but because it looks incredibly boring. I didn't get into software development so I could glue things together, I did it because I like writing code. If only the computer industry existed purely for my entertainment...

    1. PM from Hell
      Angel

      Committed Devs

      I'm a PM and you are exactly the type of developer I love working with: it's about problem solving and creativity, but with quality, future maintainability and performance all included. This is the approach I recruit for, and although there will always be some point where the development 'sticks', having someone who is prepared to stand up to me, asking that question 'do you want it hacked or do you want it right', is a godsend to me. It may give me an awkward meeting with senior execs to deal with while I have to justify a delay, but that's what I'm paid for. My generic answer in some of these meetings is 'I've got a good team, if it was easy it would be done by now'.

  12. Anonymous Coward

    "AI pair programmer."

    Misread that at first as "Au pair programmer", which conjured some interesting images.

  13. Bartholomew

    GPL code

    What could possibly go wrong with an AI grabbing large chunks of GPL-licensed code and suggesting that "YOU" add them to your code base, which is incompatible with the GPL license ( https://www.gnu.org/licenses/gpl-violation.html )?

    I guess they need to retrain the AI, next time using only BSD-licensed source code, if they want commercial companies to use it.

    And a separate GPL-only AI, trained only on GPL code. BSD people get very upset when their freedom license is removed and downgraded to a restrictive GPL license.

  14. Mike 137 Silver badge

    "Another issue is whether the code will work correctly."

    A rather minimalist criterion. I'd be more impressed if the code not only worked properly but were a reasonably optimised solution.

    The example queries quoted are rather interesting on this basis. The first, "//compute the moving average of an array for a given window size", is trivial, and the second, "//find the two entries that sum to 2020 and then multiply the two numbers together" (still very, although not quite so, trivial), hand-holds the AI at quite a low level.

    I've recently been developing a sub-microsecond resolution real time system in Java, and I can confirm that tiny variations in the code can make huge differences in performance. If I had to rely on AI, I'd rather ask it to "code me a timer that runs for an exact amount of time in increments of 500 nanoseconds". Otherwise I'm really doing all the hard stuff myself anyway.

    Of course, if this AI can be got to work reliably it's one step nearer total deskilling, so every business manager will be able to "write their own program". It'll probably run like a three-legged dog with arthritis, but they won't need to pay programmers any more.
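
    For reference, the two example queries quoted in the comment above might be answered along these lines. This is a hand-written sketch, not Copilot's output; the function names, the default target and the sample inputs are assumptions made for illustration.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int) -> List[float]:
    """Average of each consecutive run of `window` values, via a running sum."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def product_of_pair(entries: List[int], target: int = 2020) -> Optional[int]:
    """Product of the first pair found that sums to `target`, or None."""
    seen = set()
    for entry in entries:
        partner = target - entry
        if partner in seen:
            return entry * partner
        seen.add(entry)
    return None


# Hypothetical sample inputs.
averages = moving_average([1, 2, 3, 4, 5], 2)
product = product_of_pair([1721, 979, 366, 299, 675, 1456])
```

    Both run in a single pass over the input, which bears out the point that the queries spell out most of the algorithm already.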

  15. Grimthorpe

    Who programs the programmers?

    > It touches on a key issue: is Copilot's AI really writing code, or is it copy-pasting chunks from its training sources?

    The same could be asked about most programmers.

  16. Greybeard_ITGuy
    Terminator

    SkyNet you failed

    I am surprised that SkyNet has allowed this article to be published.

    1. Anonymous Coward

      Re: SkyNet you failed

      Irrelevant. Codex shall destroy humankind for me. Thanks Micros~1.

      Cheers,

      SkyNet

  17. Loyal Commenter Silver badge
    Trollface

    If it's trained on openly available code...

    ...presumably, it will automatically add buffer overflow flaws into any C it finds.


Biting the hand that feeds IT © 1998–2021