IntelliJ IDEA plugin catches lazy copy-pasted Java source

Boffins affiliated with dev tools biz JetBrains and HSE University in Moscow have devised an open-source plugin for the company's Java development editor that guards against copy-and-paste coding. AntiCopyPaster, available on GitHub, works with IntelliJ IDEA, JetBrains' integrated development environment (IDE) for Java …

  1. A Non e-mouse Silver badge
    Windows

    Improvement

    Here's a suggestion for an improvement - and it's simpler too:

    Has the code been copied from Stackoverflow? If so, mark it as a security issue.

  2. Joe W Silver badge

    "mark it as a security issue"? Refuse to paste it

    (though we all have been guilty of this...)

  3. _LC_ Silver badge
    Unhappy

    Contradictions

    Why does it feel like someone here is always stumbling on the same contradiction: on the one hand, you want compliant, inexpensive dogs that will follow along obediently with everything. On the other hand, these stupid dogs just don't get it done.

  4. tiggity Silver badge

    False positives?

    If lots of "work" is being done in a method, then there is a huge amount of scope to code it in a large variety of ways.

    If a method is doing a well-defined, relatively small piece of "work" (as it hopefully should be), then there are far fewer (reasonable) ways to code the functionality, and many of those ways will look like variations on a theme to a human observer.

    So, does this tool ignore small and simple methods and only look at bloaty code? If it looks at small methods, it could erroneously report a false positive purely because, when there's only a small number of ways to do something, your code may be very similar to someone else's code (the coding version of convergent evolution).

    1. Cederic Silver badge

      Re: False positives?

      The description seems to suggest it's not looking at external code at all.

      It's looking at code you paste in, and comparing it to other code within your code base. Where it finds a match, it suggests that instead of copy/pasting the code multiple times you apply software engineering basics and create a callable method.

      There are occasions on which that won't be appropriate but anybody capable of understanding those will also welcome the IDE refactoring support to prevent the same code being replicated multiple times.
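For illustration only, here is a minimal sketch of the general idea described above: take a pasted snippet and look for the same run of lines elsewhere in the code base. It is not how AntiCopyPaster itself works. The class name `PasteCheck`, the helper names, and the whitespace-only normalisation are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of AntiCopyPaster or IntelliJ IDEA.
class PasteCheck {

    /** Trims each line, collapses runs of whitespace, and drops blank lines. */
    static List<String> normalize(String code) {
        List<String> lines = new ArrayList<>();
        for (String line : code.split("\n")) {
            String cleaned = line.trim().replaceAll("\\s+", " ");
            if (!cleaned.isEmpty()) {
                lines.add(cleaned);
            }
        }
        return lines;
    }

    /**
     * Returns the 0-based index, within the normalized existing code, at which
     * the normalized pasted lines first occur, or -1 if they do not occur.
     */
    static int findDuplicate(String existing, String pasted) {
        List<String> haystack = normalize(existing);
        List<String> needle = normalize(pasted);
        if (needle.isEmpty()) {
            return -1; // nothing meaningful was pasted
        }
        return Collections.indexOfSubList(haystack, needle);
    }

    public static void main(String[] args) {
        String existing = "int a = 1;\nint total = a  +  b;\nSystem.out.println(total);\nreturn total;\n";
        String pasted = "  int total = a + b;\n  System.out.println(total);\n";
        int index = findDuplicate(existing, pasted);
        if (index >= 0) {
            System.out.println("Pasted code already exists; consider extracting a method.");
        }
    }
}
```

An exact-match check like this only catches copies that differ in whitespace. A renamed variable would slip through, which is one reason a real tool needs something smarter.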

      1. A Non e-mouse Silver badge

        Re: False positives?

        > It's looking at code you paste in, and comparing it to other code within your code base. Where it finds a match, it suggests that instead of copy/pasting the code multiple times you apply software engineering basics and create a callable method.

        I thought JetBrains' tools already had smarts to highlight potentially duplicate code? (I'm sure I've seen a suggestion along similar lines in some of my projects already)

  5. Zippy´s Sausage Factory

    TurnItInBot for code? Hmm...

  6. Ignazio

    "Hi I have a hammer, this problem seems like a nail to me"

    There are tools to find duplicate code already; extension to verify sections are not part of copyrighted code? Nice.

    Get in the way of a developer doing their thing? Such as copy a bunch of lines where one word needs changing, or something else of that kind? I'm already reaching for the uninstall button. Eject the damn thing already.

    Company requires me to use it? CV polishing time.

  7. HildyJ Silver badge
    Linux

    Turnabout?

    An interesting idea, although, as others have pointed out, there might be problems in implementation.

    That said, I wonder if it could be used to find out specifically who's using my code?

    I'm not sure if discovering that nobody's using it would be good or bad.

  8. Sceptic Tank
    Big Brother

    Did they say how much Stack Overflow code they, themselves, copied while writing that tool? And I sometimes copy my own code from an unfinished artwork to a new idea*. Is that also frowned upon?

    * My personal projects normally end right after the interesting stuff was done but before it can ship as the next ground shaking killer app.

    1. Brewster's Angle Grinder Silver badge

      They pay us to do all that boring shit. The fun stuff we do for free.

  9. Howard Sway Silver badge

    Does nobody do code reviews any more?

    How about making the senior developer on a project responsible for code quality, like they should be?

    When I did this, during my couple of decades in corporate programming, I was able to review the code regularly, keep the design and code quality in good shape by discussing these things with the other developers, providing encouragement and explaining why things needed changing, insisting on useful and usable documentation, and generally keeping some sort of ethos of producing good quality work going.

    Relegate the value of experienced people who can do this down so much that you end up using an IDE plugin and other 'AI' gimmicks in the hope of getting decent code, and you deserve the costly and tedious maintenance nightmare you will inevitably end up with.

    1. msobkow Silver badge

      Re: Does nobody do code reviews any more?

      Yes, let's tie up a half dozen people in a meeting to discuss somebody's code style and nit-pick about their capitalization while pretending we're doing something useful.

      I have never seen a code review that delved into architecture and algorithms; all they did was review the compliance to beancounter rules about code formatting and the like. :(

      1. a_yank_lurker Silver badge

        Re: Does nobody do code reviews any more?

        Where I work, code reviews actually catch coding errors. While we have internal conventions, they are not used as an end-all to compare to.

        1. Falmari Silver badge

          Re: Does nobody do code reviews any more?

          @a_yank_lurker “Where I work code reviews actually catch coding errors.”

          The same where I work. But it is more than that, it is an exchange of knowledge and ideas.

          Where I work every code change must be code reviewed no matter how small. When a developer needs their code reviewed, they just ask the other developers on the team for a code review which normally requires just a single reviewer.

          Unless the code is very simple the review will be a walkthrough to catch coding errors. Funnily more often than not, it is the coder that spots errors first. Nothing like explaining your code to focus your mind.

          But catching errors is not all that happens. Put two coders together with some code and they will talk about it, question how it works, suggest other ways of doing it. It’s just the nature of coders.

          Often when my code is reviewed, I get questioned on why I did this or that in a particular way, with the reviewer suggesting (in their view) a better way. Sometimes I learn a better way, sometimes it is the reviewer. The reverse is true when I am the reviewer.

          The point of a code review is not to find fault but to produce good code. If both the coder and the reviewer take that view, chances are they both may learn something.

      2. TheRealRoland
        Big Brother

        Re: Does nobody do code reviews any more?

        Well, there's your problem...

        To state that code reviews are not useful because 'we always get lost in minutiae', maybe ask the question 'is this the correct way to conduct code reviews'?

        And if the answer is 'Yes', run away...

      3. claimed

        Re: Does nobody do code reviews any more?

        Formatting is easy. You pick a formatter and run it prior to check in, as part of CI, whatever. If you can't customise it and desperately need that brace in a different place, write an add on or get over it. Everybody uses the same and no exceptions....

        Right... let's talk about the logic in the actual code.

      4. Someone Else Silver badge
        Boffin

        @msobkow -- Re: Does nobody do code reviews any more?

        > I have never seen a code review that delved into architecture and algorithms; all they did was review the compliance to beancounter rules about code formatting and the like. :(

        Sounds like you need to be introduced to better reviewers (or perhaps become one yourself?)...

  10. Tom 7 Silver badge

    While I'm all for proper code attribution

    I have a collection of code designed to create code for various purposes for me. I normally cut and paste this into various IDEs. I wonder if I can just create files from it and import them?

  11. captain veg Silver badge

    copypasta

    Is that the new spaghetti code?

    -A.

    1. bombastic bob Silver badge
      Coat

      Re: copypasta

      Spaghetti code yes. The 'sauce' code was written on a computer labeled 'Tomato' by a hacker doing the mushroom samba, while reading "The Onion": and adding garlic to taste. Yeah that was pretty cheesy. Now all your 'basil' are belong to us. THAT's using your noodle!

    2. spireite Silver badge
      Coat

      Re: copypasta

      If they want this to take off, will they run an advertising Campanelle?

      Not clear if it's only Java or other Linguines are supported

    3. Jim Birch

      Re: copypasta

      Antipasto would have been a better name.

  12. Simian Surprise

    > ... AntiCopyPaster will run the snippet through its onboard Gradient Boosting Classifier model to check whether it's a suitable candidate for refactoring (revision) using IntelliJ IDEA's built-in Extract Method.

    They do know that this doesn't stop it from being a "derivative work", right? The licensing concerns are all still there.

    If they really wanted to do some fancy analysis they should go and check for licensing issues and then block/complain because of *that*.

  13. Filippo Silver badge

    I'm a bit confused. From reading the article, it appears that the tool is attempting to deal with several issues.

    Copying code from StackOverflow without checking whether it works properly is a problem.

    Copying code from StackOverflow without checking the license is another problem entirely.

    Copying code from other bits of your project without bothering to abstract it into a function is yet another problem entirely.

    These issues actually have very little in common. Different reasons why you do it, different reasons why it may actually be valid in your case, different solutions when it isn't. I'm skeptical that the same tool can apply to all of those.

    1. bombastic bob Silver badge
      Meh

      Sometimes copypasta makes sense when you're developing, but not in production. It might be nicer to simply have the tool add appropriate code comments when it finds things, like

      "// TODO: refactor, duplicate code file1.java:342 file2.java:189"

      or something similar.
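As a rough sketch of that suggestion, a tool could build such a marker comment from the two locations it has matched. The class `TodoNote` and the method `duplicateTodo` are hypothetical names for this example and are not taken from the plugin.

```java
// Hypothetical helper, not part of AntiCopyPaster: formats a marker comment
// that points at two locations holding the same code.
class TodoNote {

    /** Builds a single-line TODO comment naming both duplicate locations. */
    static String duplicateTodo(String fileA, int lineA, String fileB, int lineB) {
        return String.format("// TODO: refactor, duplicate code %s:%d %s:%d",
                fileA, lineA, fileB, lineB);
    }

    public static void main(String[] args) {
        System.out.println(duplicateTodo("file1.java", 342, "file2.java", 189));
    }
}
```

A comment of this kind leaves the decision to the developer instead of interrupting the paste, which seems to be the point of the suggestion.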

    2. diodesign (Written by Reg staff) Silver badge

      Plugin's goal

      Actually, the plugin is pretty simple: it checks to see if there is cut'n'pasted code in a file from other parts of the project (or maybe even just the same file).

      If that happens, it's generally a sign of poor programming, so it may suggest you refactor (try again). I've tweaked the headlines to reflect this.

      C.

      1. Filippo Silver badge

        Re: Plugin's goal

        Thanks, that's much clearer, and the plugin now makes much more sense!

  14. _LC_ Silver badge
    Pirate

    Can we adopt this plugin for journalism?

    This would be a wipe-out event. ;-{

  15. Anonymous Coward
    Anonymous Coward

    Oh, yeah, let’s have code written in Moscow monitor our code

    What could go wrong?


Biting the hand that feeds IT © 1998–2022