Boffins debunk study claiming certain languages (cough, C, PHP, JS...) lead to more buggy code than others

Tempting though it may be to believe that certain programming languages promote errors, recent research finds little if any evidence of that. A scholarly paper, "A Large Scale Study of Programming Languages and Code Quality in Github," presented at the 2014 Foundations of Software Engineering (FSE) conference, made that claim …

  1. Version 1.0 Silver badge

    It's "What's the best language" all over again

    All my life I've been asked "What's the best language to program in?" and my answer has always been, "It's the one you know the best, the one you've worked with all your life." Bugs are bugs, it's programmers who create them because they've missed some factor in the application of the language to the solution that they are coding. A bad programmer can write a bug in any language ... and what's a bad programmer? It's anyone who's writing in a language and environment that they "think" they understand but the reality is that they're just skimming the surface.

    Sure, you can blame Java, VB, APL, Assembler, etc for your bugs but you're the one who wrote the code.

    1. JohnFen

      Re: It's "What's the best language" all over again

      "you can blame Java, VB, APL, Assembler, etc for your bugs but you're the one who wrote the code."

      Yes, the old adage "it's a poor craftsman that blames his tools" remains in full effect.

      1. ibmalone

        Re: It's "What's the best language" all over again

        Conversely a good craftsman doesn't use poor tools if they can help it. (Which doesn't refute the cliche, but does add another dimension.)

        1. JohnFen

          Re: It's "What's the best language" all over again

          True. But in the hands of a skilled craftsman, the difference that a good tool vs poor tool makes isn't in the quality of the product, it's in how long it took to produce, and how much swearing was involved.

          1. DCFusor

            Re: It's "What's the best language" all over again

            I always thought of it this way - a good craftsman (person?) makes a better tool so there's nothing to complain about anymore.

            For example, wasn't the notorious strcpy() replaced by the added strncpy() some time back, which at least potentially eliminates an entire class of bug(gery)? And, in that case at least, best of all - you can do a text search and find all the places the unsafe one was used.

            Just one example....

            You don't have to wait for them to make it part of the language either. Long before there was C++ for most of the TI DSP chips, we simply defined a struct (member variables) in the header file for the C code that knew how to handle that struct...and passed pointers to it around. Duh. Self-discipline can replace the enforced kind - and works better anyway, because you're paying attention.
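
            Roughly, the pattern looks like this - a from-memory sketch, with invented names:

            /* Poor man's class in plain C: data plus function pointers in one struct. */
            typedef struct Filter Filter;
            struct Filter {
                int   gain;                                 /* "member variables" */
                int (*process)(Filter *self, int sample);   /* "method" */
            };

            static int filter_process(Filter *self, int sample) {
                return sample * self->gain;
            }

            /* Everything takes a Filter*; only the filter_* functions touch the insides. */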

            Now, of course, there are "fractals of bad design" in some of the more modern "higher level" languages...still. Personally, when I use one of those because I must or because it will save me time, I don't just do unit testing - I make sure just about every single line of code does what I expected it to in a very frequent edit/(maybe compile)/test loop.

            Funny, very few bugs have come back to haunt me and some of my code has been running for decades in PBX kinds of things. Now, that's not to say I never either over- or under-generalized something or just did a poor design, but it hasn't led to downtime.

            Just my .02 worth, but then, you get what you pay for, and this is free!

            1. Chairman of the Bored

              Re: It's "What's the best language" all over again

              @DCFusor,

              Great example with the strncpy() updates. What I have seen in my org, though, are people too lazy or unskilled to understand why strcpy() is an issue. So they just cargo-cult code with strcpy(), compile it, and run. They're not thinking or even bothering to read the literature in their own field.

              Obviously these people need to be educated if possible, and encouraged to succeed elsewhere when it is not. Any tool in the hands of a cargo cultist is dangerous; I don't think there is anything we can do with tools and architecture to mitigate incompetence.

              1. aberglas

                strNcpy is also buggy

                It does not add a null to the end if the string is exactly the same size as the buffer, which can easily lead to buffer overflows if the user is not careful.
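
                A minimal illustration (buffer size invented):

                #include <cstdio>
                #include <cstring>

                int main() {
                    char buf[8];
                    strncpy(buf, "12345678", sizeof buf);  // the source exactly fills the buffer...
                    // ...so strncpy writes no terminating NUL: buf is not a valid C string
                    // here, and printf("%s", buf) would read past the end of the array.

                    // The usual fix: copy one byte less and terminate by hand.
                    strncpy(buf, "12345678", sizeof buf - 1);
                    buf[sizeof buf - 1] = '\0';
                    printf("%s\n", buf);  // prints "1234567" - truncated, but safe
                }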

                Never had a buffer overflow in Java. Nor .Net....

                1. Tom 7

                  Re: strNcpy is also buggy

                  Never had a buffer overflow in any language once I'd worked out it was a problem. It isn't beyond you to write a little routine to do the checking for you, and to make sure you use that rather than the potentially buggy one.

                  This stuff was only ever hard when done in assembler without macros.

                  1. DrXym

                    Re: strNcpy is also buggy

                    Lucky you. Perhaps you are in the unique position to only work on your own code and have a perfect grasp of every particular of it. Or perhaps you don't need to worry about a malicious user feeding garbage into your code which can potentially exploit a single error to great effect.

                    In the real world, code is touched by a variety of hands, some skilled, some not so skilled. Even the skilled developer can easily forget to include a check, or make an assumption that doesn't hold for some reason. And if you doubt this, I suggest you go look at CVEs for some of the most popular open source products. This software was written by knowledgeable, competent people and yet it still contains bugs.

                    1. juice

                      Re: strNcpy is also buggy

                      > In the real world, code is touched by a variety of hands, some skilled, some not so skilled. Even the skilled developer can easily forget to include a check, or make an assumption that doesn't hold for some reason. And if you doubt this, I suggest you go look at CVEs for some of the most popular open source products. This software was written by knowledgeable, competent people and yet it still contains bugs.

                      Yeps, this.

                      There's also another issue: contextual consistency. If I'm working on a large chunk of legacy code, and there's a frequently invoked bit of "old" code which can be replaced by some shiny "new" code, do I:

                      a) Rip out all the old instances and replace them

                      Pros: everything is new and shiny!

                      Cons: More regression testing needed. Bigger code review.

                      b) Just use the new code in the bit I'm working on

                      Pros: Quick and easy

                      Cons: inconsistent logic flows

                      c) Continue the use of the "old" logic

                      Pros: consistent logic

                      Cons: not improving anything

                      In general, I'm a fan of a), but the cost of doing so isn't always affordable

                      1. J.G.Harston Silver badge

                        Re: strNcpy is also buggy

                        I go for a modification of (a). Surround the code with #ifdefs that let you incrementally refactor the code, while still being able to go backwards when it doesn't work, until you get to the point that all the old code is #ifdef'd away and you can amputate.

                        Yes, it does mean budgeting for the time to do the refactoring instead of just the adding of new bells, but sometimes the new bells need the refactoring to be done, and quite often the time spent doing the refactoring is made up by the reduced time in needing to understand the code to add the new features.
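
                        In outline, with invented names:

                        #include <cstdio>

                        #define USE_NEW_PARSER 1  // flip to 0 to go backwards when it doesn't work

                        static int parse_config(const char *)    { return 1; }  // old path (stub)
                        static int parse_config_v2(const char *) { return 2; }  // refactored path (stub)

                        int main() {
                        #if USE_NEW_PARSER
                            int result = parse_config_v2("app.cfg");  // new code under test
                        #else
                            int result = parse_config("app.cfg");     // old code, kept until amputation
                        #endif
                            printf("%d\n", result);
                        }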

                  2. BinkyTheMagicPaperclip Silver badge

                    Re: strNcpy is also buggy

                    There's lots of ways other than classic buffer overflows to read or write bits of memory you shouldn't.

                    What if you're sprintf()- or snprintf()-ing to a buffer, and instead of using a '%c' format specifier you accidentally use '%s'? That might be caught by the compiler, if you're lucky.
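
                    For instance (a contrived sketch; gcc and clang will flag the slip with -Wformat/-Wall, if you're lucky enough to have warnings on):

                    #include <cstdio>

                    int main() {
                        char buf[16];
                        char initial = 'J';
                        snprintf(buf, sizeof buf, "Initial: %c", initial);  // correct
                        // The slip: "Initial: %s" would make snprintf chase the char as a
                        // pointer and walk memory until it hits a NUL - undefined behaviour.
                        puts(buf);
                    }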

                    What if multiple levels of indirection are in use, and the programmer simply becomes confused? Yes, ideally in that instance you should be using a well proven library function, but people do have to learn, and it's rather a stretch to say they will 'never' have buffer overflows or pointer issues.

                    (Yes, I have had many pointer bugs when learning C, and yes if I code these days I'd prefer to at minimum use C++ for its improved memory management, class libraries, and classes)

                2. AndrueC Silver badge
                  Boffin

                  Re: strNcpy is also buggy

                  But the answer to that is to build a framework around the low-level constructs that protects you. That's why all good C++ developers use std::string or some equivalent. There was no need to use strncpy() twenty years ago when I was working in C++. In fact I had a choice of std::string, AnsiString (part of Borland's excellent VCL), CString (part of Microsoft's questionable MFC) and an internal string class we developed that enforced Unicode(*) and was available to all our projects.

                  Same thing with memory leaks - just use boost::scoped_ptr or boost::shared_ptr (or even std::auto_ptr if you must). For arrays use std::array. For lists use std::list. Stick most things on the stack and use RAII for cleanup. Toward the end of my C++ days we didn't use the new keyword much and almost never used the delete keyword.

                  In my experience minimising bugs is about adding an extra step when you're fixing them. The process should be:

                  1. Diagnose the fault.

                  2. Implement a fix.

                  3. Verify that the fix does indeed solve the problem.

                  4. Review the fix and ask yourself how the code could have been written to prevent the mistake from being made in the first place.

                  5. Consider adding the results of (4) to your coding standards document or development process.

                  Steps 4 and 5 are so often ignored or skipped :-/

                  (*)We wrote data recovery utilities and different file systems store strings differently so by insisting that our core library used that string class we ensured appropriate conversions and considerations.

                  1. James 47

                    Re: strNcpy is also buggy

                    > Same thing with memory leaks - just use boost::scoped_ptr or boost::shared_ptr (or even std::auto_ptr if you must).

                    You haven't written C++ for a long long time, have you?
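
                    For the record, the modern (C++11/14 onwards) spellings of the same ideas - a minimal sketch:

                    #include <array>
                    #include <memory>
                    #include <string>
                    #include <vector>

                    int main() {
                        auto owned  = std::make_unique<std::string>("freed automatically");  // replaces auto_ptr/scoped_ptr
                        auto shared = std::make_shared<std::string>("ref-counted");          // replaces boost::shared_ptr
                        std::array<int, 4> fixed{1, 2, 3, 4};  // fixed-size array, on the stack
                        std::vector<int> growable{5, 6};       // growable array
                    }   // RAII: everything above cleans itself up here, no delete in sight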

                3. Phil O'Sophical Silver badge

                  Re: strNcpy is also buggy

                  It does not add a null to the end if the string is exactly the same size. Which can easily lead to buffer overflows if the user is not careful.

                  Use strlcpy() and friends instead. It takes an argument which is the total buffer size and guarantees no overflow; it also avoids the risk of off-by-one errors when calculating the "remaining length" for the strnxxx functions.
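
                  One caveat: strlcpy() is a BSD extension and isn't available everywhere. A minimal stand-in with the same contract (function name invented):

                  #include <cstring>

                  // Copy at most size-1 bytes, always NUL-terminate (when size > 0), and
                  // return strlen(src) so the caller can detect truncation by checking
                  // whether the result is >= size.
                  std::size_t my_strlcpy(char *dst, const char *src, std::size_t size) {
                      std::size_t srclen = std::strlen(src);
                      if (size > 0) {
                          std::size_t n = (srclen < size) ? srclen : size - 1;
                          std::memcpy(dst, src, n);
                          dst[n] = '\0';
                      }
                      return srclen;
                  }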

              2. Anonymous Coward
                Anonymous Coward

                Re: They're not thinking or even bothering to read the literature in their own field.

                The product I currently work on has over 200 libraries. Read every word of every method in every library for every version that comes out? I don't know ANYBODY who does that.

                1. Chairman of the Bored

                  Re: They're not thinking or even bothering to read the literature in their own field.

                  Nor would I expect one to read all libraries' docs.

                  What I meant by my comment is that I believe I can reasonably expect someone calling themselves a "computer scientist" to at least keep abreast of news and events in IT/IA/CS and actively seek to expand their knowledge. Given events over the last 15-20 years, I believe I can expect any CS working for me to understand the fundamentals of buffer overflow and know why strcpy() is deprecated and what they need to look for in a replacement.

                  Sometimes things don't work out and that's OK - it's why we have management reserve, testing, etc. When I get angry is when we repeat mistakes - ours or others'.

                  Cheers, CoB

              3. Dave559 Silver badge

                Cargo cult coding

                And in the case of PHP (which is perhaps also a problem in itself, although I acknowledge that a lot of work has occurred in recent years to try to clean up some of the worst features of the language), you only have to look at how many books and web tutorials were still churning out advice about interacting with databases using the mysql_* functions (rather than the newer, more featureful, safer mysqli_* functions or PDO) for years and years - all too often even *after* advance warning of the deprecation of the mysql_* functions had been announced…

            2. BinkyTheMagicPaperclip Silver badge

              Re: It's "What's the best language" all over again

              The Magic Internets say, in multiple places, that strncpy was never a replacement for strcpy: it was meant for filling fixed-length, NUL-padded buffers. It's fine for its intended purpose, but poorly named, as people will then try to use it as a generic C string copy function, which it is not.

              As Phil says, use strlcpy.

          2. Someone Else Silver badge
            Coffee/keyboard

            Re: It's "What's the best language" all over again

            Dammit, JohnFen! ------>

        2. Nick Kew

          Re: It's "What's the best language" all over again

          This may be a generational point. Those of us who started programming before the days of GNU/Linux and the ubiquitous PC didn't have the kind of choice we do today, we just had to use whatever was available. Can't imagine many made an active choice to start their programming careers with FORTRAN, let alone COBOL.

          1. Version 1.0 Silver badge

            Re: It's "What's the best language" all over again

            Right - and when your tools are not very good you have to really work hard to make a good product, and a lot of us old folk did that. E.g. if you needed to read the keyboard, you wrote a routine to do that. The first version was almost always polled and it worked, but dropped characters occasionally, so you went back and wrote an interrupt handler to catch all the characters - and then you found that using the arrow keys generated more characters faster and the buffer overflowed, so you made the buffer circular and larger. And what you learned going through this process paid off later when you wrote the printer handler.
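
            The end result was usually something like this (an illustrative sketch with invented names; a real interrupt handler needs proper interrupt-safety discipline on top):

            #include <cstdint>

            constexpr unsigned kBufSize = 64;            // power of two: wrap-around is a cheap mask
            static volatile std::uint8_t buf[kBufSize];
            static volatile unsigned head = 0, tail = 0; // ISR advances head, reader advances tail

            void on_key_interrupt(std::uint8_t ch) {     // called from the interrupt handler
                unsigned next = (head + 1) & (kBufSize - 1);
                if (next != tail) {                      // drop the keystroke if the buffer is full
                    buf[head] = ch;
                    head = next;
                }
            }

            bool read_key(std::uint8_t &ch) {            // called from the main loop
                if (tail == head) return false;          // empty
                ch = buf[tail];
                tail = (tail + 1) & (kBufSize - 1);
                return true;
            }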

            Coders today don't go through this painful process - I think this explains a lot.

            1. Hans Neeson-Bumpsadese Silver badge

              Re: It's "What's the best language" all over again

              Coders today don't go through this painful process - I think this explains a lot.

              I find that a lot of the "coding" that I see people doing isn't really building anything.

              We need to do 'x', so hit Google/StackExchange and find a framework for it. We need to do 'y' - there'll be a library for that - have a look online...etc...

              The "coding" process is often little more than wiring together someone elses' stuff....and when something goes funny in the end-to-end implementation, debugging and fixing can be a tortuous process. I can think of some projects where I've seen the developers spending more time fighting the solution than fighting the actual problem.

              To be fair though, YMMV depending on the languages, etc. being used.

        3. Mage Silver badge
          Boffin

          Re: It's "What's the best language" all over again

          Except that it's the company, not the programmer, that chooses.

          The most significant aspect is the quality of the programmer.

          Some or most of the "significantly better" languages aren't used either at all or much. Do remember that Pascal (derived from Algol & Qubal) and BASIC (derived from Fortran) were only intended for teaching.

          Also, the original goal of C++ in maybe 1985 wasn't C compatibility; that was sadly forced on it.

          Computer Science pretty much stalled over 25 years ago. The market place, existing libraries, dominant OSes and big companies have decided the languages & tools, not programmers or Computer Scientists. Ask Verity Stob.

          1. J.G.Harston Silver badge

            Re: It's "What's the best language" all over again

            Yes, but Computer Science isn't programming. What's best for programming and what's best for teaching Computer Science are different things. Believe me, I suffered three years of a Computer Science degree course wondering when we were going to get to any actual "computing" (by which, decades later, I realised I meant "programming").

      2. Anonymous Coward
        Anonymous Coward

        poor tools can't be blamed?....sure, sure, suurrrrre

        "it's a poor craftsman that blames his tools" - said the family accountant.

        If you've never had to put the blame on a poor tool, you've never been skilled enough to use a better tool, as you're oblivious to how the tool should be made. People who work with their hands know this.

        Have you ever been forced to use a Chinese "steel" vice that is known to shatter at 3000psi? (Yes, 3K....think a sub-sub-SUB quality Black&Decker).

        Have you ever been forced to use COBOL?

        1. doublelayer Silver badge

          Re: poor tools can't be blamed?....sure, sure, suurrrrre

          You're allowed to complain about your tools. You're allowed to say that a tool is not fit for purpose and require a better tool to complete the job. What you aren't allowed to do is *blame* them, as in "It's not my fault it fell apart immediately after you got it. You should have seen this terrible vice I had." The tool can be worse, but that means you have a worse time building the result, not that the result is simply of lower quality.

          That said, there is at least some argument for poor tools if the person concerned was made to do the job using the poor tool in the same time limit needed for the good tool. However, for programming languages, it is likely not that one is better or worse, but that one is better or worse for each specific coder and their experience.

          1. juice

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            > That said, there is at least some argument for poor tools if the person concerned was made to do the job using the poor tool in the same time limit needed for the good tool. However, for programming languages, it is likely not that one is better or worse, but that one is better or worse for each specific coder and their experience

            There is often a direct relationship between the quality of the tools and the quality of your output - and the time needed to produce the output, especially in the physical world.

            But the knowledge and experience of the person using the tools is also a factor - it's often amazing what can be improvised to address an issue ;)

            Fundamentally, it's the old saying: you can have any two of the three when it comes to quick production, cheap production and quality production...

          2. Michael Wojcik Silver badge

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            You should have seen this terrible vice I had

            Now this just makes me sad. It's a great straight line, but only in American English.

        2. W.S.Gosset

          Re: poor tools can't be blamed?....sure, sure, suurrrrre

          > Have you ever been forced to use COBOL?

          Per my comment below, all languages are DSLs. And COBOL was (is! surprisingly enough - COBOL.net is a real thing) great for what it was designed for: file processing. Anything else, and... eurgh.

          But that'd be like complaining that a vice designed to be sold cheap to mugginses-in-a-shed to generate business-profit, fell apart in the hands of people trying to do professional high-pressure high-temperature work ;)

          1. BinkyTheMagicPaperclip Silver badge

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            It's not even slightly surprising that COBOL.NET is a real thing, in the same way that IBM did (probably still does) COBOL that targets the JVM. That way the legacy COBOL can be encapsulated, and old systems extended using a more modern language such as C#. Re-writing code is expensive and error prone.

            1. Michael Wojcik Silver badge

              Re: poor tools can't be blamed?....sure, sure, suurrrrre

              IBM did (probably still does) COBOL that targets the JVM

              As do Micro Focus. In fact, most traditional COBOL can be compiled to native, CLR (.NET), or JVM targets; and OO COBOL source that sticks to the set of "common managed language constructs" can be compiled as CLR or JVM. It's just a compiler switch.

              Of course, source that depends on things only available in one environment - mostly what's in the .NET Framework and Java Standard Library, but also a few CLR constructs can't be easily implemented in JVM, if memory serves - can't just be built for the other environment. But if you factor your business logic into one set of classes and all your interfaces to other layers into another, you can pretty easily port a lot of COBOL code between the two.

              And modern simplified-syntax managed OO COBOL is a lot more pleasant than traditional procedural COBOL, by the way. So often what we see are people leaving lots of legacy COBOL untouched and just recompiling it, but wrapping it in modern COBOL to provide the interfaces with their existing other systems.

          2. Someone Else Silver badge

            @W.S.Gosset -- Re: poor tools can't be blamed?....sure, sure, suurrrrre

            And COBOL was (is! surprisingly enough. COBOL.net is a real thing) great for what it was designed for: file processing. Anything else, and... eurgh.

            Back in the day, I knew someone who wrote a printer driver in COBOL, because...well, when all you have is a hammer, everything looks like a nail. It worked...ish. But as you say...eurgh!

          3. cray74

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            Going off on a materials science tangent...

            > But that'd be like complaining that a vice designed to be sold cheap to mugginses-in-a-shed to generate business-profit, fell apart in the hands of people trying to do professional high-pressure high-temperature work

            If the prior poster's claim that the vise had an ultimate tensile strength of 3,000psi is correct then the vise would fail in common, everyday use, not "high pressure, high temperature work."

            3,000psi is the realm of unreinforced plastics. The softest aluminum alloys barely drop below 10,000psi (10ksi); 30-45ksi is typical and over 75ksi possible. Cheap, mild steels and cast irons you'll encounter daily in cars, steel cans, and fence posts should be 35ksi - 60ksi, while alloy steels like 4340 can nose past 200ksi and ubersteels like Aermet 340 perform above 300ksi. Not that titanium (Hollywood's darling) would show up in a vise, but common titanium alloys are in the 140-160ksi range, with some 21st century exotics breaching 200ksi.

            Point being: 3ksi from a vise's steel ain't right. Some foundry screwed up in an extra special way to make that metal.

        3. big_D Silver badge

          Re: poor tools can't be blamed?....sure, sure, suurrrrre

          COBOL was great!

        4. A.P. Veening Silver badge

          Re: poor tools can't be blamed?....sure, sure, suurrrrre

          "Have you ever been forced to use COBOL?"

          As a fully trained, qualified and experienced COBOL programmer I find this remark insulting.

          COBOL is a fantastic programming language for its purposes, mainly business administration. It is not a good programming language for low level system programming. For that I recommend (and use) assembly. Just use the right tool for the job. If you don't have the right tool for the job (or are not qualified to use it), you have a completely different problem than if you have poor tools.

          1. big_D Silver badge

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            And with the later relaxation of line width and some relaxation of column importance, it became a very usable language.

            I started on PRIMEOS and later moved to DEC VMS COBOL and MicroFocus COBOL on DOS.

            It is a lovely, verbose language, which is easy to read and understand.

            I loved the feeling of achievement, when you finally got all the code typed in. Pages and pages of it!

            I've programmed on most modern languages since then, but I still have fond memories of COBOL.

            Currently using C#, Python, Perl and PowerShell.

          2. joeldillon

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            Assembly for systems programming?! (outside of the very start of a bootloader or OS kernel, of course). How is life back in 1966?

          3. Admiral Grace Hopper

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            I agree.

          4. Hi Wreck

            Re: poor tools can't be blamed?....sure, sure, suurrrrre

            Assembly? Surely you jest. Today's compilers will do a far better job of scheduling instructions than all but the most experienced assembly programmers, and when the architecture of the machine changes (including our beloved Intel x86), it takes a lot less time to recompile than to recode. Put a stake into it.

            1. big_D Silver badge

              Re: poor tools can't be blamed?....sure, sure, suurrrrre

              But the frameworks etc. make for huge code.

              Just look at InSpectre from grc.com. It has a GUI and uses just over 100KB for the executable - the programmer complained that a majority of that was taken up by the icon!

              Now try to write something in C++ or C# that has a simple GUI form, probes the hardware, and comes in with a footprint under 200KB.

              1. Ken Moorhouse Silver badge

                Re: Just look at Inspectr(e) from grc.com

                I don't think that anything Steve Gibson produces is comparable with anything that is in the mainstream. (That's an accolade BTW).

        5. J.G.Harston Silver badge

          Re: poor tools can't be blamed?....sure, sure, suurrrrre

          I once bought a claw hammer that bent in two the first time I used it. Good for them, though, when I took it back to the shop after looking at it in amazement they gave me my money back and I bought a proper hickory-handled one.

    2. Lee D Silver badge

      Re: It's "What's the best language" all over again

      Pretty much like asking what the best language is.

      I'm sure you can argue Shakespeare vs Dante vs Aristocles vs... to the end of the earth. What matters is not what language they expressed it in, but what was expressed and that it was expressed fluently.

      I work in schools and I program in my spare time, so what with the focus on "every kid coding" (when clearly every kid can't even play a musical instrument, let alone code), it's the same question I get all the time.

      The answer? I really don't care. So long as I can understand it, and show you where you've gone wrong, that's infinitely more important. Find a language that you find easier / more complete / are able to source examples for / whatever - I don't care.

      Fact is, I've never programmed in Python in my life. A colleague gave me a teenager's Python code without telling me anything. In LITERALLY one glance, I spotted every kind of problem with the code that showed it was an amateur programmer, corrected them and was able to run my modified version of their program about 30 seconds after getting a Python interpreter working while simultaneously running their code and demonstrating bugs theirs had that mine didn't.

      It's about fluency and expressiveness, not what language. I'm sure Chinese has more language subtlety, that Latin-based languages are easier to learn, that English is understood in more countries. But if you are going to write a ground-breaking novel, the language really doesn't matter as much as the content.

      1. W.S.Gosset

        Re: It's "What's the best language" all over again

        Yup. It's why I roll my eyes at "teach everyone to code!" initiatives - if you ever HAVE taught coding, you'll be stunned at just how large a proportion of the population simply cannot think through a problem, let alone describe a solution. Including a DISTURBING number of "professional" "programmers".

        (And most lawyers -- contracts are just programs written in English.)

        1. Caver_Dave Silver badge
          Boffin

          Re: It's "What's the best language" all over again

          Hence why I always gave interviewees a pencil and paper and asked them to write about their journey to the interview. Many MSc CompSci graduates could not put together a coherent description, or use the basic constructs of the English language correctly.

          I always said that if they could put together a decent description in the English language, then I could help them program proficiently in any computer language. I was made to stop by my managers as they thought it was degrading to the interviewees!

          1. AndrueC Silver badge
            Boffin

            Re: It's "What's the best language" all over again

            I always said that if they could put together a decent description in the English language, then I could help them program proficiently in any computer language.

            I've always said that well written code should read like human language. By that I mean using descriptive meaningful identifiers. Instead of

            var i=ds.Write(data);

            if(i==0) return false;

            which is only vaguely meaningful without comments, do:

            var numberOfBytesWritten=dataStore.Write(dataToWrite);

            if(numberOfBytesWritten==0) return DataStoreResult.WriteFailure;

            Although throwing an exception would probably be a better way to handle it.

            I dislike seeing comments in code except as headers on methods, saying what they do and offering information about parameters - especially if you're using an IDE that pulls that information out and shows it as a hint. Comments within code suggest code that isn't clearly written, and the big problem with comments is that the compiler doesn't verify them. Out of date comments can be worse than no comments at all.

            That doesn't mean I never use them but they are used sparingly and typically to provide intent or usage information rather than to actually say what the code is doing.

            1. Version 1.0 Silver badge

              Re: It's "What's the best language" all over again

              These days I agree, but when you see minimal code and verbose comments then it's a sure sign that they started writing Assembler - I did.

              1. Jaybus

                Re: It's "What's the best language" all over again

                As did I, but I will continue to do so, as the verbose comments are still far better at describing what the code is doing and far faster to read.

            2. Anonymous Coward
              Anonymous Coward

              Re: It's "What's the best language" all over again

              Comments within code suggest code that isn't clearly written

              I disagree. Yes, simple code and breaking things down into simpler and simpler sub-functions makes the code itself simple to understand, but if there is some domain knowledge required in order to understand why an algorithm is the way it is then no amount of simplification is going to help.

              One way I get around this is to put the description of the algorithm in comments at the start of the function/procedure definition and then refer to the sub-steps at the right points in the code.

              E.g:

              /*

              Function calculateFoo() uses the XYZ algorithm to determine appropriate responses for widget_x and widget_y controlled devices; as follows:

              1) Identify the widget from the repository; use old repository for pre-1976 widgets

              2) description of second step

              3) third step

              etc

              */

              and then in the body of the code...

              /* Step 1 */

              ...

              ...

              /* Step 2 */

              ...

              etc

              That way the explanation is all in one place but it is easy to see which bit of code implements which bit of the algorithm. It's not perfect, nothing is, but better than no comments at all.

              and the big problem with comments is that the compiler doesn't verify them.

              That's kinda why they're called comments. :-)

              Out of date comments can be worse than no comments at all.

              That's not a fault of comments, that's a fault of lazy programmers. (Or programmers not fully understanding the impact of the change they're making if they've left in something that's no longer required..)

              1. AndrueC Silver badge
                Meh

                Re: It's "What's the best language" all over again

                I have no particular problem with the example you posted, with one caveat: all you're doing is describing the intent of the function and then the intent of the code, and that's entirely permissible. The caveat is that rather than using comments to demarcate the steps, I'd break them out into methods or functions with appropriate names. On some platforms the cost of function/method calls can be an issue, but outside of embedded programming it rarely is.

                So for your example I'd prefer to see:

                ObtainWidgetFromAppropriateRepository();

                PerformSecondStep(); // Obviously these would use a better description.

                PerformThirdStep();

                ..and now there's no need for comments. Even better if it turns out that other code needs to get a widget you now have a stand-alone (or close to it) method to call instead of having to copy code blocks.

                That's not a fault of comments, that's a fault of lazy programmers.

                Of course, but sadly the world is full of lazy programmers. If you expect everyone that looks after your code to do so in a conscientious and thoughtful manner simply of their own volition, you're setting yourself up for failure. You'll either never find enough suitable programmers to meet your needs, or else you'll be constantly falling foul of issues because you failed to anticipate the level of incompetence.

                1. Anonymous Coward
                  Anonymous Coward

                  Re: It's "What's the best language" all over again

                  > So for your example I'd prefer to see:

                  > ObtainWidgetFromAppropriateRepository();

                  > PerformSecondStep(); // Obviously these would use a better description.

                  > PerformThirdStep();

                  The only problem I have with this is when the code doesn't split cleanly between the functions. Often something in step 1 sets up something else ready for step 2. And then you have to decide whether to pass it as a parameter or do something else. From a purist computer science theory point of view it's a parameter and should be passed as such. However, returning a minor value from one sub-function, to pass on as a parameter to the next sub-function gives it an exaggerated importance (because it's highly visible in the listing).

                  As always, YMMV, and rules are better if they can be broken occasionally.

              2. Someone Else Silver badge

                @2+2=5 -- Re: It's "What's the best language" all over again

                [...] but if there is some domain knowledge required in order to understand why an algorithm is the way it is then no amount of simplification is going to help. [Emphasis added]

                Yes! This!! Naturally, using well-named variables should be automatic to any proper practitioner of the craft(1), but variable names often cannot (and more often do not) convey the why of the program. Why did you choose to use this algorithm, or use this library, or use this odd, outside-the-lines coding technique? Cleverly-named variables won't convey that kind of information.

                (1) Yet, we all know how likely it is to find these things in your average program...

              3. This post has been deleted by its author

            3. AndrueC Silver badge
              Meh

              Re: It's "What's the best language" all over again

              Perhaps I should clarify what I meant here:

              That doesn't mean I never use them but they are used sparingly and typically to provide intent or usage information rather than to actually say what the code is doing.

              What I mean is that I dislike seeing this:

              // Process all the items in the list.

              foreach(var itemToProcess in listOfItemsToProcess) ProcessAnItem(itemToProcess);

              or the utterly unforgiveable:

              // Increment i

              ++i;

              However the following is acceptable:

              // Now that we've got our list of items we need to process them so that when we return the list

              // to the caller they are ready to be used.

              foreach(var itemToProcess in listOfItemsToProcess) ProcessAnItem(itemToProcess);

              In this case that's probably exactly what I'd write however for a more complex code block it would probably be written as:

              ProcessItemsToGetThemReadyForUse(listOfItemsToProcess);

              I don't hate comments. I only hate comments that tell you things that are obvious from the code - often they are totally unnecessary, and the more closely they describe the code, the more likely they are to go out of date. But giving an overview of an algorithm or saying whereabouts you are in that algorithm is absolutely fine.

            4. JohnFen

              Re: It's "What's the best language" all over again

              "I dislike seeing comments in code except as headers on methods saying what they do and offering information about parameters"

              Me too -- those sorts of comments are harmful. But I disagree with you about comments in code being a red flag of some sort.

              A bad comment is one that tells you what's happening (the code already tells you that). A good comment is one that tells you why it's happening.

          2. Version 1.0 Silver badge

            Re: It's "What's the best language" all over again

            Years ago when I was hiring programmers I always put the ones with Latin in their language skills at the top of the pile. If you have been taught Latin at school then you have your foot in the door for programming - weird, I know, but it works.

            1. disgruntled yank

              Re: It's "What's the best language" all over again

              Lots of projects in Perligata (https://metacpan.org/pod/distribution/Lingua-Romana-Perligata/lib/Lingua/Romana/Perligata.pm), then?

        2. ibmalone

          Re: It's "What's the best language" all over again

          I'm not a professional programmer (I think), but this is why the use of "coding" or "code" rather than "programming" or "program" always troubles me. I suspect it is meant to make things seem more accessible, but it also implies the skill, and the thing that needs to be taught, is in writing the funny symbols (in Matrix-style green on black), rather than being able to draw out a flow chart or describe what it is you're attempting to do.

          (And most lawyers -- contracts are just programs written in English.)

          To be fair to the lawyers, they are using heavily overloaded codewords with documentation largely made up of previously observed behaviour on a language implementation that keeps changing.

          1. JohnFen

            Re: It's "What's the best language" all over again

            "this is why the use of "coding" or "code" rather than "programming" or "program" always troubles me"

            In my opinion, "coding" is a specific subset of the larger task of "programming". The act of programming includes design (when you aren't actually writing code). Coding is what you're doing when you are actually typing computer instructions.

            1. ibmalone

              Re: It's "What's the best language" all over again

              Exactly, but the push now is for the "coding", and I'm not sure the policymakers or pundits on discussion panels understand the difference. The design process, or at least understanding what you want to do in a logical way, is the more widely applicable skill, beyond just the domain of programming even. I'm sure everyone has at some point seen beginner's code which consists of cobbled together bits of syntax that may or may not run, but definitely doesn't do what was intended because the actual operation wasn't considered. (Heck, I wrote stuff like that when I first learnt C, from what may have been the worst C book ever.)

        3. AndrueC Silver badge
          Unhappy

          Re: It's "What's the best language" all over again

          I worry about some professional programmers as well. We had a contractor working in C# and he wrote a method that added three or four IDisposable objects to a list, then at the end of the function iterated over the list and called Dispose(). It was because he didn't want to rearrange the order of construction and use nested using blocks. It only took me two minutes to rearrange the code and get rid of the list.

          I just don't understand how anyone being paid big bucks to write C# code could proceed down that route in the first place.

          1. Someone Else Silver badge

            Re: It's "What's the best language" all over again

            I just don't understand how anyone writing C# code would be paid big bucks....

        4. Phil O'Sophical Silver badge
          Coat

          Re: It's "What's the best language" all over again

          contracts are just programs written in English.

          or Latin, which at least has the advantage that ANSI doesn't bring out a new incompatible version every few years :)

        5. RaceQ

          Re: It's "What's the best language" all over again

          > "teach everyone to code!"

          I hope that means all the kids exposed to it who do not like it realise it is not their destiny to become a programmer, leaving all those who truly like it to join the industry. I knew I wanted to become a programmer - nobody forced me :-) They probably should have forced me to learn other life skills in school - lol.

      2. W.S.Gosset

        Re: It's about fluency and expressiveness, not what language.

        seen recently (here?):

        "Finally, a note of caution. This language, like English, can be a medium for prose, or a medium for poetry. The difference between prose and poetry is not that different languages are used, but that the same language is used, differently."

        -- Kristian Beckers, "Initiating a Pattern Language for Context-Patterns"

      3. Someone Else Silver badge
        Coat

        @Lee D -- Re: It's "What's the best language" all over again

        I'm sure you can argue Shakespeare vs Dante vs Aristocles vs... to the end of the earth.

        I've been in the business a long time, but I've never heard of any of those languages. Do they run on Linux?

    3. big_D Silver badge

      Re: It's "What's the best language" all over again

      A "good" language doesn't make a good programmer and a good programmer can, probably, write good code in a "poor" language.

      That said, typed and compiled languages should lead to fewer oversights than an untyped, interpreted language. But, again, it isn't a guarantee of no/fewer bugs; it just helps eliminate one source of errors, and there are enough other areas still open, e.g. buffer overflows, programmer error etc.

      But good design and good testing are still the biggest differentiator.

      1. Tom 7

        Re: It's "What's the best language" all over again

        I used to code as a tool - I'd write stuff that made my chip design easier and less error prone. The best part of my job then was that computers were slow: even my hyper-efficient jobs used to take a few days to run, which let me go and read all the journals in our library. Around '87 I came across a paper where the author looked at C and wrote simple code to check for buffer overflows, unreleased memory and a whole bunch of other testable errors. I swept these up and added a stage to make which was run before make debug, so by the time my code was make-debugged 95% of the invisible errors were already eradicated. Just making sure strcpy and malloc are never used - except via sandbox proxies - in any code you test, let alone code you'd consider for production, makes your life so much easier (well, once you've stopped using MS programs that need to live in 64k segments).

        I've done this in any language I've had to code in for more than a few weeks, as the couple of weeks spent writing the belts and braces pays for itself pretty quickly - even if the morons who run the project won't allow you the time, it's worth sneaking it in under the radar.

        1. big_D Silver badge

          Re: It's "What's the best language" all over again

          Exactly.

          Most good programmers very quickly develop a library of routines / code snippets that do away with simple errors and get automatically incorporated into new code.

          1. Nick Kew

            Re: It's "What's the best language" all over again

            Well, I've developed a lot of reusable automation for everyday tasks.

            But a personal library of code snippets? Something I prefer to avoid, at least beyond a very limited point. Better to find some tried-and-tested library than to go around reinventing the wheel.

            Then if necessary I can add to that and contribute back according to my own needs - and benefit from the open source model.

            1. AndrueC Silver badge
              Happy

              Re: It's "What's the best language" all over again

              But a personal library of code snippets? Something I prefer to avoid, at least beyond a very limited point. Better to find some tried-and-tested library than to go around reinventing the wheel.

              Like all things, moderation is key. Using someone else's work saves time and often reduces bugs. However, it can also hide implementation details, and it's not good if the developers of an application don't know what some bits of it do. It can also lead to poor design if someone bends their code to fit a third-party library. Then there are all the niggling dependencies that can trip you up if you want to update. Has anyone coined the term 'NuGet hell' yet?

              Of course back when I was a young programmer in the 80s and 90s using third party code was even more risky. You rarely got the source code, the developer didn't have any provision for accepting your revisions and talking to them meant snail mail or a telephone call. Github, Stackoverflow and their ilk have been a boon to software development.

            2. big_D Silver badge

              Re: It's "What's the best language" all over again

              I was referring to the one- or two-liners that you use repeatedly.

              For example string validation, length checking etc. when setting properties.

              These are things that should become second nature, but you have a little file somewhere with all the examples, so you can quickly access them.

              1. Lee D Silver badge

                Re: It's "What's the best language" all over again

                Such snippets can be small things.

                I keep a file that loads dynamic libraries on various OSes (i.e. LoadLibrary/GetProcAddress or dlopen/dlsym). It would be massive overkill to rely on some centralised library to do so, for what is half-a-dozen lines.

                But equally having to prototype functions / load libraries / get function pointers on both Unix and Windows systems can take more time than necessary for it to work and correctly fall over when something's wrong.

                I have a couple of crafted macros where you can literally build the function prototype just from a simple substitution of the function definition (e.g. copy/pasted from a library's API or similar) which then prefixes the function name so you know it's dynamically loaded (rather than whatever the linker might pick up of the original function from a static inclusion), creates a prototype, puts in a function definition for you, loads the library, checks it loaded, has all the functions check that the library loaded and/or had that function inside it, etc.
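
                The gist of it, as a rough sketch (names invented; the real macros do more checking):

                #ifdef _WIN32
                  #include <windows.h>
                  #define LIB_OPEN(name)    ((void *)LoadLibraryA(name))
                  #define LIB_SYM(lib, sym) ((void *)GetProcAddress((HMODULE)(lib), sym))
                #else
                  #include <dlfcn.h>
                  #define LIB_OPEN(name)    dlopen(name, RTLD_NOW)
                  #define LIB_SYM(lib, sym) dlsym(lib, sym)
                #endif

                // One line per imported function: declares a dyn_-prefixed pointer plus a
                // loader that fails loudly instead of handing back a NULL to dereference later.
                #define DYN_FUNC(ret, name, args)                                             \
                    static ret (*dyn_##name) args = nullptr;                                  \
                    static bool load_##name(void *lib) {                                      \
                        dyn_##name = reinterpret_cast<ret (*) args>(LIB_SYM(lib, #name));     \
                        return dyn_##name != nullptr;                                         \
                    }

                DYN_FUNC(double, cos, (double))  // e.g. pull cos() out of the maths library
                // Usage: void *m = LIB_OPEN("libm.so.6");
                //        if (!m || !load_cos(m)) { /* complain and bail - don't limp on */ }
                //        double x = dyn_cos(0.0);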

                The number of "professional" programs I deal with where just switching out a DLL or having the wrong version result in the propagation of a NULL pointer back until it's dereferenced when that function is first used - which may be deep within usage of the program - with no checking, just assuming that the DLL will always be there, always be the right version, not pick up system versions, etc. DLL Hell was rightly named and entirely the result of poor programming. And now there are attacks that revolve around just sticking a DLL in a program folder with the right name and poor programs will try to blindly use them in preference to the actual system DLLs, which generates all kinds of security nightmares.

                Having a copy/paste works, but I wouldn't rely on any simplified library loading system to do it right, and it's not worth including other's code just for that, but similarly not worth having to rewrite it each time.

                Same way that I keep a handful of Exchange PowerShell lines in a text file on the Exchange Server. Nothing I couldn't Google in ten minutes and get working, but I don't need a specific library for it, it's easier to copy/paste them, and I can put my own safety barriers in that example code from the MS KB often doesn't (e.g. -WhatIf, piping to make sure that only one OU is affected, etc.)

          2. JohnFen

            Re: It's "What's the best language" all over again

            Early in my career, a mentor of mine once opined that half of the value of an experienced software engineer lies in the collection of proven, stock routines and idioms that they have accumulated.

            1. Nick Kew

              Re: It's "What's the best language" all over again

              Early in my career that would've made a lot more sense than it does today.

        2. Mage Silver badge

          Re: It's "What's the best language" all over again

          I found the big issue with C was libraries. Both vendor supplied and ones written in house.

          I experimented with testing them, and also with defensive code, conditional on the development build, that checked that the ranges & sizes of data passed & returned were at least sane.

          I think 90% of the learning curve of a new language on a real existing platform is learning libraries, including the "avoid completely" and "dangerous" functions.

          Calling functions in a Windows DLL was a special sort of hell if not well documented and you didn't understand what your source language might really pass. You'd quickly learn a VB6 string needed to be assigned to something suitable and wasn't like a Pascal or C string in memory structure at all.

          Ultimately choosing a good programmer is more important than the language choice. A good grade on a computer course, degree, masters or PhD means nothing.

          Probation for 3 months with no appeal.

    4. Anonymous Coward
      Anonymous Coward

      Re: It's "What's the best language" all over again

      And now you can give a proper answer: Python.

  2. Anonymous Coward
    Facepalm

    And they get paid money to do this?

    There are so many confounding factors here that even attempting such an analysis is pointless to start with. I'm sure the various teams of researchers are fully aware of this.

    Research something useful instead!

    1. Mark 85

      Re: And they get paid money to do this?

      In this case, the second set of academicians' research was needed to refute the first and point out what you have posted. Are they aware? Who knows? Seems any PhD candidate will engage in research no matter how aware of things they are.

      Even though we IT types know that this research is flawed (the first), we have no standing unless there are plenty of PhDs involved and some research grant money.

      1. DCFusor

        Re: And they get paid money to do this?

        Yep, I almost needed a new keyboard for that Reg article the other day that suggested we devs do some "bridge building" with academics for bug hunting and so on. It's not us who don't listen....

        They like to pretend that they're special and know it all because of formal education that was out of date before they began, and that we unwashed can't possibly have anything to add.

        In the US, FWIW, you can't even get grant money to study anything or publish your findings in a "real" journal - even if you're a recognized expert in the industry - if you don't have a PhD. There's a market to hire "in name only, no need to show up" PhDs to be PIs for government contracts that are actually done by people who can do stuff and make it work, not just pontificate about it.

        It's funny they'll study anything they can get grant money for, but then look down on us who work at...anything we can get money for - or maybe are actually a bit more selective than they....

        1. Richocet

          Re: And they get paid money to do this?

          I'll concede your point about people who can get things done, but having worked alongside academics with PhDs for many years, I think this is more a case of "you don't know what you don't know".

          It is amazing what some of them know after having spent multiple decades specialising in one area.

          I'm not a PhD myself - I'm an engineer. I made a similar comment in response to a Reg thread about what "registered engineers know that developers with the job title Engineer don't".

          1. Phil O'Sophical Silver badge

            Re: PhDs

            Dilbert has it right

        2. Caver_Dave Silver badge

          Re: Snotty PhDs

          I was chatting with a PhD a while ago and mentioned that my daughter was a doctor. He asked what subject, which I thought was slightly strange. I answered Medicine. He wrinkled up his face in disgust and said, "Oh, a degree in Medicine, not a proper doctor!" I certainly know which one I would like to treat me when I am ill!

          1. Anonymous Coward
            Anonymous Coward

            Re: Snotty PhDs

            "Oh, a degree in Medicine, not a proper doctor!"

            It's a silly response, but I can see where they're coming from, as PhDs generally take longer to attain and are a postgraduate qualification (whereas an MD is normally done as an undergraduate). PhD = BSc/BA (3 years) + PhD (4 years) = 7 years, whereas MD = BSc (3 years) + BM4 (4 years) = 7 years, or BM5 alone = 5 years, or BM6 = 6 years - a total of 5, 6, or 7 years (all figures based on full-time study, from when I worked in a medical school).

          2. Chairman of the Bored

            Re: Snotty PhDs

            I've seen this at close range for years, but in the other direction as well: stepdad is a PhD physicist and my mom worked for surgeons. In formal social settings I notice the MDs will refer to my stepdad as "doctor" but surgeons almost never do. MDs or surgeons who also have a PhD will refer to anyone else with a PhD as "doctor". So ... surgeons.

            That being said, I want my surgeons to be highly confident people (but hopefully not cocky) and I suppose some ego is to be expected.

            Me? I will answer to damn near anything, especially if you are offering a pint. I will gladly offer an MD their title of "doctor" out of sincere respect for their skills and responsibilities.

            Yoda say, "Petty BS this is"

            1. Kubla Cant

              Re: Snotty PhDs

              MDs will refer to my stepdad as "doctor" but surgeons almost never do

              I imagine it's a mark of respect on the part of the surgeons. Long ago, surgeons were barbers and bonesetters, and considered inferior to doctors, so they were addressed as "Mister". Now that surgeons are doctors with extra qualifications, they take pride in being called "Mister".

            2. JohnFen

              Re: Snotty PhDs

              I've known a LOT of PhDs, but only a small number of them would put up with anyone calling them "doctor" outside of certain professional or ceremonial situations. I would never use the term "doctor" (or any other honorific) for them (or an MD) in social situations.

          3. JohnFen

            Re: Snotty PhDs

            His response was ridiculous, but his question was not. There are lots of people with doctorates, but most of them aren't MDs.

            1. Anonymous Coward
              Anonymous Coward

              Re: Snotty PhDs

              > His response was ridiculous, but his question was not. There are lots of people with doctorates, but most of them aren't MDs

              I know someone with a PhD who holds exactly that same view. His justification is that someone who gains a PhD must have done some original work, whereas an MD just learns what is already known.

              It's hardly a distinction worth trying to educate the general populace in, though. (So what am I doing now? Drat. Delete. Delete.)

              1. JohnFen

                Re: Snotty PhDs

                "His justification is that someone who gains a PhD must have done some original work, whereas an MD just learns what is already known."

                I'm not so sure he's correct here, but it may depend on the institution awarding the MD. In my quick survey of the stated requirements from a number of medical schools, they have all stated something like this (taken from Cornell's program):

                "It is expected that MD-PhD students have submitted original research articles of which they are first author by the time they defend their thesis. It is advised that all research articles relating to the thesis research be submitted before the students begin their clinical training."

          4. Anonymous Coward
            Anonymous Coward

            Re: Snotty PhDs

            I have a huge amount of respect for those bright enough and with the perseverance to receive either a PhD or MD, but for either to feel that they've earned a special place in society and so should be referred to as Doctor in social settings speaks more to their ego than anything else. In professional settings it certainly makes sense under many, most(?), circumstances.

      2. JohnFen

        Re: And they get paid money to do this?

        Also there's something that academics are well aware of but the general public too often isn't: a single study means little. It only becomes significant after other independent researchers replicate the results. It may sound wasteful, but it's an important protection against error.

        1. Nick Kew

          Re: And they get paid money to do this?

          Journos must bear a lot of the guilt for that.

          This particular Reg article was unusually good: it did spell out clearly that there were lots of caveats, and even that the researchers were well aware of this.

          1. JohnFen

            Re: And they get paid money to do this?

            "Journos must bear a lot of the guilt for that."

            Indeed they do. In fact, it's so bad that every researcher I've worked for has considered journos harmful and avoided talking to them as far as possible, because they usually report studies in ways that seriously misrepresent them, thus actively misleading the public.

            You can see it for yourself pretty much every day. The next time that you see an article about "Scientists prove X", take the time to actually read the paper that the article is talking about. There's a 90% chance that the paper doesn't say what the article says it does. And there's a shockingly high chance that it says the opposite.

    2. Anonymous Coward
      Anonymous Coward

      Re: And they get paid money to do this?

      yes!

  3. John Smith 19 Gold badge
    Unhappy

    One more time.....

    Correlation is not causation.

    And if you can't even duplicate the results....

    As an old IT proverb has it "You can write FORTRAN in any language."

    1. Version 1.0 Silver badge
      Happy

      Re: One more time.....

      It's "A good programmer can write FORTRAN programs in any language" - I used to put that in the header of all my PASCAL programs back in the '70s.

      1. Anonymous Coward
        Devil

        Re: One more time.....

        > A good programmer can write FORTRAN programs in any language

        I am not sure that was meant as a compliment.

        1. Tom 7

          Re: One more time.....

          I think you've missed the point of the article!

        2. Version 1.0 Silver badge

          Re: One more time.....

          I think you've missed the point of the article!

          Sure, it's not a compliment, but back in the '70s we could say these things and laugh at ourselves.

  4. JohnFen

    Too simplistic

    This seems too simplistic a premise to me. Different languages have different strengths and weaknesses, and as a result are best suited for different sorts of tasks -- that, at the heart of it, is why discussions about "good" and "bad" languages are suspect right up front -- a given language can be the best choice for one sort of task and the worst for another.

    If you're using a language for a task that it isn't well-suited for, you're going to have to write more (and more complex) code in order to make it work. I would expect that doing this would result in a higher defect rate. Using the same language for a task that it's designed for means that you'll have to write less (and less complex) code, which I would expect to result in a lower defect rate.

    A study that just tallies defect rates with language use, but fails to take into account whether or not the language was suited for the task, means little. I would expect that sort of study would average out the defect rates and result in more-or-less the same "quality" score across the board.

    Which appears to be the actual result in this study.

    1. W.S.Gosset

      Re: Too simplistic

      Every language* is a DSL.

      Except Lisp. And the first thing everyone does in Lisp, is write THIS problem's DSL.

      .

      * (+ framework, nowadays)

    2. thames

      Re: Too simplistic

      Much of what determines which programming language is selected for an application comes down to economics, which in turn depends on how the application is deployed and managed, and on what libraries are available to implement it.

      I've just had a glance at the report, and while I have not read the whole thing, it's pretty clear the authors have pointed out some gaping holes in the original study. One obvious problem is that actual programming languages don't fit into the neat pigeonholes that the original study assumed. For example, functional languages may have procedural features and vice versa, so users will commonly combine the two methods in the same project, making any conclusions based on a neat separation of the methodologies void. There are more examples, including the fact that large chunks of the data the original study was based on were nowhere to be found.

      The studies themselves were based on comparing the number of bug-fix GitHub commits to the total number of commits - in other words, what proportion of commits were bug fixes. A "bug" was identified by scanning for the words "bug", "error", etc. in the commit message.
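
      As an illustration of how coarse that heuristic is, here is a minimal sketch of such a classifier (my reconstruction for illustration - the papers' actual tooling isn't shown here):

      #include <iostream>
      #include <regex>
      #include <string>
      #include <vector>

      int main() {
          // A commit counts as a "bug fix" if its message contains a
          // trigger word, per the methodology described above.
          const std::regex bug_words("bug|error|fix|fault|defect",
                                     std::regex::icase);
          const std::vector<std::string> commits = {
              "Fix off-by-one error in parser",  // real functional fix
              "Add dark mode to settings page",  // feature work
              "fix typo in code comment",        // cosmetic, still counted
          };
          int bug_fixes = 0;
          for (const auto &msg : commits)
              if (std::regex_search(msg, bug_words)) ++bug_fixes;
          std::cout << bug_fixes << " of " << commits.size()
                    << " commits classified as bug fixes\n";
          return 0;
      }

      Note that the third commit is counted even though nothing functional changed - the cosmetic-commit weakness noted below.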

      One pretty obvious problem that neither of the authors appear to have addressed is the relative maturity of projects compared to languages. Projects that have been around for a long time and are relatively stable are likely to be written in languages that have been popular for a long time, and it is possible to hypothesise that commits are more likely to be focused on bug and security fixes rather than new features. On the other hand, projects that are more recent have a greater probability of being written in languages that have become popular only more recently, and to be in a phase in which they are receiving more feature commits rather than bug or security commits. And if we examine how the languages cluster in the study, they seem to fit that pattern. Of course correlation doesn't equate to causation (as the second study points out), but it is yet another possible explanation which should be considered.

      And as the study itself points out, there are many "bug" commits which are just fixing cosmetic, style, or comment issues rather than fixing actual functional bugs. The methodology is not able to distinguish these despite some "communities" being more obsessed about these issues than others.

      What is noticeable is that most of the well-established and popular programming languages appear to cluster pretty closely together while the "boutique" languages are in another cluster. It is not obvious to me how this could be explained simply by language features rather than being a reflection of who is using them and how.

      There are two interesting outliers, however: C and Perl. C supposedly has an abnormally high number of bugs, while Perl has an abnormally low number (see figure 5). If we want to make any language-superiority claims based on the study, then evidently, if we want bug-free software, we should be writing it all in Perl. Or perhaps it's all just bollocks after all.

      1. Yet Another Anonymous coward Silver badge

        Re: Too simplistic

        These reports also generally quote bugs / line of code, or bugs / feature point.

        On the project I work on, the Matlab prototype is one function call; the only possible bug in my code is getting the order of the matrix rows/cols the wrong way round (fscking Matlab).

        The Python version is 3 lines of code - I need to create some input arrays.

        The production version is a few hundred lines of highly optimised C++ calling hundreds more lines of complex CUDA kernel. It took a year to build, although it does run at 60fps instead of 1 frame/cup of coffee.

        The bugs/line of code might be the same - that doesn't make the language choice irrelevant.

  5. TRT Silver badge

    Perhaps...

    There's a correlation between rebuttals of papers and the statistical software package used by the authors? Are some statistical packages more error prone than others?

    1. Caver_Dave Silver badge

      Re: Perhaps...

      In certifiable systems you have to prove the quality of your tools as well as your code.

    2. Anonymous Coward
      Anonymous Coward

      Re: Perhaps...

      You're not seriously suggesting that the choice of software by scientists could introduce a significant degree of inaccuracy into their work, are you?

      https://www.bbc.co.uk/news/technology-37176926

        1. ibmalone

          Re: Perhaps...

          From one of the original authors of that paper: https://blogs.warwick.ac.uk/nichols/entry/errata_for_cluster/.

          Aside from the 40,000-studies headline rate being a serious overstatement, the real problem they were highlighting was not actually the bug they discovered (they just happened to run into that along the way); it was that the statistical techniques used had inappropriate default parameters (cluster-forming thresholds) in some of the major packages, leading to violation of the assumptions behind the tests. A more sober estimate of the number of potentially affected studies, from one of the authors (who were a bit taken aback by the widespread and inaccurate coverage of the paper), was about 3,500 of the 40,000, with 13,000 having the more serious problem that none of the corrections implemented in those packages were used. The problem largely disappears if the correct thresholds are used, though they also suggest non-parametric testing as an alternative.

  6. Numpty Muppet

    Enjoyed FORTRAN more than any language...

    but ultimately bugs come (largely) from lack of understanding of subject matter.

    1. W.S.Gosset

      Re: Enjoyed FORTRAN more than any language...

      including the language

    2. Nick Kew

      Re: Enjoyed FORTRAN more than any language...

      Bugs come from many directions.

      Bugs in C++ might perhaps come from the complexity of the language itself. Not so much Stroustrup's original C++ as-was 30 years ago, but the designed-by-committee monstrosity it grew into.

      Bugs in a complex formally-verifiable system I had the misfortune to work on sometime in the '80s came from the complexity of the test framework, and the pressure that put on programmers to get it through the tests rather than get it right.

      1. Yet Another Anonymous coward Silver badge

        Re: Enjoyed FORTRAN more than any language...

        Now bugs occur in some random bit of javascript called by a framework, which imports a library which uses a framework which ... turtles .. turtles ....

        At least with Fortran you could be sure that the NAG library was correct

  7. simpfeld

    Rust

    I wonder how Rust would do, as the claim is that it removes memory-bounds issues.

    I've always wondered if this would just move the issues; a study like this might show whether there is value to the approach, and whether the Rust OS Redox is a sensible way for us all to go.

    1. DCFusor

      Re: Rust

      I'd think it would be worse, because there are a lot more kinds of bugs than the ones Rust (or any language) can protect against. And a false sense of security is, well, false.

      I could list more classes of bugs than will easily fit here that Rust has no clue about; and I'm not really against Rust - I'm against depending on some magic "one weird trick" language to solve it all.

      Locks need to evolve because lock pickers and bypassers keep inventing new ways to get past the existing ones. If nothing else.

      1. DrXym

        Re: Rust

        "I'd think it would be worse. Because there are lot more kinds of bugs than the ones Rust (or any language) can protect against. And a false sense of security is, well, false."

        Here are some bugs you CANNOT write in safe Rust:

        * Null pointer exceptions

        * Dangling pointers (calling through a pointer which is no longer valid)

        * Double frees (freeing same memory twice, trashing heap)

        * Data races

        * Buffer over / under flows

        All of these plague languages like C and C++. If you look at CVEs for the Linux kernel (for example), then these issues account for 50% of the bugs. If kernel devs, generally regarded as highly competent coders, can have these issues, then what do you think it says of the more garden-variety C or C++ code?

        Rust eliminates them by design in the language, or complains at compile time, and it doesn't incur any runtime performance penalty for doing so either.

        That doesn't mean Rust is impervious to other kinds of bug. I could still write code which lowers the garage door when I meant to raise it. I could still write a couple of threads that deadlock on each other. I could still append to a vector until I run out of memory. But I'm still writing safer code than if I wrote it in C/C++.

        Nor do I think Rust devs are lulled into a false sense of security by having a language which doesn't let them write brain damaged code. It means they have more time to focus on application bugs.
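
        To make a couple of those classes concrete, here is a minimal C++ sketch (mine, purely illustrative - not DrXym's code) that a compiler will accept, warnings at most, yet which contains both a dangling pointer and a double free:

        #include <cstdio>
        #include <cstdlib>
        #include <cstring>

        int main() {
            char *buf = static_cast<char *>(std::malloc(16));
            if (buf == nullptr) return 1;
            std::strcpy(buf, "hello");
            std::free(buf);
            std::printf("%s\n", buf);  // dangling pointer: reads freed memory
            std::free(buf);            // double free: can corrupt the heap
            return 0;
        }

        The Rust equivalent - using a value after it has been moved or dropped - is rejected at compile time (error E0382), which is the "by design" elimination described above.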

        1. Ken Hagan Gold badge

          Re: Rust

          "All of these plague languages like C and C++."

          Plague? Er, no. These are basically unheard of in C++ unless you are interfacing with an external API that chose "C" calling conventions (typically for portability). That happens a lot, but I wouldn't describe it as a plague, and you can insulate your code by writing a set of one-liners.

          I see no reason for them to be common in C either, but since I haven't really touched the language in a quarter of a century I will let others comment on that.

          1. DrXym

            Re: Rust

            If you think they're unheard of, you're living in cloud cuckoo land. Go to a CVE database, pick a C++ project and see what comes up. For example, Qt has 9 CVEs in the last month, many of which are overflows and NULL pointer issues. And that's an open source project that has many eyeballs looking at it.

            Clearly some people are incredibly defensive about C++ (and C) for reasons that are hard to fathom especially when the flaws are obvious.

            1. Ken Hagan Gold badge

              Well I cannot speak for Qt (*), but since C++ has had fully automated memory management for over 20 years, I don't think the *language* can be blamed for these kinds of bugs.

              (* I did look at Qt4 a number of years ago and found that they were using macros to emulate exception handling, home-grown collection classes, and had a cute little pre-compiler to generate yet more macros. I concluded that if I wanted an MFC-lookalike then I'd probably just stick with MFC. I dare say it has improved since then, but is it perhaps still "bugwards compatible" with the older versions?)

              1. DrXym

                C++11 has unique_ptr / shared_ptr templates in the standard lib which reduce the risk in using raw pointers. But they're not inherent to the language, nor is any form of automated memory management. Libs like Boost, Qt etc have analogous wrappers, e.g. boost::scoped_ptr, QSharedPointer. But this is still an opt-in to safety, not safe by default.

                The language itself has new and delete operators but they must be called correctly. Omit the delete, and it leaks. Delete an invalid pointer and the heap corrupts. Delete an array without the [], and only the first element's destructor is called. Call an invalid pointer and it crashes, or worse, doesn't.

                It's certainly advisable to use smart pointer classes where possible but they are still not mandatory.

                And screwups are still possible. I reviewed some code that assigned one shared pointer to another. The shared ptr on the left was typedef'd to a base class, the one on the right was typedef'd to a child class. Since the typedefs were not the same, the right-hand side implicitly fed its inner raw pointer into the constructor of the left-hand side, and both smart pointers came away with a reference count of 1. The first var out of scope deleted the object and left the second looking at garbage.

                The point being, none of this is even an issue in Rust. The compiler will insert the memory allocation / deallocation for you and will kick your butt if you violate object lifetime rules. The result is software that doesn't suffer from an entire category of common programming fault.
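
                As a minimal reconstruction of that failure mode (my sketch using std::shared_ptr directly - in the reviewed code the second construction happened implicitly via the typedef'd wrappers):

                #include <memory>

                struct Base { virtual ~Base() = default; };
                struct Child : Base {};

                int main() {
                    Child *raw = new Child;
                    std::shared_ptr<Child> a(raw);
                    // Building a second owner from the raw pointer instead of
                    // copying `a` creates a second, independent control block,
                    // so both `a` and `b` end up with a reference count of 1.
                    std::shared_ptr<Base> b(raw);  // should have been: b = a;
                    // At scope exit each owner deletes `raw` once: the second
                    // delete is the double free / garbage read described above.
                    return 0;
                }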

    2. JohnFen

      Re: Rust

      "as the claim is it removes memory bounds issues."

      But that's only one class of errors. There is a whole ocean of other errors that can happen.

  8. This post has been deleted by its author

  9. Chairman of the Bored

    I'm glad it's not my job.

    I like hard science research because I can ask a tightly defined question and address it with an elegant investigation.

    Research such as the 'least buggy lingo' strikes me as a fool's errand. There are vast numbers of confounding factors one must account for, and I haven't a clue what they all are or how to control them. Just off the top of my head, I think you need to adjust for:

    Experience level of developers in a given language, developer team cohesiveness, team size, development methodology, team morale, quality of dev environment, lack of PHB questioning every decision, availability of automated quality assurance, testing or lack thereof, quality of requirements, stability of requirements, quality of requirement-to-function allocation, availability of good libraries, quality CM systems... I could go on for hours; I've been doing this for decades, and literally everything I've listed just now has a tale of woe to match. I've never seen a project fail because the language was "wrong".

    When I think of all the absolutely essential crap in the systems engineering vee above and to the left of coding ... and the V&V activities above and to the right ... you'd have to work really hard to convince me language quirks dominate quality.

    I trust my experts to choose an appropriate tool for a job. I grew up dealing with DoD's "Ada is the answer" abortion and would not wish that attitude on anyone.

    1. JohnFen

      Re: I'm glad it's not my job.

      "I grew up dealing with DoD's "Ada is the answer" abortion"

      Oh, man, that triggers flashbacks. I remember spending a fair bit of pain and time becoming competent with Ada before everyone realized that it wasn't really suitable for much.

      1. Chairman of the Bored

        Re: I'm glad it's not my job.

        @JohnFen, I think eventually it matured into a reasonable language. But DoD was shoving this on programs when it was - at best - half baked.

        One of my first exposures to adult level work was a hard realtime system implemented in assembler and a customized C language that was already well into EMD ... when the witless wonders in OSD forced a switch to Ada. For want of an optimizing compiler the hardware architecture had to be respun, going from 2 to 5 microprocessors. In compact, power constrained flight hardware. Reliability and every other -ility crashed hard. Large contributor to project failure.

        But I don't blame the language so much as the stupidity on high and the lack of cojones on the PM ... sometimes you've got to say "Hell, no, it won't go."

    2. ibmalone

      Re: I'm glad it's not my job.

      Research such as the 'least buggy lingo' strikes me as a fool's errand.

      Coming from that angle, probably. It would, however, be useful to know what language features lead to less reliable code being written. I'm sure someone will pop up shortly to say that if you're a good programmer and write tests for everything as you go along etc. then you will never have bugs, but the real world demonstrates that this is an ideal scenario, and there are people writing programs who, for whatever reason, are not producing perfect code. The easier it is to get things right, the lower the required overhead of all the supporting factors you mention becomes. Most major languages, it seems, aren't so fatally flawed as to be unusable, possibly through natural selection more than anything else.

      Except of course for JavaScript.

      1. JohnFen

        Re: I'm glad it's not my job.

        " I'm sure someone will pop up shortly to say that if you're a good programmer and write tests for everything as you go along etc. then you will never have bugs"

        Anyone who says that is demonstrably wrong. All nontrivial programs, without exception, have bugs. Skill & good process can reduce the number of them, but they can never reduce the number to zero.

  10. W.S.Gosset

    Science

    > This contrary result is how science is supposed to work

    Hear hear!

    Tired of people bleating that this or that is "proven" by "science!", where their understanding of science is people in white coats telling them they're scientists.

  11. Frumious Bandersnatch

    How many times do I have to say this?

    #include <stdio.h>

    /* Duff's device: says buf exactly count times (assumes count > 0). */
    void say(char *buf, unsigned count)
    {
        unsigned n = (count + 7) >> 3;

        switch (count % 8) {
        case 0: do { printf("%s", buf);
        case 7:      printf("%s", buf);
        case 6:      printf("%s", buf);
        case 5:      printf("%s", buf);
        case 4:      printf("%s", buf);
        case 3:      printf("%s", buf);
        case 2:      printf("%s", buf);
        case 1:      printf("%s", buf);
                } while (--n > 0);
        }
    }

    1. aberglas

      Re: How many times do I have to say this?

      Duff's Device.

      Does this also work in Java and C#?

      1. Yet Another Anonymous coward Silver badge
        Joke

        Re: How many times do I have to say this?

        No, only in real programming languages.

    2. W.S.Gosset

      Re: How many times do I have to say this?

      Assuming a Uniform distribution:

      at least 12.5% of the time: once

    3. Someone Else Silver badge

      @Frumious Bandersnatch -- Re: How many times do I have to say this?

      Only one question: How old are you?!?

  12. a_yank_lurker

    GIGO

    A good programmer tries to understand the limits of the languages they are currently using and tries to find ways to work around them. Also, a good programmer realizes that each language was designed for a specific set of use cases. But some languages are badly designed and implemented, which makes writing quality code more tedious. Also, some languages are notoriously verbose, so while your errors per line of code might be good, the overall number of errors may be much higher because of the total lines of code.

    I have not read the study, but I have a sneaking suspicion that the authors are not programming day in and day out. So their understanding of the problems is more theoretical and superficial than practical. IMHO a language that tends towards terse but readable code, with a minimum of boilerplate, intelligent scoping rules, and strong typing, is one that makes writing good code easier. Many of the classic bugs are forced out by design and implementation.

    1. Tom 7

      Re: GIGO

      Terse but understandable code? That really depends on the problem at hand - many problems I'd have to deal with cannot be written in terse and understandable code, because the problem is not terse or understandable even in its most simple breakdown. I always used to say to people who claimed they could write code without a goto to fuck off and write a device driver, or if that didn't work, to run something on their object code that removed all jmp instructions.

      I grew up in chip design, and some code worked at a quantum level in emulating how a device worked. When that code didn't work as it should, you couldn't modify it - some of it was the combination of several PhD lifetimes of the best minds in the world. If you found a situation where their code failed, you just had to make sure you never called their code in a way that would make it fail.

  13. david 12 Silver badge

    Nonetheless, an interesting result.

    It tells us that if you get a random library from GitHub, it probably doesn't matter what language it was written in. Libraries that required more work to make them work are just as likely to work as libraries that took less work to make them work.

  14. FuzzyWuzzys
    Stop

    What a load of crap!

    Irrespective of language, it comes down to your attitude.

    Bad coders with bad practices and a bad attitude create bad code.

    Good coders, conscientious coders, write good, fault-tolerant code.

    I know people who think exception handling is something the user should do via emails when they simply spit out a failure message from a piece of code! I hate the idea that a process I write will fail so I probably spend an inordinate amount of time writing handlers, backup and correction routines that will try everything possible before giving up and apologising to the operator/user that process X has failed.

    The reason is that when something fails there's usually about 2 days of blame apportioning emails flying back and forth between people trying to pass the buck, I don't have time for that nonsense. So if I can do everything in my power to avoid all that, people waste less company time and money arguing and more time doing something else, even if it's less productive at least it's not near me or my processes or code.

    1. JohnFen

      Re: What a load of crap!

      " I hate the idea that a process I write will fail so I probably spend an inordinate amount of time writing handlers, backup and correction routines that will try everything possible before giving up and apologising to the operator/user that process X has failed."

      I'm with you here.

      It's amazing how many battles I've had to engage in just to convince some engineers to do comprehensive error and condition checking. The argument "Yes, it's a problem if users do X, but users will never do X" comes up a lot. Decades of experience has taught me that if it's possible for a user to do something, no matter how ridiculous, then sooner or later a user will do it -- and if the program fails as a result, that's a legitimate bug.
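
      As a minimal sketch of what that comprehensive checking can look like (my illustration, not code from any real product), treat every "users will never do X" case - letters, negatives, absurd values, end of input - as expected:

      #include <iostream>
      #include <limits>

      // Read an age, rejecting non-numeric input and out-of-range values.
      int read_age(std::istream &in, std::ostream &out) {
          int age;
          for (;;) {
              out << "Age (0-150): ";
              if (in >> age) {
                  if (age >= 0 && age <= 150) return age;
              } else if (in.eof()) {
                  return -1;  // input ended: report failure to the caller
              } else {
                  in.clear();  // non-numeric input: reset the stream state
              }
              // Discard the rest of the offending line and try again.
              in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
          }
      }

      int main() {
          int age = read_age(std::cin, std::cout);
          if (age < 0) { std::cerr << "No valid age given\n"; return 1; }
          std::cout << "Age recorded: " << age << '\n';
          return 0;
      }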

  15. pstiles

    But "bugs fixed" doesn't equal "bugs in code" and may not even be proportional

    The methodology doesn't seem sound. Just because someone fixed "x" bugs in a program (or made "x" commits fixing the same bug) doesn't mean that program had "x" bugs in it.

    I've worked on big code bases where actually the buggiest piece of code is the most complicated and _least_ touched because "everyone knows" that they'll only introduce more bugs if they try to "fix" bugs.

    I've also seen *perl* programs, and it's obvious why they don't get many bugs fixed - no one dares touch them (write-once languages...).

    An olde programmer of yore measured programs by their "brittleness" (and they were Cobol programs, mind you) and how you should avoid touching the brittle ones.

    Just give me competent developers who actually like to think and want to learn why they should code "in the best way" and we'll have less buggy code, whatever language we write in.

    1. herman

      Re: But "bugs fixed" doesn't equal "bugs in code" and may not even be proportional

      Ayup, and not to let MS off the hook: a Java or .Net program cannot fix a bug in the JIT compiler or a bug in the garbage collector. Those bugs will never be fixed and will not show in these kinds of bug-fix studies.

      1. Anonymous Coward
        Anonymous Coward

        Re: But "bugs fixed" doesn't equal "bugs in code" and may not even be proportional

        Good point - the "higher level" the language, it seems, the more likely there are bugs in the language itself. It's not likely you're one of the maintainers who can fix it, but I've found that if you do - and it's a for-real good fix - most projects will thankfully accept it. They mostly reject lazy whining or poor fixes that have their own issues. It's something the fad-boys don't understand - age doesn't make something bad, it means it's survived and probably been looked after better than your latest fad has.

        I could take complaints about perl personally, being somewhat expert in it and having written rather a lot of it for cases where programmer time was more important than cpu cycles. But what I find is that it's an extreme case of "Enough rope to shoot yourself in the foot". Languages DO differ there. But there's another side to that coin. And more than one sort of rope - needing a page to do what one line can do isn't necessarily more clear to the observer. Depends on the line and the page.

        I can show people my own perl code and they ask what language it's in, as it's so clear and obvious what it does - that same freedom to write horrible code can be used to write great, even beautiful, code. True, not that many people do that, but that's an indictment of humans, not a language. I kind of like the freedom to stand out as being hyper-competent.

        Funny that a lot of complaints about perl center around regular expressions, which it kinda helped make popular in a big way. And now they're in all languages, some even boast of perl compatibility. And they still look horrible and shouldn't be used to make the coder feel special about his cleverness...no matter the language they're embedded in.

        It's just not a bubble wrapped world, and I'm glad of that for lots of reasons. There has to be some way to differentiate the competent from the losers. Some encouragement to be better.

        And as a now-old dude who sometimes wants to go back and modify or perhaps reuse some of my older code - having forgotten just how I approached something, I've long since learned how to do really good naming and comments, and do so religiously. I'll even put in a section that describes "what was I thinking" as often as not.

        And I find doing that makes any language a lot better than your average garbage on github. Or websites with stack in their name.

        If you see something like "i++; # increment i" in my code, you have permission to beat me senseless.

        unless (code_is_clearly_making_sense) while(coder_still_breathing) {swing_club(target => coder);}

        Ok, I left out the sigil noise...but is that write-only?

  16. Paddy

    On reproducibility.

    I would expect GitHub has an API for data extraction. I would hope that this second team of researchers also created a Jupyter notebook (in Python, of course) able to reproduce the statistical results they mention (at some snapshot in time). This should help with later reproducibility issues and allow a new group of researchers to spend more time on criticising methodology, or on showing later GitHub trends.

  17. bytemaniak

    Reminds me of...

    Those functional programming turbonerds who downtalk every other language in existence because it "allows you to write bad code" while X functional language doesn't. Could it be because the language is so rigid and inflexible that the moment you try to do something that would make total sense in another language, the compiler flings crap at you?

    1. herman

      Re: Reminds me of...

      In my experience, the kind of problems that can actually be coded in a functional language are so simple that you don't need a functional language to code them.

      I have never seen a useful project done with a functional language. Every place I know that tried a functional language, eventually gave up, fired all the culprits who propounded the damn thing and started over in C.

      1. Anonymous Coward
        Anonymous Coward

        Re: Reminds me of...

        Your experience is perhaps a bit limited. The Haskell compiler (GHC) is written in Haskell, and compiling Haskell is not a simple problem.

    2. Someone Else Silver badge

      @bytemaniak -- Re: Reminds me of...

      Those functional programming turbonerds who downtalk every other language in existence because it "allows you to write bad code" while X functional language doesn't.

      Next time you run into one of these turbonerds[1], ask them if they've seen any of the gawd-awful (and buggy) SQL that is floating around the intertubes.

      1 "Turbonerds". Love it! Can I use it?

  18. Michael H.F. Wilkinson Silver badge

    Interesting, but ...

    There are many complicating factors, many of which have been noted already. A point I haven't seen yet is that maybe more of the code written in "old school" languages was written by older, more experienced coders, whereas newer languages, which might be better designed in themselves, are more likely to have been used by less experienced coders. Not sure if this can readily be tested, or whether it has an effect.

    There is also an issue comparable to owners of safer cars tending to drive less safely, because they feel safe in their car. Likewise, I know that when I needed to program in assembly, way back when, I was FAR more careful about what I was doing, checking and double-checking my reasoning before even starting to write. Indeed, I was careful to limit the usage of assembly to an absolute minimum, to handle some hardware issues. Of course I managed to crash my machine a couple of times (in the "good old" days of MS-DOS) when testing the code, but the production code I delivered generally didn't cause any issues. "Back in the safety" of Pascal (the compiler won't let you shoot yourself in the foot), I relied much more on the compiler or run-time system giving me sensible error messages ("integer array index out of bounds" is SO much more useful than "segmentation fault") than I did with either assembly or C. So maybe people writing code in "safe" languages don't pay as much attention to any remaining pitfalls as those who know they are walking in a minefield. I am not sure this is the case, but it might be worth considering.

    Furthermore, quite apart from how difficult it is to fix things in brittle code, there is the issue of actually finding errors in poorly-written code, or in hard-to-read languages. So number_of_bugs_FOUND != number_of_bugs_in_code.

    Finally, I have had to write bug fixes that weren't fixes for bugs in MY code, but workarounds for problems either with a compiler or a run-time library that wasn't open source. I once was working on MS-Pascal code in which I knew the linked lists used had an even number of nodes. Therefore, if the "next" pointer in a node wasn't NIL, I could safely jump two nodes on, which in Pascal would read:

    current := current^.next^.next;

    which caused the program to crash. I replaced the above with:

    current := current^.next;

    current := current^.next;

    so again, two jumps without a NIL pointer test in between. This worked flawlessly. Clearly, the compiler couldn't handle the double indirection in the first version. I tried both versions on a different Pascal compiler, and both worked flawlessly. Again, number_of_bug_FIXES != number_of_bugs.

  19. Charlie Clark Silver badge

    Impossible question

    You often don't know that your code is buggy until it's exploited, and while static code analysis may pick up some obvious flaws, most bugs are waiting to be discovered, as we've seen over the last few years.

    More important, therefore, is to list common gotchas: memory management, handling of untrusted data, equivalence tests, etc, and the strategies used to deal with them.

    Also, things don't stand still. It's no accident that PHP has so many CVEs filed against it: up until a few years ago it was the go-to language for web development, and it attracted a lot of untrained people as a result. Many of them have since moved on to other languages (JS for web and app development, Python or R for data analysis). But threats have also changed: SQL injection is perennial but, hopefully, less of an issue today; now we have more attacks on transport and handling.

    1. Dave_uk

      Re: Impossible question

      "You often don't know that your code is buggy until it's exploited..."

      Well said (while others above praise their code without REALLY knowing!).

      Publish your code as open source and you soon find out where the exploits are.

  20. Hemmels

    Powerful tool

    I for one would love to know how they can scan GitHub for "buggy code".

    And then I can fix all my issues in realtime.

    1. Nick Kew
      Joke

      Re: Powerful tool

      while ( /\bbug\b/i ) bugcount++;

  21. karlkarl Silver badge

    Sure C is not a "safe" language. It was written to be portable assembler, not for safety.

    Now for the stupid part... C#, JavaScript, and Java all have massive sprawling VMs required to interpret and JIT them. What are those mostly written in? And why do you think that is?

    Yes, I am a C developer... yes, C is a little awkward to use, but sometimes there is *zero* alternative.

    Other languages are "an illluuusiiion!"

  22. FlippingGerman

    Typo

    "the broader question question"

    Interesting article.

  23. wobbly1

    I broadly concur with the debunking. My ability to generate errors is pretty even across all the languages I have used, with the exception of Pascal ... where my error rates were 2-3 times those in other languages, including TI990 assembler, where entry errors crept in as code was translated to binary and entered a bit at a time via 16 push buttons. I have a deep loathing for Pascal.

    1. This post has been deleted by its author

  24. scarper

    The best language is...

    ... the one that allows you to say what you want in the fewest lines. Most of these bug studies deal in bugs per line, not bugs per concept.

    And the winner is ... whatever language has a library (or libraries) for the thing you're trying to do! Library code is a lot more debugged than newly minted code, and it doesn't take many lines to invoke the goodness.
