back to article 'Boss, I've got a bug fix: Nuke the whole thing from orbit, rewrite it all'

Hello, world. Welcome back to Line Break, our weekly column of terrible code readers have spotted in the wild – think of it as a group therapy session. Let's skip to the main() course. Beauty or the beast Since SSL/TLS is in the news again, let's start with some weird code spotted by Georgi in OpenSSL and LibreSSL, a fork of …

    1. Naselus

      Re: Beastly, Just Beastly

      "Goto's are evil, only bone idle spaghetti loving coders use them."

      That's a little too black-and-white. tbh, GOTOs are a tool, and refusing to use them outright because they shouldn't be used for everything is like saying you should never use a bucket for anything because it's not suitable for frying bacon in. But sometimes I'm not making bacon, and a bucket is actually quite useful.

      I don't see anything particularly wrong with the sparing use of GOTO - if the whole structure of your code relies on them then yeah, it's a horrific nightmare of spaghetti pain (I still wake up screaming after dreams of writing programs in BASIC back in the '80s), but legibility problems only really begin to creep in when you have GOTOs within GOTOs - if you're using GOTO to reach a block which just terminates the program then it's not hard to follow. If you end up following a string of GOTOs (goto 8, do x and goto 24, do y and goto 4, do z and goto 347) then that's when it becomes horrible code.

    2. Anonymous Coward
      Anonymous Coward

      Re: Beastly, Just Beastly

      As a wise person (Parnas? Hoare?) once wrote: the problem with goto is that there is no comefrom.

      1. Anonymous Coward
        Anonymous Coward

        Re: Beastly, Just Beastly

        not strictly true

        goto amounts to jump which amounts to "write to the program counter"

        comefrom could be implemented at least for toy purposes by misusing setjmp/longjmp or having the equivalent assembler to dump the program counter to some other location (longjmp unmodified will not work out of the box)

        1. HwBoffin

          Re: Beastly, Just Beastly

          Just use call, and if you don't want to return, add to the SP to remove the return address from the stack.

          Beware that different architectures may behave differently (flags saved on call, hardware call stack, so on)

  1. SVV

    There's a fine line between smart and stupid

    Actualy there isn't : a cursory glance at the source file linked to will be enough to disprove this.

    Highlights include functions hundreds of lines long, a typically awful attitude to variable and function naming and a scarily random attitude to comments in code. This cheesy flow of control logic illustrated in the code snippet is far from "cute". It could have been far neater by splitting the code chunks marked by labels into seperate functions (as is usually the case when gotos start appearing in C code).

    Of course implementing SSL is not trivial or without complexity - but as much clarity as possible is always the smart option, adding unnecessary levels of complexity to an already complex problem is always stupid.

  2. Filippo Silver badge

    Screw cuteness

    It doesn't have to be cute, it has to be readable and maintainable. If I see if(0), me and 99% of everyone else are going to assume it's not getting executed. We'd be wrong, sure, but are we here to lay clever traps for each other in some kind of free-for-all intellectual superiority deathmatch, or are we here to get a job done?

    In the same vein, I don't understand people who think that the shortest code is the most elegant, and write stuff that can only be understood if you know every obscure operator precedence rule and can mentally apply it to a dozen operands at the same time. What exactly is the beauty in that? Do you think you'll run out of brackets?

    1. John G Imrie

      Re: Screw cuteness

      There is a massive 24 level deep operator precedence for Perl. However most Perl programmers remember that multiplication and division come before addition and subtraction and use brackets to sort out the rest.

    2. PassiveSmoking

      Re: Screw cuteness

      Code like this is what's often referred to as job-security code. If you want to make yourself indispensable to your employer you can do it by either being so good at what you do that letting you go would be a horrible mistake, or by writing code that's so befuddled and obscure nobody else could ever possibly understand it. The latter is a lot easier to do than the former.

      regarding "if(0)", the similar "while (true)" is acceptable at a pinch because that's shorthand for "There's code here that should run in an endless loop but which can exit for any number of reasons, not just one single one I can express in a conditional while", though it's probably still best avoided if possible because it's too easy to forget to trigger an event that will actually exit the loop. "if (0)", on the other hand, is purely job-security. Most programmers would fail to notice that the code in the condition is not dead code and quite a few would even mentally block out the if (0) because it's so out of synch with most programmer's expectations it can fail to even register.

  3. Naselus

    "He explained to me that, for every new program, he would change every single one of the library's functions completely so they would suit the new program."

    ... I honestly cannot figure out why anyone would do this.

    1. Andy Nugent

      I've had interviews for contracts in a couple of places that didn't actually use any source control systems (I never actually took the work out of fear for my sanity; they were never looking to do things properly, just wanting someone to add in a couple of more features ASAP). Usually engineering firms where it started out as a simple bit of code that the (non-software) engineer wrote and it just grew and grew. Not quite as bad as this, but going down the same road.

    2. #define INFINITY -1

      What's even more amazing is that this one hasn't yet had any comments. I understand what others are saying about being 'cute', but El Reg gave us a soft target. How about some open source project that hasn't had so much bad press.

    3. Natalie Gritpants

      Really? If your code is valuable and no-one else can compile it you have a serious barrier to redundancy.

      1. cnsnnts
        FAIL

        Code that won't re-compile?

        Or a fast track to dismissal on competency grounds.

        Along with the manager who didn't put efficient rules and practices in place to make sure that all software written could be be built independently of the author and their development environment.

      2. Naselus

        "Really? If your code is valuable and no-one else can compile it you have a serious barrier to redundancy."

        That's true. Unfortunately, this guy can't compile it either. He's compiled it once and can never do so again without re-writing the whole damn thing. This is not an anti-redundancy measure, this is just insanity.

        If I had a password only I knew on my large collection of immaculately-versioned libraries, then I have a serious barrier to redundancy. If my boss comes to me 3 months after I've published the program and says 'hey, we found a bug here, could you fix it and re-release' and I say 'um, no, the library file for that does not exist anymore', then I have a serious barrier to ever being hired anywhere ever again.

    4. cnsnnts
      Devil

      "... I honestly cannot figure out why anyone would do this."

      Have you never had an email from someone where they replied to the last irrelevant email they got from you instead of typing in your email from fresh?

      Have you ever checked the document properties of a report or presentation and found they relate to a completely different document, author and company because someone just grabbed the nearest thing and recycled it?

      1. Naselus

        "Have you never had an email from someone where they replied to the last irrelevant email they got from you instead of typing in your email from fresh?"

        The analogy doesn't really work. This is more like someone sent one email in 1986 and has just been forwarding that email every time he needs to send a new one to someone, only with every single detail changed, for the last 30 years. And has been deleting the old version each time and then forwarding the new copy.

        If his reasoning was along these same lines, I'd have expected to see his 'library' containing every function relevant to every program he'd ever written, rather than being endlessly stripped out and rebuilt. t's like no-one explained how to reference a library to him, and so he's been re-using that one file rather than learning how to reference in another... but if that were the case, I don't see how he was ale to make anything even functional tbh.

        This doesn't make sense from the point of view of laziness, or for job insurance, or really from any point of view at all. It's frankly just deranged.

    5. Charlie Clark Silver badge

      ... I honestly cannot figure out why anyone would do this.

      It seems pretty obvious to me. For the guy each project was completely separate from the rest so he felt justified in starting from scratch but with the same approach and files as before. By repurposing the library he was able to write his application code the way he liked to.

      I don't agree with this approach but I can understand it and I know that I've done similar things in the past, though usually copying something like utils from A to B. The real problem seems more systemic: why no VCS? And why were other developers charged with recompiling his stuff?

      1. Naselus

        " The real problem seems more systemic: why no VCS? And why were other developers charged with recompiling his stuff?"

        I don't think we can really assume anything with his approach tbh. He's writing in C using a methodology which is more suitable for bare-metal coding. He's apparently never considered the idea of version control, or using backups. This sounds like self-taught, but self-taught to an insane degree - like the guy has never heard of other programmers or read a style guide. I wouldn't be surprised if I heard he was writing the whole thing to work on a 9-bit system he's built himself out of tin cans and broken lightbulbs.

        1. Anonymous Coward
          Anonymous Coward

          Tom, here, original author of the story on "Assembly man".

          The guys was:

          1- very very sociopath, so would never talk to anyone, would even work alone in a locked office. Mad

          2- totally self taught, indeed. And indeed to worrying levels

          His background was serviceman, not software dev. He was probably recycled here.

          The story started when some minor tweaks to his programs were required, and probably noone was able to understand why he would take so long to do any of them.

          Then I was sent to investigate.

          As someone else has said, the guy, indeed was using static libs without knowing their purpose, therefore wrongly. It probably never occured to him programs sometimes have to be compiled again, due to modifications or platform change.

          That was my conclusion when he exaplained its usage of libs. I remember this moment, locked in his office with him, staring violently at the ground, in a ghastly silence :-)

  4. tiggity Silver badge

    polishing one

    If using such if (0) { tricks could at least make it more readable

    e.g. define a macro named

    START_MAKE_CODE_INACCESSIBLE_EXCEPT_BY_DIRECT_NAMED_GOTO

    that macro would then hide "if(0) {" in something more explanatory

    And another macro

    END_MAKE_CODE_INACCESSIBLE_EXCEPT_BY_DIRECT_NAMED_GOTO

    That would just be closing brace "}"

    Still horrible code, but by using those macros, easier for casual inspection to give folk a clue, proving you can polish a t*** (albeit only slightly)

    Back in the day, macros were used a lot for readability and often hid real nasties such as that when macros actually inspected!

    1. Morzel
      Alert

      Re: polishing one

      Preprocessor macros are probably the second best way to shoot yourself in the foot after gotos (if not the best way).

      They can be genuinely useful when implemented properly, but I've seen far too many cases where it went pear shaped due to macro expansion issues (hint: braces and/or brackets), or unintended macro redefinition issues somewhere in the include path - especially when cross-compiling stuff.

      So: use them wisely and sparingly :-).

    2. Mage Silver badge

      Re: polishing one

      Macros can be much more evil than GOTO and only exist because C was partly conceived as a machine independent Macro assembler to make porting UNIX simple and easy.

      Numerous ways to shoot off your legs with a Macro.

      Learning C++, we were told they shouldn't be used at all, only part of the evil backward compatibility. I learnt C++ in 1988. Why after nearly 30 years are we still using C?

      1. Anonymous Coward
        Anonymous Coward

        Re: polishing one

        "Why after nearly 30 years are we still using C?"

        You answered your own question - its a macro assembler and for low level to the metal coding you want explicit code. What you don't want is a load of implicit actions and hidden memory allocs/deallocs going on behind the scenes which is what you get with sufficiently complex C++ when you may be severely restricted in the machine facilities you have available and also want to know exactly what will happen at runtime.

        Also some people simply like explicit code and will put up with the pain of re-inventing the wheel occasionally in C if it means they don't have to use C++. I'm not one of them - just saying.

        1. Mikko

          Re: polishing one

          I understand why massive projects like Linux will not move to C++ (as long as Linus is around, at least).

          I think the more important question is - why is commonly-used, security-critical software like OpenSSL written in C? Why in the world would you want to be in a situation where that goto code starts looking appealing due to the limitations of the language? Or have to always loop manually over the elements of some array or data structure (which you had to roll on your own)?

          Why would the only option to ugly C be C++ that uses all the complexities and gotchas of the language, when for a long time now you can write C++ in a language subset that is much more clear and simple, and completely avoids several of the most popular types of buffer overflow/pointer math/off-by-one issues of the equivalent C code. The horrific "implicit actions" and memory allocations need to be understood and controlled, true - but any security-critical, commonly-used software surely needs to have a very strict coding paradigm anyway, and the contributors need to have a high level of expertise.

          Or if you are not limited to C-like languages, pretty much any reasonably-popular language would be a better choice than C...

    3. PassiveSmoking

      Re: polishing one

      // Happy debugging, suckers

      # define true (rand() > 10)

      1. Fibbles
        Coffee/keyboard

        Re: polishing one

        You evil bastard.

    4. eay

      Re: polishing one

      hmmm this brings back memories of when I was learning C, the Pascal programmers were all using macros

      #define PROCEDURE void

      #define FUNCTION int

      so their functions could look like Pascal

      If you don't know what if (0) {} does, either a) don't use C, b) don't use macro's to hide that ignorance.

      If you want to look at abuses of C that you should understand, lookup Duff's device.

      The C language is small, goto's and labels are part of the language, get over it.

  5. Anonymous Coward
    Anonymous Coward

    I don't know about that first one, it has a twist of genius to it but at the same time is an abomination that should be struck down. The second one is so bad as to be almost unbelievable but I've worked with programmers that were that bad so I'll certainly give it the benefit of the doubt.

  6. Mark #255
    Headmaster

    Inconsistent goto paths

    The thing that gets me about the SSL code is that the two paths (bad packet length and cert length mismatch) have their meat in different parts. truncated could easily be in the earlier if block (like the other one is), and then simply goto f_err; (like the other one does).

    And (assuming a1 is zero when there's no errors), you could write the final if(0) as if(!a1) for better readability.

  7. Flocke Kroes Silver badge

    Aßembly (without the donkey :-)

    It didn't look particularly like Z80 or 68000. It screamed 6502 to me.

    I regret that I do not have to source code to the most horrible bug I have ever found. The project involved using DSPs to send position data regularly by radio (pre-GSM), but with each radio transmitting at a different time. I had time from GPS, so sync was easy. Apart from a little assembly language for the signal processing, the bulk of the code was in C. The kit worked fine in the office. Hours of testing with no problems at all. On the next day, I took it to the customer for a demonstration, and the kit sat there doing nothing.

    After some frantic bug hunting I discovered the fault was in the C library for the DSP. The 32-bit add function was made out of a pair of _signed_ 16 bit shifted add instructions. The DSP was massively optimised for 16 bit variables (and fixed point arithmetic). CHAR_MAX was 32767 (sizeof(char)==1 by definition in C). As a result, almost all of my C code used 16-bit (int) variables and long int was only ever used for the time. During testing, bit 15 was zero, but the demonstration was at a time when bit 15 was 1.

    The easy bit was fixing the code during the customer's lunch break. The difficult bit was explaining what I had fixed.

    1. el_oscuro

      Re: Aßembly (without the donkey :-)

      Definitely looks like 6502. Biggest clue is the A, X, and Y registers 6502 has. Z80 has more. I've never programmed 68000 but it probably has more too.

  8. Francis Vaughan

    C is a glorified assembler

    The trouble with C is that it is little more than a glorified assembler. The language lacks a significant number of structures that we take for granted in more modern languages, structures that make life easier when building more complex code.

    But you can convince it to do things that are useful. Often with gruesome distorted bits of code that at first sight make little sense. The problem isn't that you do this, it is if these bits of code are not rigorously codified and documented, and wrapped up in a clear and maintainable manner. Macros can be one way of doing do (with a lot of care). But clear and enforced coding rules for their use is the key. There should never be a time when anyone looking at the code is in any doubt about what is being done and why. If the idiom is useful, I would expect it to be used everywhere as the standard mechanism for such condition handling. Personal idiosyncratic use of such features is the problem, and speaks of deep problems in the project.

    The failure in the SSL code is not the use of the code, but rather that it is blandly written with no apparent consistency with the rest of the source. Just a one time use of an idiom. That suggests almost zero care, and is deeply worrying as a pointer to the quality of the rest of the code.

    1. David Robinson 1

      Re: C is a glorified assembler

      C "a language that combines all the elegance and power of assembly language with all the readability and maintainability of assembly language"

      http://catb.org/jargon/html/C/C.html

    2. regadpellagru

      Re: C is a glorified assembler

      "The failure in the SSL code is not the use of the code, but rather that it is blandly written with no apparent consistency with the rest of the source. Just a one time use of an idiom. That suggests almost zero care, and is deeply worrying as a pointer to the quality of the rest of the code."

      I don't think this is the only problem with openssl.

      As has been pointed out in http://www.openbsd.org/papers/bsdcan14-libressl/, by Bob Beck, a lot has to do with:

      - every single function of the API is public. Crazy design if you want my opinion.

      - the API assumes the OS provides nothing, no randomness, no calloc(), nothing, then goes on doing everything in a terrible way, all of this to be compatible with platforms long gone and forgotten

      1. Ken Hagan Gold badge

        Re: C is a glorified assembler

        "all of this to be compatible with platforms long gone and forgotten"

        It didn't even get this bit right. If a function is missing on your platform, your response should be to provide *exactly* the function that is missing and *only* use your implementation on that platform, so that everyone else is allowed to enjoy bug fixes as and when their platform provides them.

    3. Anonymous Coward
      Anonymous Coward

      Re: C is a glorified assembler

      "Personal idiosyncratic use of such features is the problem"

      As long as it can be understood why is it a problem?

      "and speaks of deep problems in the project."

      Oh rubbish. It simply speaks of different coders having different approaches. There's nothing worse than a project thats micromanaged down the tiniest detail of the way the code is written but some anally retentive manager who thinks he can do the job better than the people actually doing it.

      1. Gene Cash Silver badge

        Re: C is a glorified assembler

        > As long as it can be understood why is it a problem?

        Because some other bloke probably *won't* understand it, having never seen it before.

        It took me a bit of time to wrap my head around the "if (0)" trick, and made me remember asshole coders who loved doing things like that just to show off the fact they were a smartass with a keyboard.

      2. Charlie Clark Silver badge

        Re: C is a glorified assembler

        "Personal idiosyncratic use of such features is the problem"

        As long as it can be understood why is it a problem?

        It's in the definition of "idiosyncratic" – similar only to itself.

        For example, I could always redefine my booleans. Would you like to work with my code?

      3. PassiveSmoking

        Re: C is a glorified assembler

        "As long as it can be understood why is it a problem?"

        And this, ladies and gentlemen, is everything wrong with the software writing profession summed up in a single sentence. It's almost beautiful in its elegance.

        YOU might be able to understand it NOW, but what about in 10 years time? And what about the poor sap who inherits your codebase after you've departed your current employment?

        Architects wouldn't brag about how esoteric their blueprints are, in fact an esoteric blueprint is a hallmark of utter failure. Obfsucated source code should be treated with the same contempt.

  9. Dom 3

    C-as-assembler

    My favourite example. Came as example code with a microcontroller, IIRC. Needs to be viewed in a monospace font to truly appreciate it.

    unsigned char i2c_rd(void) // read an 8b streaming

    { unsigned char bit_count = 0 ; // bit counter the 8b streaming

    SDA=1 ; P1M2=0x05 ; // prepare SDA as input (=1)

    while(bit_count<8)

    { eep_buf=eep_buf<<1 ; // shift left 1b eeprom data buffer

    dly_usec(4) ; SCL=1 ; // rise-up SCL

    shift0=SDA ; dly_usec(4) ; // read bit_n from eeprom

    SCL=0 ; dly_usec(2) ; // pulse SCL

    bit_count++ ; dly_usec(2) ; } // increment bit counter(repeat for 8b)

    P1M2=0x0d ; return eep_buf ; // SDA open drain(return data buf)

    1. Charlie Clark Silver badge

      Re: C-as-assembler

      Needs to be viewed in a monospace font to truly appreciate it.

      Wrap in a code (pre for double linespacing) block to do that.

      unsigned char i2c_rd(void) // read an 8b streaming

      { unsigned char bit_count = 0 ; // bit counter the 8b streaming

      SDA=1 ; P1M2=0x05 ; // prepare SDA as input (=1)

      while(bit_count<8)

      { eep_buf=eep_buf<<1 ; // shift left 1b eeprom data buffer

      dly_usec(4) ; SCL=1 ; // rise-up SCL

      shift0=SDA ; dly_usec(4) ; // read bit_n from eeprom

      SCL=0 ; dly_usec(2) ; // pulse SCL

      bit_count++ ; dly_usec(2) ; } // increment bit counter(repeat for 8b)

      P1M2=0x0d ; return eep_buf ; // SDA open drain(return data buf)

  10. SPiT

    My favourite sort faux pas

    Many, many years ago I had to assist a colleague who had a sort problem. This is in the early days of relational databases on a computer with relatively limited capability. Unfortunately I can't remember all the details (it was about 30 years ago) but he essentially retrieved each record from the database searched for individually to avoid sorting and taking into account the database performance and his program's internal algorithms the execution was order N^4. He requested my assistance because his program worked fine when tested on a 10 record example (although he really should have twigged that taking 2 seconds was rather too long) but when tested on real world data (only 120 records) it just "looped". He did go an fix it when I pointed out that based on his algorithms expected completion time was about 12 hours and it was working just fine. I wish I could remember all the details.

  11. Anonymous Coward
    Anonymous Coward

    Mucking about with the only copy

    The project was split amongst two people -- me and the other bloke. I wrote the i/f layer and he wrote the stuff underneath. We first shipped with dummy innards to test the (embadded) system. When the time came to include, I couldn't find his code in the SCM. So he emailed me the code and it wouldn't compile. Turns out that he mucked about with his only copy.

    1. GrumpenKraut
      Thumb Up

      Re: Mucking about with the only copy

      > ...embadded system...

      I like that!

  12. joshimitsu

    I hate Americanization (sic)

    Writing extensions for ERP software - it's American so it has data object fields "CostCenter".

    Most other extensions also use the spelling "CostCenter" except the one I worked on this week... Two days later I found out I was trying to call "CostCenter" when I should have been looking for "CostCentre".

  13. Anonymous Coward
    Anonymous Coward

    Don't be clever in security critical code

    Whether that if (0) construct is a good idea or not is irrelevant. The fact you have to talk to C experts, and get different opinions, about whether it is a good idea means it should never have been used in this code. In the code for something like 'less' or 'ls', sure I'm on board with that, knock yourself out.

    For security critical code you want maximum clarity and you don't use anything that has a chance of confusing other C coders, even if it means more LOC or slightly reduced performance.

  14. Kanhef

    The biggest problem I see with the OpenSSL code is that it leaves you at the mercy of your compiler/optimizer. You have to trust that the optimizer will properly traverse all possible code paths and not strip out the entire if (0) block as unused/unreachable code. It may work fine for whichever compiler and optimization settings the developers used, but there's no guarantee it will work for everyone else.

  15. Number6

    C is a language defined loosely by a set of rules, any number of which may be broken in any statement.

    I remember acquiring ownership of some 8051 C code. We were just about out of memory and even with optimisation on, the compiler wasn't being very smart. As such, I was playing games with the compiler to try to get the smallest code by rewriting bits of it. I left it all in the comments to blow the mind of whoever came next.

    // amazing what generates the smallest code, isn't it.

    // if (cksum != (rx_end[msgid][1] + (rx_end[msgid][0] << 8)))

    // if (((cksum & 0xff) != rx_end[msgid][1]) || ((cksum >> 8) != rx_end[msgid][0]))

    // if (((cksum >> 8) + (cksum << 8)) != (*(int *)rx_end[msgid]))

    // if ((UPPER_BYTE(cksum) + (LOWER_BYTE(cksum) << 8)) != (*(int *)rx_end[msgid]))

    ptr = rx_end[msgid];

    if ((*ptr != UPPER_BYTE(cksum)) || (*(ptr + 1) != LOWER_BYTE(cksum)))

    It's only a few bytes difference but that was enough to help. Looking at the resulting assembly code, it was all about whether the compiler reloaded DPTR (16-bit load on an 8-bit processor) or was smart enough to just increment it as it moved between the low byte and the high byte.

    for completeness, the macros are

    #define UPPER_BYTE(a) ((BYTE) (((WORD)(a)) >> 8))

    #define LOWER_BYTE(a) ((BYTE) ( (WORD)(a)))

    Sometimes the rules go out of the window due to other constraints and the best you can do is write War and Peace in the comments on why the fugly mess is the way it is.

    1. Dom 3

      "GETTING INLINE ASSEMBLY TO WORK":

      http://www.keil.com/support/docs/2308.htm

      1. Number6

        I think that one was done using the IAR compiler.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon