Open source work makes me appreciate software testing. It's not an academic exercise

Perhaps the single biggest aspect of systems building I’ve come to appreciate since shifting my focus from academic pursuits to open source software development is the importance of testing and test automation. In academia, it’s not much of an overstatement to say that we teach students about testing only insofar as we need …

  1. Peter Prof Fox

    Perfection is the enemy

    And trying to combine that with continuous development can't possibly work because they're different mindsets. What's described here is four sub-projects trying to interface with continually changing requirements. No wonder everything is a bit of a stew of "will it still work?" If you're building prototypes then don't be fussing over minutiae of integration. If you're building a finished product then freeze development and get a fully tested/(know where the weaknesses are) system out of the door so the real world can tell you if your tests were realistic.

  2. steelpillow Silver badge

    "The Systems Lens"

    There are two distinct perceptions of systems engineering.

    My current employer is busy rolling out a shedload of digital systems. "We don't need systems engineers because it's all in rented cloud space." They assume that systems engineering is all about racks and cooling and cabling.

    I studied soft systems - systems which involve people doing stuff, both good and bad.

    It is the soft systems engineer who asks, "do we spend 18 months on regressions or move fast and break things?" That is a people-doing-stuff decision, and there is no one right answer.

    Sometimes integration and regressions are important, sometimes it's better to add test tags to your source code and pour their markers into a log file.

  3. Flocke Kroes Silver badge

    Shifting tests to the left saves time

    Some laws of software

    1) Anything not tested should be assumed broken.

    2) Anything not covered by automated tests will be broken real soon now.

    3) Any breakage left lying around will cause a workaround.

    4) Any bug fixed late will cause a workaround to fail.

  4. alain williams Silver badge

    Need to check that failures happen when they should

    It is all too seductive for me, as a developer, to write tests that check that certain inputs yield the expected outputs.

    What is much harder to think about, and thus to write, are tests for the cases where the software should fail (e.g. invalid input, conflicting records, ...). What is wanted is that the system should detect these unwanted situations, complain suitably, and then proceed normally to deal with more input. If this is not done you can get disasters like earlier this year, when UK air traffic control went TITSUP over a bad flight plan.
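
    A minimal sketch of the kind of negative test this calls for, in Python. The process_batch function and the record format are invented for illustration (they are not from the article or the incident); the point is to assert that the bad record is rejected, that the rejection is reported, and that processing carries on with the rest of the input.

    ```python
    # Failure-path test sketch: a bad record must be reported, not fatal,
    # and processing must continue with the remaining input.
    # `process_batch` is a hypothetical stand-in for the system under test.

    def process_batch(records):
        """Process records, collecting complaints for bad ones instead of aborting."""
        processed, errors = [], []
        for rec in records:
            if not rec.get("route"):                      # invalid input
                errors.append(f"rejected {rec.get('id')}: missing route")
                continue
            processed.append(rec["id"])
        return processed, errors


    def test_bad_record_is_rejected_but_processing_continues():
        records = [
            {"id": "A1", "route": "LHR-JFK"},
            {"id": "BAD", "route": ""},                   # the poisoned record
            {"id": "A2", "route": "CDG-AMS"},
        ]
        processed, errors = process_batch(records)
        assert processed == ["A1", "A2"]                  # later input still handled
        assert any("BAD" in e for e in errors)            # failure was complained about


    if __name__ == "__main__":
        test_bad_record_is_rejected_but_processing_continues()
        print("failure-path test passed")
    ```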

    1. martinusher Silver badge

      Re: Need to check that failures happen when they should

      I've done most of my work with embedded systems. I learned to deliberately probe for failure conditions -- just because code doesn't fail doesn't mean it's working. I've been on several projects where the total test and verification time took far longer than the development time for the component I worked on. It's actually quite fascinating to see the kinds of things that a really skilled test group come up with -- it's obviously thought through, but it appears to be malicious bordering on psychopathic. (But I rationalize it by saying to myself "They're there to keep us developers honest".)

      (BTW -- The biggest lies in software engineering are "it ran overnight" or "it ran over the weekend". Without some explicit input from 'night' or 'weekend' this is totally meaningless.)

      1. nintendoeats

        Re: Need to check that failures happen when they should

        I have a very real example of this.

        We needed to reliably generate a callback anytime a 3D camera changed position or orientation (there were of course context and other requirements that complexified this).

        The solution was very simple: after every "step" of the display thread, we would check if the transform for the camera had changed and generate a callback if it had. Overall this took maybe an hour to implement.
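
        A rough sketch of that dirty-check idea, in Python. The class and method names are my own inventions to illustrate the shape of the solution, not the actual code: after each display-thread step, compare the camera's transform against the last one we notified about and fire the callback only when it differs.

        ```python
        # Illustrative sketch only: all names here are invented, not the real API.
        class CameraWatcher:
            def __init__(self, camera, on_change):
                self._camera = camera
                self._on_change = on_change
                self._last_seen = camera.transform()   # snapshot at registration

            def after_step(self):
                """Called by the display thread at the end of every step."""
                current = self._camera.transform()
                if current != self._last_seen:         # only real changes notify
                    self._last_seen = current
                    self._on_change(current)           # last callback reflects final state
        ```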

        The randomized test took a week to write. A few of the complexifying factors:

        1. Most user functions that could change the camera were processed asynchronously.

        2. Some operations which were nominally setting the camera wouldn't actually change the matrix (for example, setting it to the same position twice) and therefore were not expected to generate a callback.

        3. The test couldn't check whether the matrix had actually changed, because requesting information about the camera would force the test thread to synchronize with the display thread, defeating the purpose of the test.

        4. If you wrote to the camera twice in a row, those two operations might get merged. We didn't guarantee that writing to the camera twice would generate two callbacks.

        5. We DID guarantee that the LAST callback would reflect the final state of the camera, so of course that had to be verified in the test.

      2. HandleBaz

        Re: Need to check that failures happen when they should

        There are two types of tester.

        The first is "That would never happen in production"

        The other is "I wonder what happens if I put a zip bomb in this image upload field"

        You need both to ship your product on time, and not have it be a disaster on launch.

  5. Julz

    1970's

    Calling. Testing is important; no shit Sherlock.

    1. nintendoeats

      Re: 1970's

      Unfortunately there are still people who need convincing.

    2. doublelayer Silver badge

      Re: 1970's

      It's less that testing is important and more that testing needs to be done properly. A lot of places do testing. Fewer places do testing right. For example, it's common to require unit tests on code, but it's also common to have code that isn't improved by unit tests because they only test the obvious. You get a function with some pretty basic control flow and a stack of tests which verify that, if there's an if statement that calls another function under a condition, the function does in fact get called when the condition is true and not when it is false. This gives you no useful data, because all that test proves is that you've written the condition the same way two times. It doesn't prove that the condition is the right condition for your situation.

      It doesn't even prevent breaking that later. If someone changes that function to have different control flow, the test will probably break. They will be expecting that, since they just changed the function. However, with unit tests that only test units, it doesn't necessarily tell you that there is a caller somewhere which is counting on the old control flow and is now doing something wrong, because the unit test for that caller just stubs in a result and checks what the caller does with it, which is still correct. To get that information, you need wider tests that exercise the interaction between units. Thus, you can do a lot of testing and still do testing wrong.
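
      A small made-up illustration of the kind of test being criticised: the test simply restates the condition from the code, so all it can ever prove is that the same expression was typed twice, not that the expression was the right one.

      ```python
      # Made-up example of a unit test that merely mirrors the code it tests.
      from unittest.mock import Mock

      def notify_if_overdue(account, send_reminder):
          if account["days_overdue"] > 30:          # is 30 the *right* threshold?
              send_reminder(account["id"])

      def test_reminder_sent_when_overdue():
          send = Mock()
          notify_if_overdue({"id": "x", "days_overdue": 31}, send)
          send.assert_called_once_with("x")         # passes...

      def test_no_reminder_when_not_overdue():
          send = Mock()
          notify_if_overdue({"id": "x", "days_overdue": 29}, send)
          send.assert_not_called()                  # ...and so does this

      # Both tests pass whether the business rule should have been 14, 30 or 60
      # days: they restate the condition rather than check it against anything.

      if __name__ == "__main__":
          test_reminder_sent_when_overdue()
          test_no_reminder_when_not_overdue()
          print("both mirror tests pass")
      ```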

      1. nintendoeats

        Re: 1970's

        You gesture here towards the relationship between design, documentation, and tests.

        Say there is a function `int AppendFile(Filename, Message)` with no other documentation. How do you test this function? There is the obvious "verify the contents of the file just written", but what else? Say the file does not exist, should the function create it? If that is an undocumented situation then the best you can do is try it a few times and decide that whatever it does now is "correct".

        One of the many drawbacks of this is that you don't know what is undocumented behavior (still possibly needs to be tested) and what is undefined behavior (doesn't need to be tested). Arguably, since the user has been told nothing, all of this is both undefined and undocumented... the poor test writer now gets to guess which is which.

        A function with no documentation makes no promises and has no bugs.
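
        To make the dilemma concrete, here is a sketch in Python, with append_file standing in for the hypothetical AppendFile (everything here is invented for illustration). The first test checks the obvious happy path; the second is only a characterisation test, pinning down whatever the current behaviour happens to be for a missing file rather than verifying any documented promise.

        ```python
        # Sketch: testing an underdocumented append function.
        # `append_file` is an invented stand-in for the hypothetical AppendFile.
        import os
        import tempfile

        def append_file(filename, message):
            """Appends message to filename. (Undocumented: creates the file if absent.)"""
            with open(filename, "a") as f:
                f.write(message)
            return 0

        def test_appends_to_existing_file():
            with tempfile.TemporaryDirectory() as d:
                path = os.path.join(d, "log.txt")
                with open(path, "w") as f:
                    f.write("first\n")
                assert append_file(path, "second\n") == 0
                assert open(path).read() == "first\nsecond\n"   # the obvious check

        def test_missing_file_characterisation():
            # Undocumented case: nobody promised this. We pin down the current
            # behaviour (file gets created) so that a change at least goes red.
            with tempfile.TemporaryDirectory() as d:
                path = os.path.join(d, "new.txt")
                append_file(path, "hello")
                assert open(path).read() == "hello"

        if __name__ == "__main__":
            test_appends_to_existing_file()
            test_missing_file_characterisation()
            print("characterisation tests pass")
        ```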

      2. PRR Silver badge

        Re: 1970's

        > It doesn't prove that the condition is the right condition for your situation.

        I sometimes wonder how few programmers get ANY exposure to any other kind of serious testing.

        Electrical testing is a whole group of professions, producing outputs like the UL and NEC manuals for use by industrial and municipal Inspectors. A lot of electrical Practice seems odd until you realize it is not about what Goes Right, but about the many-many-many Things That Go Wrong. The electrons do not care what color the insulation is, but (since 1917) that has been a large part of the electric system.

        Cars are intensively test-tracked, to improve (not just prove) the product. The 1934 Plymouth is a pretty good car because around 1930 Dodge Bros set up test tracks with paved surface, potholes, cobblestones, dirt, and that crappy area behind the railroad switchyard. They beat shock absorbers/dampers to a pulp, changed the design, and pulped some more. They flipped cars: with thin leather helmets, without rollbars (without seatbelts in some tests). "Flipping should not happen in daily driving!" and yet it does (yesterday a Jeep here on Bridge Hill). Most other carmakers followed. Saab rolled cars down ski-jumps. Lexus has robots to torture suspensions. They say BMWs can crash on the Autobahn without killing the owner because of no-fake-jake testing.

        Software testing *seems* to be like: "is every part connected at both ends?" Yeah, we just lost a wheel on the Honda because one of the two (2) connections didn't survive 200k+ miles AND turned out to be un-inspectable (no fault found per shop manual procedure). I'm sure it was flawless in the pristine test-lab with pristine test-technicians; not so in real life on real roads with real drivers.

  6. nintendoeats

    I find the focus on code coverage to be spurious.

    My former boss once mentioned that she had run a tool to check our codepath coverage and it was getting close to 100%.

    I observed that one of the bugs that had been reported recently was `if the user makes a box editable, all subsequently created shapes will default to the colour blue`.

    "Do any of our tests actually check that the default color is unchanged?"

    We concluded that they did not.

    Code coverage is useful as a tool for making sure you don't crash. It's not at all helpful for making sure that your code does what you expect; your `verification coverage` so to speak.
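
    As an illustration of that missing verification coverage, a test for the bug above might look something like this; the Scene/Box API is entirely invented for the sketch. The code paths involved were almost certainly already executed by other tests -- what nobody had asserted was the property that actually mattered.

    ```python
    # Illustrative only: the Scene/Box API here is invented for the sketch.
    DEFAULT_COLOUR = "black"

    class Box:
        editable = False

    class Scene:
        def __init__(self):
            self.default_colour = DEFAULT_COLOUR

        def make_box_editable(self, box):
            box.editable = True
            # A buggy version might also have done: self.default_colour = "blue"

        def create_shape(self):
            return {"colour": self.default_colour}

    def test_default_colour_survives_making_a_box_editable():
        scene, box = Scene(), Box()
        scene.make_box_editable(box)
        shape = scene.create_shape()
        assert shape["colour"] == DEFAULT_COLOUR   # the assertion nobody had written

    if __name__ == "__main__":
        test_default_colour_survives_making_a_box_editable()
        print("default colour unchanged")
    ```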

    1. HandleBaz

      Code coverage is basically as useless a measurement of quality as SLOC is of developer productivity.

      In both cases, if it is 0, you have big problems, but other than that it doesn't tell you much.

    2. SCP

      > Code coverage is useful as a tool for making sure you don't crash. It's not at all helpful for making sure that your code does what you expect; your `verification coverage` so to speak.

      Generally, testing should be requirements-based - that is, you should be testing that the requirements are met. If, when you run the tests, you monitor the code coverage, you should get an indication of how complete your testing is. For example, if you believe you have fully tested all the requirements, you would expect to have achieved 100% code coverage (to the 100% MC/DC level if you are being particularly thorough). If you have not achieved 100% coverage, it would imply that you have something in the code that is not related to its requirements, or that you have missed a requirement. It is important to establish what the cause of the discrepancy is and correct it - you should not simply create a test to tick the coverage box.
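
      For anyone unfamiliar with MC/DC (modified condition/decision coverage): every condition in a decision has to be shown to independently affect the decision's outcome. A toy illustration in Python, with an invented two-condition rule, needs only three test cases rather than all four input combinations.

      ```python
      # Toy MC/DC illustration for an invented two-condition decision.
      def raise_alarm(pressure_high, override_off):
          return pressure_high and override_off

      # For "a and b", MC/DC needs three cases: a baseline, plus one case per
      # condition where flipping only that condition flips the outcome.
      MCDC_CASES = [
          (True,  True,  True),    # baseline: decision is True
          (False, True,  False),   # only pressure_high changed -> outcome flips
          (True,  False, False),   # only override_off changed  -> outcome flips
      ]

      def test_mcdc_for_raise_alarm():
          for a, b, expected in MCDC_CASES:
              assert raise_alarm(a, b) == expected

      if __name__ == "__main__":
          test_mcdc_for_raise_alarm()
          print("MC/DC achieved with 3 of the 4 possible input combinations")
      ```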

      Not achieving 100% coverage is often an indicator of a deficiency; but achieving 100% coverage is not necessarily an assurance that you have covered all the requirements thoroughly.

      Then again, testing is not necessarily an effective way of discovering errors - but as part of an overall Verification and Validation Plan it can be a useful tool.

      1. nintendoeats

        Last time I was asked to write a test for something, I found 4 "cannot ship, completely broken" bugs in 2 hours (plus an edge-case error for good measure).

        Testing is a very effective way of discovering errors...if the person writing the tests is ready to be properly psychotic.

  7. Fishbowler

    Coverage

    Spot on.

    I can write a test that exercises all of the code, then asserts that 1=1. I've proven that all of the code runs, but asserted no behaviour of the code whatsoever. Code coverage is, at best, a fallible proxy for good tests. Low code coverage definitely implies risk. High test coverage _might_ mean lower risk, but you should check before gaining any confidence from it.
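
    A deliberately silly sketch of that point: this "test" drives the coverage number up for the (invented) function it calls while asserting nothing whatsoever about its behaviour.

    ```python
    # Deliberately bad test: runs the code (coverage goes up) but asserts
    # nothing about what the code did. `reconcile` is an invented example.
    def reconcile(ledger):
        total = 0
        for entry in ledger:
            if entry["type"] == "credit":
                total += entry["amount"]
            else:
                total -= entry["amount"]
        return total

    def test_reconcile_for_coverage_only():
        reconcile([{"type": "credit", "amount": 10},
                   {"type": "debit", "amount": 3}])   # both branches executed
        assert 1 == 1                                 # 100% coverage, 0% verification

    if __name__ == "__main__":
        test_reconcile_for_coverage_only()
        print("coverage achieved, nothing verified")
    ```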

  8. jonorolo

    Who is the test for?

    When you're giving a talk it's important to know who your audience is and what they care about. When you're writing a test it's the same. A test is a message between developers, literally a coded message. So it's not just about the software; it's also about the people, their relationships, and what they want.

    Does your recipient just want basic trust before they invest their time? Then you need a smoke test.

    Does your recipient have a limited brief to change a library but not its API? Then you need a unit test.

    Are you trying to show a customer that their pet peeve you fixed last week is still fixed? Then you need a regression test.

    Does your recipient just want to see that two teams actually talk to each other? Then you need an integration test.

    Like giving a talk, it clarifies a test to think about the audience, their needs and capabilities, and what you want them to take away.
