Should you test trivial code?

Developers have wasted way too much time debating whether they should test trivial code. And that debate doesn't really seem closer to an end today than it was yesterday—or last year.

Betteridge's law of headlines states that every headline that consists of a question can be answered with "no." It doesn't look as if today is the day I'm proving that law wrong. My take is, no, you shouldn't test trivial code.

But that doesn't mean anything without adding some context. And the issue can evoke several other questions:

  • What kind of testing am I talking about here?
  • What does "trivial code" even mean?
  • How did this debate come to be?

To put an end to the debate once and for all, here are the answers to these questions.

What a test is, and is not

First of all, I should define what a "test" is. It should come as no surprise that I'm talking about automated testing, particularly the unit type. To understand what trivial code is supposed to mean and why this debate even exists, let's talk about test-driven development (TDD).

TDD is a software development methodology in which unit tests drive the design of the application. Robert C. "Uncle Bob" Martin defines the process of applying the TDD methodology with his famous three rules:

  • Write no production code unless you do so to make a failing unit test pass.
  • Don't write any more of a unit test than is sufficient for it to fail—and compilation failures are failures.
  • Do not write any more production code than is sufficient to pass the one failing unit test.

Kent Beck developed (or "rediscovered," as he likes to put it) TDD way back in the '90s. Controversial at the time of its introduction, the methodology has grown more accepted in the industry since then. But one debate that has never quite settled down is the old question, "How much testing is enough?"

In other words, what really needs to be tested?

Martin, despite being one of the fiercest champions of unit testing and TDD, contributed to the debate with his blog entry "The pragmatics of TDD." Martin doesn't test-drive everything. Here's his list of scenarios where he doesn't use TDD:

  • Getters and setters
  • Member variables
  • Functions that are obviously trivial
  • GUIs
  • Code written by trial and error
  • Code written by third parties

The list might seem reasonable enough, but it has generated a fair amount of controversy.

Dogma strikes back

One reaction came in the form of a blog post by Mark Seemann, concisely titled "Test Trivial Code." He thinks rules about not testing GUIs and third-party libraries/frameworks are reasonable, but disagrees about the remaining items, which he groups under the "don't test trivial code" moniker.

He then addresses why he thinks Martin's reasoning is wrong. Starting with what he calls the "causality" argument, Seemann says that since the TDD rules dictate that you should start implementation of a feature by writing a failing test, you can't, by definition, know in advance whether a certain feature will turn out to be trivial.

He then states that Martin's example about getters and setters is strange. Seemann explains that the whole reason to use properties is to preserve encapsulation. By using a getter, you can later change the way the value is retrieved or computed without the client even being aware of such a change. But to be able to make the change confidently (in other words, to refactor), the developer would need tests in place for the property.
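To make the encapsulation argument concrete, here is a minimal Python sketch. The Order class and its total property are hypothetical names I made up for illustration; they don't come from Seemann's post. The getter is trivial today, and the test pins its observable behavior so that a later change to how the value is computed can be made with some confidence.

```python
class Order:
    """Hypothetical example class, used only for illustration."""

    def __init__(self, unit_price: float, quantity: int) -> None:
        # Version 1 computes the total once and stores it in a field.
        self._total = unit_price * quantity

    @property
    def total(self) -> float:
        # Trivial today. It could later be computed lazily, cached, or
        # discounted without any caller noticing the change.
        return self._total


def test_total_is_price_times_quantity() -> None:
    # Pins the observable behavior of the property, not its implementation.
    order = Order(unit_price=2.5, quantity=4)
    assert order.total == 10.0


test_total_is_price_times_quantity()
```

Whether a test like this earns its keep while the getter is still a one-liner is exactly what the rest of the debate is about.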

Finally, he gets to the learning argument. According to Seemann, giving learners of TDD a "way out" of the discipline is bad, since they would use that as an excuse whenever things get more challenging. And if you were to give TDD practitioners a way out, it should be via a measurable and objective rule instead of a "fluffy" one:

"A fluffy condition that 'you may be able to predict that the implementation will be trivial' isn't at all measurable. It's exactly this way of thinking that TDD attempts to address: you may think that you already know how the implementation is going to look, but letting tests drive the implementation, it often turns out that you'll be surprised. What you originally thought would work doesn't."

The return of pragmatism

Seemann's post generated strong reactions. Some were polite and well-written; others not so much. One of them was a post by Mark Rendle. Here's its TL;DR version:

  • Don’t test other people’s code.
  • Don’t test implementation details.
  • Don’t pad your test suite with pointless tests.
  • Don’t have code which is only ever called by tests.
  • TDD is awesome; don’t make it look bad.

I agree. Seemann does indeed argue that you should not only test all automatic properties, but also test-drive them. Rendle states that this amounts to testing the compiler, which is something you shouldn't do.

I also agree with the general sentiment of teaching beginners to apply critical thinking and judgment, since not doing so would be to engage in cargo cult programming. However, Rendle's answer still falls short, since it fails to address a point that is, in my view, Seemann's most important one.

Trivial code won't (necessarily) remain trivial

One of the main points of Seemann's post is that code that starts out as trivial won't necessarily remain so. Consider the following scenario:

  • The developer writes new code.
  • Since the code is trivial, the developer doesn't write tests for it.
  • Months (years?) later, business requirements change, and the code acquires complexity.
  • Now the code is nontrivial and untested.

In this situation, how can you be sure the changes made didn't cause any harm? One possible answer is that some tests will fail, somewhere. While the developer didn't write any tests for the (then) trivial code, it must have been exercised indirectly by other tests. Otherwise, it would be a completely redundant piece of code.

While you shouldn't immediately and completely dismiss the value of indirect testing, it might be too much wishful thinking. Now you're pinning all of your hopes on these other tests that may be incomplete, or even flat-out wrong.

What to do instead?

Test trivial code only when you need to

At the end of the day, the solution boils down to two things:

  • Have mechanisms in place to detect when trivial code becomes nontrivial (and write the tests then).
  • Have mechanisms in place to evaluate the quality of the test suite (and do it often).

What would those "mechanisms" be?

For the first one, you could use a mixture of manual and automated code review. Nowadays, there's a plethora of tools that perform static analyses on codebases. You could use a tool to assess the cyclomatic complexity of your code. What about setting up a system that gives you alerts—or even fails the build—when the complexity goes beyond a certain defined threshold?
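As a rough illustration of such a check, here is a sketch that uses only Python's standard ast module. The function names, the sample code, and the threshold of 3 are all assumptions of mine, and the score is a simplified approximation of cyclomatic complexity (one plus the number of decision points), not what any particular analysis tool reports.

```python
import ast

# Node types that add a decision point. This is a rough approximation of
# cyclomatic complexity, not a replacement for a dedicated analysis tool.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Return 1 plus the number of decision points found inside func."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def functions_over_threshold(source: str, threshold: int) -> dict[str, int]:
    """Map the name of each too-complex function to its score."""
    offenders = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = cyclomatic_complexity(node)
            if score > threshold:
                offenders[node.name] = score
    return offenders


# Hypothetical code under review: one trivial function, one that has grown.
SAMPLE = '''
def get_name(user):
    return user.name

def shipping_cost(order):
    if order.express and order.weight > 10:
        return 30
    elif order.express:
        return 20
    for item in order.items:
        if item.fragile:
            return 15
    return 5
'''

MAX_COMPLEXITY = 3  # assumed threshold; pick one that suits your codebase
offenders = functions_over_threshold(SAMPLE, MAX_COMPLEXITY)
print(offenders)  # {'shipping_cost': 6}
```

In a build script, a non-empty result could be turned into a non-zero exit code, which is one way to fail the build the moment formerly trivial code crosses the line.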

The other part of this approach relies on human intelligence. Have all code that goes to production written in pairs, or make sure it's code-reviewed. Or maybe both. This isn't 100% guaranteed (spoiler alert: nothing really is), but it will definitely increase the probability that you write the tests exactly when you need them.

And what about the second mechanism: how do you evaluate the quality of the tests?

The answer is mutation testing. Mutation testing frameworks work by deliberately introducing mistakes (called mutations) into the source code and then running the tests.

If one or more tests fail, then the given mutation died (which is a good thing). If all tests pass, then the mutation survived. And if the ratio of surviving mutations is too high, that's a sign that you don't have enough unit tests or that the unit tests that you do have aren't very good at all.
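Here is a toy sketch of the idea, again using only Python's standard ast module. Real mutation testing frameworks apply many mutation operators and manage test runs for you; this example applies a single, hand-picked mutation (turning >= into >) to a hypothetical is_adult function and checks it against two made-up test suites.

```python
import ast

# Hypothetical code under test.
SOURCE = '''
def is_adult(age):
    return age >= 18
'''


class FlipComparison(ast.NodeTransformer):
    """Mutation operator: turn every `>=` into `>`."""

    def visit_Compare(self, node: ast.Compare) -> ast.Compare:
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op
                    for op in node.ops]
        return node


def load(tree: ast.Module) -> dict:
    """Compile a module AST and return the namespace it defines."""
    namespace: dict = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace


def weak_suite(ns: dict) -> bool:
    # Only checks values far away from the boundary.
    return ns["is_adult"](30) is True and ns["is_adult"](5) is False


def strong_suite(ns: dict) -> bool:
    # Also checks the boundary value itself.
    return weak_suite(ns) and ns["is_adult"](18) is True


def mutant_survives(suite) -> bool:
    """Return True if the suite still passes against the mutated code."""
    mutant = FlipComparison().visit(ast.parse(SOURCE))
    ast.fix_missing_locations(mutant)
    return suite(load(mutant))


# Both suites pass against the original, unmutated code.
original = load(ast.parse(SOURCE))
assert weak_suite(original) and strong_suite(original)

print("weak suite:", "survived" if mutant_survives(weak_suite) else "killed")
print("strong suite:", "survived" if mutant_survives(strong_suite) else "killed")
```

The weak suite never probes the boundary, so the mutation survives it; the strong suite checks the value 18 and kills it. A surviving mutation points at exactly the kind of gap that indirect tests tend to leave around code that used to be trivial.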
