Artificial Code Coverage

Abstract

In this paper I put forth the proposition that, contrary to popular belief, 100% code coverage can be a very advantageous thing to have, and I discuss a couple of techniques for achieving it without excessive effort.

(Useful pre-reading: About these papers)

The problem

Conventional wisdom says that 100% code coverage is unnecessary, or even undesirable, because achieving it requires an exceedingly large amount of effort for the sole self-serving purpose of achieving coverage instead of asserting correctness.

Let me tell you why this is wrong, and why 100% code coverage can indeed be a very good thing to have.

If you don't have 100% code coverage, then by definition you have some lower percentage, like 87.2% or 94.5%. The remaining 12.8% or 5.5% is uncovered. I call this the worrisome percentage.

As you keep working on your code base, the worrisome percentage fluctuates:

  • one day you might add a test for some code that was previously uncovered, so the worrisome percentage decreases;
  • another day you may add some code with no tests, so the percentage increases;
  • yet another day you may add some more code along with tests, so even though the number of uncovered lines has not changed, it now represents a smaller percentage;

... and it goes on like that.

If the worrisome percentage is high, then you know for sure that you are doing a bad job, but if it is low, it does not necessarily mean that you are doing a good job, because some very important functionality may be left uncovered, and you just do not know. To make matters worse, modern programming languages offer constructs that achieve great terseness of code, meaning that a few uncovered lines may represent a considerable amount of uncovered functionality.

So, each time you look at the worrisome percentage, you have to wonder what is in there: are all the important lines covered? are all the uncovered lines okay to be left uncovered?

In order to answer this question, you have to go over every single line of code in the worrisome percentage and examine it to determine whether it is acceptable to leave it uncovered. What you find is, more often than not, the usual suspects:

  • Some ToString() function which is only used for diagnostics;
  • Some Equals() and HashCode() functions of some value type which does not currently happen to be used as a key in a hash-map;
  • Some switch statement with a default clause which can never be reached, and which, if it were ever reached, would throw;

... etc.

So, your curiosity is satisfied, your worries are allayed, and you go back to your usual, day-to-day software development tasks.

Some time later, the worrisome percentage has changed again, prompting the same question: what is being left uncovered now?

Each time you need to have this question answered, you have to re-examine every single line of code in the worrisome percentage. As you do this, you discover that in the vast majority of cases, the lines that you are examining now are the exact same lines that you were examining the previous time you were going through this exercise.

After a while, this starts getting tedious. Eventually, you quit looking. Sooner or later, everyone in the shop quits looking.

The worrisome percentage has now become terra incognita: literally anything could be in there; nobody knows, and nobody wants to know, because finding out is such a dreary chore.

That is not a particularly nice situation to be in.

The solution

So, here is a radical proposition: If you always keep your code coverage at 100%, then the worrisome percentage is always zero, so there is nothing to worry about! How much is it worth for you to be able to sleep well at night?

If the worrisome percentage is normally zero, then each time it rises above zero it represents a definite change in the situation: you used to have everything covered, and now you have something uncovered. This signifies a call to action: the code that is now being left uncovered needs to be examined, and dealt with.

Contrast this with the situation where the worrisome percentage is never zero: no matter how it fluctuates, it never represents an appreciable change in the situation: it always goes from some non-zero number to some other non-zero number, meaning that we used to have some code uncovered, and we still have some code uncovered. There is no actionable item.

By dealing with uncovered code as soon as it gets introduced, you bring the worrisome percentage back to zero, thus achieving the following:

  • You ensure that each time you have to examine uncovered code, the amount of code is small, and it is necessarily related to changes you just made.
  • You never find yourself in the unpleasant situation of re-examining code that has been examined before; so, the examination does not feel like a dreary chore.
  • You ensure that next time the worrisome percentage becomes non-zero, it will represent a new call to action.

The conventional understanding of how to deal with uncovered code is to write tests for it, and that is why achieving 100% code coverage is regarded as onerous; however, there exist alternatives that are much easier. For any given piece of uncovered code, besides the option of Proper Testing, we have two more options: Selective Exclusion, and Artificial Coverage. Let us examine them in detail.

Proper Testing

This means turning uncovered code into covered code by writing actual tests for it. (Duh!)

This is of course the highest quality option, but:

  • It is not always possible. (You can only write a test for code that is in fact testable.)
  • It does not always represent the best value for money. (You only want to be writing tests for code that is important enough to warrant testing.)

If you decide to write a test, you can still minimize the effort of doing so, by utilizing certain techniques that I talk about in other posts, such as Approval Testing, Testing with Fakes instead of Mocks, and Incremental Integration Testing.

Selective Exclusion

This means instructing the code coverage analyzer to exclude the uncovered code from the total code coverage percentage calculation.

Code that is not testable, or not important enough to warrant testing, can be moved into a separate module which does not participate in coverage analysis. Alternatively, if the code coverage analysis tool supports it, you may be able to exclude individual methods or entire classes without having to move them to another module.

  • In the DotNet world, this can be accomplished by marking a class or method with the ExcludeFromCodeCoverage attribute, found in the System.Diagnostics.CodeAnalysis namespace.

  • In the Java world, IntelliJ IDEA offers a setting for specifying what annotation we want to use for marking methods to be excluded from code coverage, so we can use any annotation we like. (See IntelliJ IDEA can now exclude methods from code coverage.)

  • Various code coverage analyzers support additional ways of selectively excluding code from coverage.
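To make this concrete, here is a minimal sketch in Java. The annotation name ExcludedFromCoverage is my own invention, since (unlike DotNet) Java has no standard attribute for this; the coverage tool must be configured to recognize whatever name we choose:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// A home-grown marker annotation; the name is arbitrary, what matters is
// that the coverage tool is configured to treat it as an exclusion marker.
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.METHOD, ElementType.TYPE})
@interface ExcludedFromCoverage {}

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    // Only used for diagnostics; excluded from coverage rather than tested.
    @ExcludedFromCoverage
    @Override public String toString() { return "Point(" + x + ", " + y + ")"; }
}
```

With this in place, the toString() method no longer counts against the total coverage percentage, and it remains untested, which is exactly what we want for diagnostics-only code.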

Some might object that Selective Exclusion constitutes cheating, because it increases our code coverage percentage while in fact a lot of code goes uncovered. This objection is based on a misunderstanding of the goal of achieving 100% code coverage. It falsely assumes that the purpose of 100% code coverage is bragging rights. Nothing could be further from the truth. As previously explained, the purpose of 100% code coverage is:

  • To ensure that each time we have to examine uncovered code, the amount of code is small, and it is necessarily related to changes that we just made.
  • To never find ourselves in the unpleasant situation of re-examining uncovered code that has been examined before.
  • To ensure that each time the code coverage percentage drops below 100% it represents a call to action: look at the uncovered code, do something about it.

Artificial Coverage

This means invoking the code from a test, not in order to test it, but in order to prevent it from negatively affecting the total code coverage percentage.

With the previous two options we should be able to bring the worrisome percentage down to a very small number, like 1 percent. What remains is code such as the unreachable default clause of some switch statement, which should really be excluded from coverage but cannot be, due to limitations in the available tooling:

  • Programming languages tend to allow attaching attributes or annotations to classes and methods, but not to individual lines of code.
  • Consequently, code coverage analyzers cannot be instructed to exclude individual lines from coverage.
  • You can try moving the problematic line into a separate function, and excluding the function, but you cannot exclude the call to that function, so the problem remains.

The solution in this case is to deliberately cause the uncovered line to be invoked during testing, not in order to test it, but simply in order to prevent it from negatively affecting the total code coverage percentage.

This might sound like cheating, but it is not, because the objective was not to test that line in the first place. Think about it this way: if the tooling supported excluding that individual line from coverage, you would have done so. In doing so, the line would have been prevented from negatively affecting the total code coverage percentage, and it would have gone untested. Since the tooling does not support that, the next best thing we can do is to artificially cause that line to be included in coverage. The result is the same: the line is prevented from negatively affecting the total code coverage percentage, and the line goes untested.
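A minimal sketch of the technique, using a made-up name() function with an unreachable default clause: the test invokes it with an invalid value and swallows the resulting exception, purely so that the default clause counts as covered.

```java
class Direction {
    // Accepts 0..3; any other value is a programming error.
    static String name(int direction) {
        switch (direction) {
            case 0: return "north";
            case 1: return "east";
            case 2: return "south";
            case 3: return "west";
            default: throw new IllegalArgumentException("bad direction: " + direction);
        }
    }
}

class DirectionTest {
    // Artificial coverage: invoke the default clause with an invalid value,
    // not to test it, but so that it no longer shows up as uncovered.
    static void artificiallyCoverDefaultClause() {
        try {
            Direction.name(-1);
        } catch (IllegalArgumentException ignored) {
            // expected; deliberately swallowed
        }
    }
}
```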

Scenarios

Here is a (hopefully exhaustive) list of all the different scenarios under which code might be left uncovered, and what to do in each case:

  1. The code should really be covered, but you forgot to write tests, or you are planning to write tests in the future.

    Go with Proper Testing. Write tests for that code. Not in the future, but now.

  2. The code is not used and there is no plan to use it.

    This is presumably code which exists for historical reasons, or for reference, or because it took some effort to write it and you do not want to admit that it was in vain by throwing away the code.

    Go with Selective Exclusion.

  3. The code is only used for diagnostics.

    The prime example of this is ToString() methods whose sole purpose of existence is to give informative descriptions of our objects while debugging.

    Go with Selective Exclusion.

  4. The code is not normally reachable, but it is there in case something unexpected happens.

    The prime example of this is switch statements that cover all possible cases and yet also contain a default clause just in case an unexpected value somehow manages to creep in.

    Go with Artificial Coverage.

    This may require a bit of refactoring to make it easier to cause the problematic switch statement to be invoked with an invalid value. The code most likely throws, so catch the exception and swallow it. You can also assert that the expected exception was thrown, in which case it becomes more like Proper Testing.
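A sketch of the assertion flavor, with a hypothetical label() function: the test drives the default clause with an invalid value and asserts that it throws as designed, which edges it toward Proper Testing.

```java
class Severity {
    static String label(int level) {
        switch (level) {
            case 1: return "info";
            case 2: return "warning";
            case 3: return "error";
            default: throw new IllegalStateException("unexpected level: " + level);
        }
    }

    // Covers the default clause and also asserts that it throws as designed.
    static boolean defaultClauseThrows() {
        try {
            Severity.label(99);
            return false; // should be unreachable
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```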

  5. The code is reachable but not currently being reached.

    This is code which is necessary for completeness, and it just so happens that it is not currently being used, but nothing prevents it from being used in the future. A prime example of this is the Equals() and HashCode() functions of value types: without those functions, a value type is incomplete; however, if the value type does not currently happen to be used as a key in a hash-map, then those functions are almost certainly unused.

    In this case, you can go with any of the three options:

    • Proper Testing.
    • Selective Exclusion.
    • Artificial Coverage.
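If Proper Testing is chosen here, the cost is typically small; a sketch for a hypothetical Size value type:

```java
import java.util.Objects;

class Size {
    final int width, height;
    Size(int width, int height) { this.width = width; this.height = height; }

    @Override public boolean equals(Object other) {
        if (!(other instanceof Size)) return false;
        Size s = (Size) other;
        return width == s.width && height == s.height;
    }

    @Override public int hashCode() { return Objects.hash(width, height); }
}
```

A test need only assert that equal values compare equal, unequal values do not, and equal values hash identically.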

  6. The code is not important enough to have a test for it.

    Say you have a function which takes a tree data structure and converts it to text using box-drawing characters to print it nicely as a tree on the console. The function is certainly testable, but is it really worth testing? If it ever draws something wrongly, you will probably notice, and if you do not notice, then maybe it did not matter anyway.

    In this case you can go with either of the following:

    • Selective Exclusion.
    • Artificial Coverage.
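For example, with Artificial Coverage, a test simply invokes the (hypothetical) rendering function and discards the output, so that the function stops dragging down the coverage percentage:

```java
class TreePrinter {
    // Renders a node and its children using box-drawing characters.
    static String render(String label, String... children) {
        StringBuilder sb = new StringBuilder(label).append('\n');
        for (int i = 0; i < children.length; i++) {
            boolean last = (i == children.length - 1);
            sb.append(last ? "└── " : "├── ").append(children[i]).append('\n');
        }
        return sb.toString();
    }

    // Artificial coverage: invoke the function, ignore the output.
    static void artificiallyCover() {
        render("root", "a", "b");
    }
}
```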

  7. The code is practically or literally untestable.

    For example:

    • If your application has a Graphical User Interface (GUI), you can write automated tests for all of your application logic, but the only practical way to ascertain the correctness of the GUI is to have human eyes staring at the screen. (There exist tools for testing GUIs, but I assess them as woefully impractical and acutely ineffective.)
    • If your application controls some hardware, you may have a hardware abstraction layer with two implementations, one which emulates the hardware, and one which interacts with the actual hardware. The emulator will enable you to test all of your application logic without having the actual hardware in place; however, the implementation which interacts with the actual hardware is practically untestable by software alone.
    • If you have a piece of code that queries the endianness of the hardware architecture and operates slightly differently depending on it, the only path you can truly cover is the one for the endianness of the hardware architecture you are actually using. (You can fake the endianness query, and pretend that your hardware has the opposite endianness, but then you end up with machine words that are unusable on your own architecture, so your tests cannot continue.)

    In all of the above cases, and in all similar cases, we have one option:

    • Selective Exclusion.

Conclusion

A code coverage percentage of 100% is very useful, not for bragging, but for easily maintaining certainty that everything that ought to be tested is in fact being tested.

Achieving a code coverage percentage of 100% does require some effort, but with techniques such as Selective Exclusion and Artificial Coverage the effort can be reduced to manageable levels.

Ideally, Artificial Code Coverage should not be necessary, but it is a practical workaround for the inability of coverage tools to selectively exclude individual lines of code from code coverage analysis.

Appendix: Future tooling improvements

Fine-grained exclusion

Programming languages should offer the ability to attach attributes or annotations to individual lines of code. That would give us the ability to instruct code coverage analysis tools to exclude individual lines from coverage, which would make artificial code coverage unnecessary.

Expectation management

When the programmer specifies that a particular piece of code should be excluded from coverage, what the programmer is actually doing is stating their expectation that the code will not be invoked during testing. Unfortunately, code coverage analysis tools simply ignore that code. This is wrong. The tools should instead be verifying the programmer's expectation: If the code was indeed not invoked during testing, then everything is fine. But if the code was in fact invoked, despite the programmer's stated expectation that it will not be invoked, then the tooling should issue an error.
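Until tooling catches up, this behavior can be crudely approximated in user code: have the supposedly unreachable code record that it ran, and have a test fail if it did. A rough sketch, with all names being my own invention:

```java
class CoverageExpectations {
    // Set by code that the programmer expects never to run during testing.
    static boolean unexpectedCodeRan = false;

    static void noteUnexpectedInvocation() {
        unexpectedCodeRan = true;
    }
}

class Parser {
    static int digit(char c) {
        if (c >= '0' && c <= '9')
            return c - '0';
        // Expected to be unreachable; if it runs under test, a check fails.
        CoverageExpectations.noteUnexpectedInvocation();
        throw new IllegalArgumentException("not a digit: " + c);
    }
}
```

After the test suite runs, a final check asserts that the flag is still false, thereby verifying the programmer's stated expectation instead of silently ignoring it.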


Cover image by Patrick Robert Doyle from Unsplash