2024-04-01

Audit Testing

Abstract

An automated software testing technique is presented which spares us from having to stipulate our expectations in test code, and from having to go fixing test code each time our expectations change.

(Useful pre-reading: About these papers)

The Problem

The most common scenario in automated software testing is ensuring that given specific input, a component-under-test produces expected output. The conventional way of achieving this is by feeding the component-under-test with a set of predetermined parameters, obtaining the output of the component-under-test, comparing the output against an instance of known-good output which has been hard-coded within the test, and failing the test if the two are not equal.
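
By way of illustration, here is what this conventional arrangement might look like as a minimal JUnit 5 sketch in Java; the ReportGenerator component and its output are made up for the sake of the example, and are not part of any real library:

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import org.junit.jupiter.api.Test;

    class ReportGeneratorTest
    {
        // A stand-in component-under-test, here only for the sake of the example.
        static class ReportGenerator
        {
            String generate( String customer, int year ) { return "Customer: " + customer + "\nYear: " + year + "\n"; }
        }

        @Test
        void producesExpectedReport()
        {
            // Feed the component-under-test with predetermined parameters.
            ReportGenerator generator = new ReportGenerator();
            String actualOutput = generator.generate( "customer-42", 2024 );

            // The known-good output is hard-coded within the test; any intentional
            // change to the component's output breaks this assertion.
            assertEquals( "Customer: customer-42\nYear: 2024\n", actualOutput );
        }
    }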

This approach works, but it is inefficient, because during the development and evolution of a software system we often make changes to the production code fully anticipating the output of certain components to change. Unfortunately, each time we do this, the tests fail, because they are still expecting the old output. So, each change in the production code must be followed by a round of fixing tests to make them pass. This imposes a considerable burden on the software development process.

Note that under Test-Driven Development things are not any better: first we modify the tests to start expecting the new output, then we observe them fail, then we modify the components to produce the new output, then we watch the tests pass. We still have to stipulate our expectations in test code, and we still have to change test code each time our expectations change, which is inefficient.

Audit Testing is a technique for automated software testing which aims to correct this.

The Solution

Under Audit Testing, the assertions that verify the correctness of the output of the component-under-test are abolished, and replaced with code that simply saves the output to a text file. This file is known as the Audit File.

The test may still fail if the component-under-test encounters an error while producing output, in which case we follow the conventional test-fix-repeat workflow. If, however, the component-under-test does manage to produce output, then the output is saved in the Audit File and the test completes successfully, without the output being examined.
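
A minimal sketch of what such an audit test might look like in Java, again using a hypothetical ReportGenerator as the component-under-test; the location of the Audit File is an assumption based on a conventional Maven/Gradle source layout, and should be adjusted to the project structure at hand:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import org.junit.jupiter.api.Test;

    class ReportGeneratorAuditTest
    {
        // A stand-in component-under-test, here only for the sake of the example.
        static class ReportGenerator
        {
            String generate( String customer, int year ) { return "Customer: " + customer + "\nYear: " + year + "\n"; }
        }

        @Test
        void auditReport() throws IOException
        {
            // Exercise the component-under-test with predetermined parameters.
            String output = new ReportGenerator().generate( "customer-42", 2024 );

            // Instead of comparing against hard-coded expected output, save the output
            // to the Audit File, next to the source file of this test.
            Path auditFile = Path.of( "src/test/java/ReportGeneratorAuditTest.audit.txt" );
            Files.writeString( auditFile, output );

            // No assertion on the content: if the component produced output without
            // throwing, the test passes; Version Control reveals any change in the output.
        }
    }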

The trick is that the Audit File is saved right next to the source code file of the test, which means that it is kept under Version Control. In the most common case, each test run produces the exact same audit output as the previous run, so nothing changes. If a test run produces different audit output from a previous test run, the tooling alerts the developer, and the Version Control System indicates that the Audit File has been modified and is in need of committing. Thus, the developer cannot fail to notice that the audit output has changed.

The developer can then utilize the "Compare with unmodified" feature of the Version Control System to see the differences between the audit output that was produced by the modified code, and the audit output of the last known-good test run. By visually inspecting these differences, the developer can decide whether they are as expected or not, according to the changes they made in the code.

  • If the observed differences are not as expected, then the developer needs to keep working on their code until they are.
  • If the observed differences are as expected, then the developer can simply commit the new code, along with the new Audit File, and they are done.

This way, the output that is expected from the component-under-test does not have to be hard-coded into tests, the tests do not have to fail each time the output changes, and the developer does not have to go fixing tests each time they make a change that results in different output.

This eliminates a considerable burden from the software development process.

Note that the arrangement is also convenient for the reviewer, who can see both the changes in the code and the resulting changes in the Audit Files.

As an added safety measure, the continuous build pipeline can deliberately fail the tests if an unclean working copy is detected after running the tests, because that would mean that the tests produced different results from what was expected.
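
One conceivable way of implementing such a check, assuming the build pipeline is allowed to shell out to git after the test phase; the class name is illustrative:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    // Fails the build if the working copy is unclean after the tests have run,
    // which would mean that the tests produced different results from what was expected.
    public final class WorkingCopyCheck
    {
        public static void main( String[] args ) throws IOException, InterruptedException
        {
            Process process = new ProcessBuilder( "git", "status", "--porcelain" ).start();
            String changes = new String( process.getInputStream().readAllBytes(), StandardCharsets.UTF_8 );
            process.waitFor();
            if( !changes.isBlank() )
            {
                System.err.println( "Working copy is unclean after running the tests:" );
                System.err.println( changes );
                System.exit( 1 );
            }
        }
    }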

Requirements

For Audit Testing to work, our tests must be completely free from any sources of non-determinism; otherwise the Audit Files will be noisy, meaning that they will exhibit spurious differences from test run to test run.

Thus, the following known best practices for testing are no longer just "good to know"; they must be followed thoroughly and unfailingly:

  • Never allow any external factors such as file creation times, IP addresses resolved from DNS, etc. to enter into the tests. Fake your file-system; fake The Internet if necessary. For more information about faking stuff, see michael.gr - Software Testing with Fakes instead of Mocks.
  • Never use real time; always fake the clock, making it start from some arbitrary fixed origin and incrementing by a fixed amount each time it is queried. (A sketch of such a clock follows this list.)
  • Never use random numbers or any other construct that utilizes them; if randomness is necessary in some scenario, then fake it using a pseudo-random number generator seeded with a known fixed value. If for some reason you have to use GUIDs/UUIDs, make sure to fake every single instance of them using a deterministic generator.
  • Never allow any multi-threading during testing; all components must be tested while running strictly single-threaded, or at the very least multi-threaded but in lock-step fashion.
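
A minimal sketch of such a fake clock in Java, based on java.time.Clock; the class name and the chosen origin and increment are arbitrary:

    import java.time.Clock;
    import java.time.Duration;
    import java.time.Instant;
    import java.time.ZoneId;
    import java.time.ZoneOffset;

    // A deterministic clock: it starts at a fixed origin and advances by a fixed
    // increment each time it is queried, so timestamps in audit output never vary.
    public final class SteppingFakeClock extends Clock
    {
        private Instant current = Instant.parse( "2020-01-01T00:00:00Z" ); // arbitrary fixed origin
        private final Duration step = Duration.ofSeconds( 1 ); // fixed increment per query

        @Override public ZoneId getZone() { return ZoneOffset.UTC; }
        @Override public Clock withZone( ZoneId zone ) { return this; } // the zone is irrelevant for a fake

        @Override public Instant instant()
        {
            Instant result = current;
            current = current.plus( step );
            return result;
        }
    }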

Additionally, it is a good idea to replace plain hash maps with sorted maps or insertion-order-preserving maps. Plain hash maps do not exactly introduce non-determinism, but they do sometimes cause greater-than-expected differences in audit output. This is because a single key addition or removal may cause re-hashing, which will in turn cause the keys to be enumerated in a completely different order; so, instead of seeing a single change in the audit output we may see many changes, and the single change that matters may get lost in the noise.
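
A small illustration of the difference, assuming Java's TreeMap as the sorted replacement:

    import java.util.Map;
    import java.util.TreeMap;

    // Demonstrates a map with stable enumeration order: a TreeMap enumerates its
    // keys in sorted order, so adding one key produces exactly one new line in the
    // audit output, regardless of how many entries the map already contains.
    public final class StableEnumerationExample
    {
        public static void main( String[] args )
        {
            Map<String, String> settings = new TreeMap<>();
            settings.put( "timeout", "30" );
            settings.put( "retries", "3" );
            settings.put( "endpoint", "https://example.com" );
            settings.forEach( ( key, value ) -> System.out.println( key + " = " + value ) );
            // Always prints endpoint, retries, timeout -- in that (sorted) order.
        }
    }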

In short, anything that would cause flakiness in software tests will cause noisiness in Audit Testing.  

Applicability

Audit Testing is most readily useful when the component-under-test produces results as text, or results that are directly translatable to text. With a bit of effort, any kind of output can be converted to text, so Audit Testing is universally applicable.
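
For example, binary output could be rendered as a hex dump before being saved to the Audit File; a minimal sketch, assuming Java 17's HexFormat:

    import java.util.HexFormat;

    // One way to make non-textual output auditable: render it as text, e.g. a hex dump.
    public final class BinaryToText
    {
        public static String toAuditText( byte[] bytes )
        {
            return HexFormat.ofDelimiter( " " ).formatHex( bytes );
        }

        public static void main( String[] args )
        {
            System.out.println( toAuditText( new byte[] { 1, 2, 3, (byte)0xff } ) ); // prints "01 02 03 ff"
        }
    }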

Must Audit Files be committed?

It is in theory possible to refrain from storing Audit Files in the source code repository, but doing so would have the following disadvantages:

  • It would deprive the code reviewer of the convenience of being able to see not only the changes in the code, but also the differences that these changes have introduced in the audit output of the test.
  • It would require the developer to remember to run the tests immediately after each pull from the source code repository, so as to produce the unmodified Audit Files locally, before proceeding to make code modifications that would further modify them.
  • It would make it more difficult for the developer to take notice when the Audit Files change.
  • It would make it more difficult for the developer to see diffs between the modified Audit Files and the unmodified ones.

Of course all of this could be taken care of with some extra tooling. What remains to be seen is whether the effort of developing such tooling can be justified by the benefit of not having to store Audit Files in the source code repository.

Comparison of Workflows

Here is a step-by-step comparison between the modify-test-troubleshoot cycle as conventionally practiced, and the modify-test-troubleshoot cycle with Audit Testing.

Conventional:

  1. Modify the production code and/or the test.
  2. Run the test.
    • If the test passes:
      • Done.
    • If the test fails:
      • If the production code is wrong:
        • Go to step 1.
      • If the test is wrong:
        • Modify the test.
        • Go to step 2.

Audit Testing:

  1. Modify the production code and/or the test.
  2. Run the test.
    • If the test passes:
      • If the audit file has changed according to our expectations:
        • Done.
      • If the audit file has not changed according to our expectations:
        • Go to step 1.
    • If the test fails:
      • Go to step 1.

Note that the conventional workflow includes an extra "modify the test" step, which does not exist in the audit-testing workflow.

Conclusion

Audit Testing is a universally applicable technique for automated software testing which can significantly reduce the effort of writing and maintaining tests by sparing us from having to stipulate our expectations in test code, and from having to go fixing test code each time our expectations change.

Cover image: "Audit Testing" by michael.gr.
