Incremental Integration Testing

Incremental integration testing logo by michael.gr

A new method for Automated Software Testing is presented as an alternative to Unit Testing. The new method retains the benefit of Unit Testing, which is Defect Localization, but eliminates the need for mocking, thus greatly lessening the effort of writing and maintaining tests.

(Useful pre-reading: About these papers)


Unit Testing aims to achieve Defect Localization by replacing the collaborators of the Component Under Test with Mocks. As we will show, the use of Mocks is laborious, complicated, over-specified, presumptuous, and constitutes testing against the implementation, not against the interface, thus leading to brittle tests that hinder refactoring rather than facilitating it. To avoid these problems, a new method is proposed, called Incremental Integration Testing. The new method allows each component to be tested in integration with its collaborators, (or with Fakes thereof,) thus completely abolishing Mocks. Defect Localization is achieved by arranging the order in which tests are executed so that the collaborators of a component get tested before the component gets tested, and stopping as soon as a defect is encountered. Thus, when a test discovers a defect, we can be sufficiently confident that the defect lies in the component being tested, and not in any of its collaborators, because by that time all of its collaborators have passed their tests.


The goal of automated software testing is to exercise a software system under various usage scenarios to ensure that it meets its requirements and that it is free from defects. The simplest and most straightforward way to achieve this is to set up some input, invoke the system to perform a certain job, and then examine the output to ensure that it is what it is expected to be.

Unfortunately, this approach only really works in the "sunny day" scenario: if no defects are discovered by the tests, then everything is fine; however, if defects are discovered, we are faced with a problem: the system consists of a large network of collaborating software components, and the test is telling us that there is a defect somewhere, but it is unclear in which component the problem lies. Even if we divide the system into subsystems and try to test each subsystem separately, each subsystem may still consist of many components, so the problem remains.

What it ultimately boils down to is that each time we test a component, and a defect is discovered, it is unclear whether the defect lies in the component being tested, or in one or more of its collaborators.

Ideally, we would like each test to be conducted in such a way as to detect defects specifically in the component that is being tested, instead of extraneous defects in its collaborators; in other words, we would like to achieve Defect Localization.

The existing solution: Unit Testing

Unit Testing (wikipedia) was invented specifically in order to achieve defect localization. It takes an extremely drastic approach: if the use of collaborators introduces uncertainties, one way to eliminate those uncertainties is to eliminate the collaborators. Thus, Unit Testing aims to test each component in strict isolation. Hence, its name.

To achieve this remarkably ambitious goal, Unit Testing refrains from supplying the component under test with the actual collaborators that it would normally receive in a production environment; instead, it supplies the component under test with specially crafted substitutes of its collaborators, otherwise known as test doubles. There exist a few different kinds of substitutes, but by far the most widely used kind is Mocks.

Each Mock must be hand-written for every individual test that is performed; it exposes the same interface as the real collaborator that it substitutes, and it expects specific methods of that interface to be invoked by the component-under-test, with specific argument values, sometimes even in a specific order of invocation. If anything goes wrong, such as an unexpected method being invoked, an expected method not being invoked, or a parameter having an unexpected value, the Mock fails the test. When the component-under-test invokes one of the methods that the Mock expects to be invoked, the Mock does nothing of the sort that the real collaborator would do; instead, the Mock is hard-coded to yield a fabricated response which is intended to exactly match the response that the real collaborator would have produced if it were being used, and if it were working exactly according to its specification.

Or at least, that is the intention.
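To make the description above concrete, here is a minimal sketch of a Mock-based test using Python's standard unittest.mock module. The names PriceService and Checkout are hypothetical and exist only for illustration; the point is that the test must fabricate every response and spell out every expected invocation.

```python
from unittest.mock import Mock, call


class PriceService:
    """Hypothetical collaborator interface; a real one might query a database."""

    def price_of(self, item: str) -> int:
        raise NotImplementedError


class Checkout:
    """Hypothetical component under test: totals the prices of items."""

    def __init__(self, prices: PriceService) -> None:
        self._prices = prices

    def total(self, items: list[str]) -> int:
        return sum(self._prices.price_of(item) for item in items)


def test_total_with_mock() -> None:
    # The Mock stands in for the real collaborator and yields fabricated
    # responses, one per expected invocation.
    prices = Mock(spec=PriceService)
    prices.price_of.side_effect = [3, 4]

    assert Checkout(prices).total(["apple", "pear"]) == 7

    # The test also pins down how the collaborator was invoked:
    # which method, with which arguments, and in which order.
    assert prices.price_of.call_args_list == [call("apple"), call("pear")]


test_total_with_mock()
```

Note that the fabricated prices (3 and 4) are assumptions made by the test; nothing verifies that a real PriceService would ever return them.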

Drawbacks of Unit Testing

  • Complex and laborious
    • In each test it is not enough to simply set up the input, invoke the component, and examine the output; we also have to anticipate every single call that the component will make to its collaborators, and for each call we have to set up a mock, expecting specific parameter values, and producing a specific response aiming to emulate the real collaborator under the same circumstances. Luckily, mocking frameworks lessen the amount of code necessary to accomplish this, but no matter how terse the mocking code is, the fact still remains that it implements a substantial amount of functionality which represents considerable complexity.
    • One of the well-known caveats of software testing at large (regardless of what kind of testing it is) is that a test failure does not necessarily indicate a defect in the production code; it always indicates a defect either in the production code, or in the test itself. The only way to know is to troubleshoot. Thus, the more code we put in tests, and the more complex this code is, the more time we end up wasting in chasing and fixing bugs in the tests themselves rather than in the code that they are meant to test.
  • Over-specified
    • Unit Testing is concerned not only with what a component accomplishes, but also with every little detail about how the component goes about accomplishing it. This means that when we engage in Unit Testing we are essentially expressing all of our application logic twice: once with production code expressing the logic in imperative mode, and once more with testing code expressing the same logic in expectational mode. In both cases, we write copious amounts of code describing what should happen in excruciatingly meticulous detail.
    • Note that with Unit Testing, over-specification might not even be a goal in and of itself in some cases, but it is unavoidable in all cases. This is due to the elimination of the collaborators: the requests that the component under test sends to its collaborators could conceivably be routed into a black hole and ignored, but in order for the component under test to continue working so as to be tested, it still needs to receive a meaningful response to each request; thus, the test has to expect each request in order to produce each needed response, even if the intention of the test was not to know how, or even whether, the request is made.
  • Presumptuous
    • Each Unit Test claims to have detailed knowledge of not only how the component-under-test invokes its collaborators, but also how each real collaborator would respond to each invocation in a production environment, which is a highly presumptuous thing to do.
    • Such presumptuousness might be okay if we are building high-criticality software, where each collaborator is likely to have requirements and specification that are well-defined and unlikely to change; however, in all other software, which is regular, commercial, non-high-criticality software, things are a lot less strict: not only do the requirements and specifications change all the time, but also, quite often, the requirements, the specification, even the documentation, is the code itself, and the code changes every time a new commit is made to the source code repository. This might not be ideal, but it is pragmatic, and it is established practice. Thus, the only way to know exactly how a component behaves tends to be to actually invoke the latest version of that component and see how it responds, while the mechanism which ensures that these responses are what they are supposed to be is the tests of that component itself, which are unrelated to the tests of components that depend on it.
    • As a result of this, Unit Testing often places us in the all too familiar situation where our Unit Tests all pass with flying colors, but our Integration Tests miserably fail because the behavior of the real collaborators turns out to be different from what the mocks assumed it would be.
  • Fragile
    • During testing, if the interactions between the component under test and its collaborators deviate even slightly from our expectations, the test fails. However, these interactions may legitimately change as software evolves, without any changes in the requirements and specification of the software. This may happen for example due to the application of a bug-fix, or simply due to refactoring. With Unit Testing, every time we change the internal behavior of the production code, we have to go fix all the tests to expect the new behavior.
    • The original promise of Automated Software Testing was to enable us to continuously refactor and evolve software without fear of breaking it. The idea is that whenever you make a modification to the software, you can re-run the tests to ensure that you have not broken anything. With Unit Testing this does not work, because every time you change the slightest thing in the production code you have to also change the tests, and you have to do this even if the changes are only internal. The understanding is growing within the software engineering community that Unit Testing actually hinders refactoring instead of facilitating it.
  • Non-reusable
    • Unit Testing exercises the implementation of a component rather than its interface. As such, the Unit Test of a certain component can only be used to test that component and nothing else. Thus, with Unit Testing the following things are impossible:
      • Completely rewrite a piece of production code and then reuse the old tests to make sure that the new implementation works exactly as the old one did.
      • Reuse the same test to test multiple different components that implement the same interface.
      • Use a single test to test multiple different implementations of a certain component, created by independently working development teams taking different approaches to solving the same problem.
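By contrast, a test written purely against an interface can be reused across implementations. The following is a minimal sketch; the Stack interface and its two implementations are hypothetical, chosen only to show one black-box test being applied to more than one component.

```python
from typing import Callable, Protocol


class Stack(Protocol):
    """Hypothetical interface shared by the implementations below."""

    def push(self, value: int) -> None: ...
    def pop(self) -> int: ...
    def is_empty(self) -> bool: ...


class ListStack:
    """First implementation: backed by a Python list."""

    def __init__(self) -> None:
        self._items = []

    def push(self, value: int) -> None:
        self._items.append(value)

    def pop(self) -> int:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


class LinkedStack:
    """Second implementation: backed by nested (value, rest) tuples."""

    def __init__(self) -> None:
        self._head = None

    def push(self, value: int) -> None:
        self._head = (value, self._head)

    def pop(self) -> int:
        value, self._head = self._head
        return value

    def is_empty(self) -> bool:
        return self._head is None


def check_stack_contract(make_stack: Callable[[], Stack]) -> None:
    # Only the interface is exercised; nothing here depends on how
    # either implementation works internally.
    stack = make_stack()
    assert stack.is_empty()
    stack.push(1)
    stack.push(2)
    assert stack.pop() == 2
    assert stack.pop() == 1
    assert stack.is_empty()


for implementation in (ListStack, LinkedStack):
    check_stack_contract(implementation)
```

Because check_stack_contract knows nothing about internals, a complete rewrite of either class could be validated by the same test without modification.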

    The above disadvantages of Unit Testing are direct consequences of the fact that it is White-Box Testing by nature. What we need to be doing instead is Black-Box testing, which means that Unit Testing should be avoided, despite the entire Software Industry's addiction to it.

    Note that I am not the only one to voice dissatisfaction with Unit Testing with Mocks. People have been noticing that although tests are intended to facilitate refactoring by ensuring that the code still works after refactoring, tests often end up hindering refactoring, because they are so tied to the implementation that you can't refactor anything without breaking the tests. This problem has been identified by renowned personalities such as Martin Fowler and Ian Cooper, and even by Kent Beck, the inventor of Test-Driven Development (TDD).

    In the video Thoughtworks - TW Hangouts: Is TDD dead? (youtube) at 21':10'' Kent Beck says "My personal practice is I mock almost nothing" and at 23':56'' Martin Fowler says "I'm with Kent, I hardly ever use mocks". In the Fragile Test section of his book xUnit Test Patterns: Refactoring Test Code (xunitpatterns.com) author Gerard Meszaros states that extensive use of Mock Objects causes overcoupled tests. In his presentation TDD, where did it all go wrong? (InfoQ, YouTube) at 49':32'' Ian Cooper says "I argue quite heavily against mocks because they are overspecified."

    Note that in an attempt to avoid sounding too blasphemous, none of these people call for the complete abolition of mocks, they only warn against the excessive use of mocks. Furthermore, they seem to have little, if anything, to say about any alternative means of achieving defect localization and yet they continue to call what they do Unit Testing despite the fact that they do not seem to be isolating the components under test.

    Ian Cooper even goes as far as to suggest that in the context of Test Driven Development (TDD) the term Unit Testing does not refer to isolating the components under test from each other, but rather isolating the tests from each other. With this mental acrobatic he achieves the best of both worlds: he can disavow mocks while continuing to call what he practices Unit Testing. (Because apparently, to tell people that Unit Testing is wrong is way too blasphemous even for a Software Engineer with a 20 cm long beard, extensive tattoos, and large hollow earrings.) I do fully agree that it is the tests that should be kept isolated, but I consider this re-definition of the term to be arbitrary and unwarranted. Unit Testing has already been defined, its definition is quite unambiguous, and according to this definition, it is problematic; so, instead of trying to change the definition, we must abandon Unit Testing and start doing something else, which requires a new name.

    A new solution: Incremental Integration Testing

    If we were to abandon Unit Testing, then one might ask what we should be doing instead. Obviously, we must somehow continue testing our software, and it would be nice if we could continue to enjoy the benefits of defect localization.

    As it turns out, eliminating the collaborators is just one way of achieving defect localization; another, more pragmatic approach is as follows:

    Allow each component to be tested in integration with its collaborators, but only after each one of the collaborators has undergone its own testing, and has successfully passed it.
    Thus, any observed malfunction can be attributed with a high level of confidence to the component being tested, and not to any of its collaborators, because the collaborators have already been tested.

    I call this Incremental Integration Testing.

    An alternative way of arriving at the idea of Incremental Integration Testing begins with the philosophical observation that strictly speaking, there is no such thing as a Unit Test; there always exist collaborators which by established practice we never mock and invariably integrate in Unit Tests without blinking an eye; these are, for example:

    • Many of the external libraries that we use.
    • Most of the functionality provided by the Runtime Environment in which our system runs.
    • Virtually all of the functionality provided by the Runtime Library of the language we are using.

    Nobody mocks standard collections such as array-lists, linked-lists, hash-sets, and hash-maps; very few people bother with mocking filesystems; nobody would mock an advanced math library, a serialization library, and the like; even if one was so paranoid as to mock those, at the extreme end, nobody mocks the MUL and DIV instructions of the CPU; so clearly, there are always some things that we take for granted, and we allow ourselves the luxury of taking these things for granted because we believe that they have been sufficiently tested by their respective creators and can be reasonably assumed to be free of defects. 

    So, why not also take our own creations for granted once we have tested them? Are we testing them sufficiently or not?

    Prior Art

    An internet search for "Incremental Integration Testing" does yield some results. An examination of those results reveals that they are referring to some strategy for integration testing which is meant to be performed manually by human testers, constitutes an alternative to big-bang integration testing, and requires full Unit Testing of the traditional kind to have already taken place. I am hereby appropriating this term, so from now on it shall mean what I intend it to mean. If a context ever arises where disambiguation is needed, the terms "automated" vs. "manual" can be used.

    Implementing the solution: the poor man's approach

    As explained earlier, Incremental Integration Testing requires that when we test a component, all of its collaborators must have already been tested. Thus, Incremental Integration Testing necessitates exercising control over the order in which tests are executed.

    Many testing frameworks and build tools execute test suites in alphanumeric order by default, in which case all we have to do to change the order of execution is to appropriately name the tests, and the directories in which they reside.

    For example:

    Let us suppose that we have the following modules:
    Note how the modules are listed alphanumerically, but they are not listed in order of dependency.
    Let us also suppose that we have one test suite for each module. By default, the names of the test suites follow the names of the modules that they test, so again, a listing of the test suites in alphanumeric order does not match the order of dependency of the modules that they test:
    To achieve Incremental Integration Testing, we add a suitably chosen prefix to the name of each test suite, as follows:
    Note how the prefixes have been chosen in such a way as to establish a new alphanumerical order for the tests. Thus, an alphanumeric listing of the test suites now lists them in order of dependency of the modules that they test:
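The naming scheme can be illustrated with a small sketch. The module names, the dependency relationships, and the "tNN_" prefix convention below are all hypothetical, invented for this illustration. The sketch sorts the prefixed suite names alphanumerically and checks that the resulting order respects the dependencies.

```python
# Hypothetical modules and what each one depends on. Listed alphanumerically,
# they are not in order of dependency: "logging" is a leaf but sorts last.
dependencies = {
    "authentication": ["database", "logging"],
    "database": ["logging"],
    "logging": [],
}

# Hand-chosen prefixes that encode the order of dependency.
prefixed_suites = [
    "t03_authentication_tests",
    "t01_logging_tests",
    "t02_database_tests",
]


def module_of(suite: str) -> str:
    # Strips the prefix and the suffix, e.g. "t02_database_tests" -> "database".
    return suite.split("_", 1)[1].removesuffix("_tests")


# An alphanumeric listing of the suites now follows the order of dependency.
execution_order = [module_of(suite) for suite in sorted(prefixed_suites)]

# Verify that every module is tested after all of its dependencies.
for position, module in enumerate(execution_order):
    for dependency in dependencies[module]:
        assert execution_order.index(dependency) < position
```

The verification loop at the end is exactly the part that the manual approach lacks: without it, nothing guarantees that the prefixes were chosen correctly.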

    At this point Java programmers might object that this is impossible, because in Java, the tests always go in the same module as the production code, directory names must match package names, and test package names always match production package names. Well, I have news for you: they don't have to. The practice of doing things this way is very widespread in the Java world, but there are no rules that require it: the tests do not in fact have to be in the same module, nor in the same package as the production code. The only inviolable rule is that directory names must match package names, but you can call your test packages whatever you like, and your test directories accordingly. Java developers tend to place tests in the same module as the production code simply because the tools (maven) have a built-in provision for this, without ever questioning whether there is any actual benefit in doing so. (There isn't. As a matter of fact, in the DotNet world there is no such provision, and nobody complains.) Furthermore, Java developers tend to place tests in the same package as the production code for no purpose other than to make package-private entities of their production code accessible from their tests, but this is testing against the implementation, not against the interface, and therefore misguided. So, I know that this is a very hard thing to ask from most Java programmers, but trust me, if you would only dare to take a tiny step off the beaten path, if you would for just once do something for reasons other than "everyone else does it", you can very well do the renaming necessary to achieve Incremental Integration Testing.

    Now, admittedly, renaming tests in order to achieve a certain order of execution is not an ideal solution. It is awkward, it is thought-intensive since we have to figure out the right order of execution by ourselves, and it is error-prone because there is nothing to guarantee that we will get the order right. That's why I call it "the poor man's solution". Let us now see how all of this could be automated.

    Implementing the solution: the automated approach

    Here is an algorithm to automate Incremental Integration Testing:

    1. Begin by building a model of the dependency graph of the entire software system.
      • This requires system-wide static analysis to discover all components in our system, and all dependencies of each component. I did not say it was going to be easy.
      • The graph should not include external dependencies.
    2. Test each leaf node in the model.
      • A leaf node in the dependency graph is a node which has no dependencies; at this level, a Unit Test is indistinguishable from an Integration Test, because there are no dependencies to either integrate or mock.
    3. If any malfunction is discovered during step 2, then stop as soon as step 2 is complete.
      • If a certain component fails to pass its test, it is counter-productive to proceed with the tests of components that depend on it. Unit Testing seems to be completely oblivious to this little fact; Incremental Integration Testing fixes this.
    4. Remove the leaf nodes from the model of the dependency graph.
      • Thus removing the nodes that were previously tested in step 2, and obtaining a new, smaller graph, where a different set of nodes are now the leaf nodes.
      • The dependencies of the new set of leaf nodes have already been successfully tested, so they are of no interest anymore: they are as good as external dependencies now.
    5. Repeat starting from step 2, until there are no more nodes left in the model.
      • Allowing each component to be tested in integration with its collaborators, since they have already been tested.
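The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any existing tool works: the dependency graph is assumed to be already available as a dictionary, run_test is a hypothetical callback that returns True when the tests of a component pass, and the component names are made up.

```python
from typing import Callable


def run_incrementally(
    dependencies: dict[str, set[str]],
    run_test: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Returns (components tested, components that failed)."""
    # Step 1: build a private model of the graph, ignoring external
    # dependencies (anything that is not itself a node of the graph).
    remaining = {
        node: {dep for dep in deps if dep in dependencies}
        for node, deps in dependencies.items()
    }
    tested: list[str] = []
    while remaining:  # Step 5: repeat until no nodes are left.
        # Step 2: test each leaf node, i.e. each node with no dependencies left.
        leaves = sorted(node for node, deps in remaining.items() if not deps)
        if not leaves:
            raise ValueError("cyclic dependency among: " + ", ".join(sorted(remaining)))
        failed = [leaf for leaf in leaves if not run_test(leaf)]
        tested.extend(leaves)
        # Step 3: if anything failed, stop once this round is complete.
        if failed:
            return tested, failed
        # Step 4: remove the tested leaves from the model.
        for leaf in leaves:
            del remaining[leaf]
        for deps in remaining.values():
            deps.difference_update(leaves)
    return tested, []


# Hypothetical system: "json" is external, so it is ignored.
graph = {"app": {"service"}, "service": {"store", "json"}, "store": set()}
tested, failed = run_incrementally(graph, lambda name: name != "service")
```

In this usage example the tests of "service" are made to fail, so "app" is never tested: its result would be meaningless until "service" is fixed.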

    No testing framework that I know of (JUnit, NUnit, etc.) is capable of doing any of the above; for this reason, I have developed a utility which I call Testana, that does exactly that.

    Testana will analyze a system to discover its structure, will analyze modules to discover dependencies and tests, and will run the tests in the right order so as to achieve Incremental Integration Testing. It will also do a few other nice things, like examine timestamps and refrain from running tests whose dependencies have not changed.

    Testana currently supports Java projects under maven, with JUnit-style tests. For more information, see michael.gr - GitHub project: mikenakis-testana.
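The idea of refraining from running tests whose dependencies have not changed can be sketched as follows. This is not Testana's implementation; it is a minimal illustration which assumes that the set of changed modules has already been determined by some means (for example by comparing timestamps), and which uses a hypothetical dependency graph.

```python
def modules_to_retest(
    dependencies: dict[str, set[str]], changed: set[str]
) -> set[str]:
    """Returns the changed modules plus everything that depends on them."""
    affected = set(changed)
    grew = True
    while grew:
        grew = False
        for module, deps in dependencies.items():
            # A module is affected if any of its dependencies is affected.
            if module not in affected and deps & affected:
                affected.add(module)
                grew = True
    return affected


# Hypothetical system used only for illustration.
system = {
    "app": {"service"},
    "service": {"store"},
    "report": {"store"},
    "store": set(),
    "util": set(),
}
retest = modules_to_retest(system, {"service"})
```

Here a change to "service" requires re-running the tests of "service" and "app" only; the tests of "store", "report", and "util" can be skipped.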

    What if my dependencies are not discoverable?

    Some very trendy practices of our modern era include:

    • Using scripting languages, where there is no notion of types, and therefore no possibility of discovering dependencies via static analysis.
    • Breaking up systems into separate source code repositories, so there is no single system on which to perform system-wide static analysis to discover dependencies.
    • Incorporating multiple different programming languages in a single system, (following the polyglot craze,) thus hindering system-wide static analysis, since it now needs to be performed across different languages.
    • Making modules interoperate not via normal programmatic interfaces, but instead via various byzantine mechanisms such as REST, whose modus operandi is binding by name, thus making dependencies undiscoverable.

    If you are following any of the above trendy practices, then you cannot programmatically discover dependencies, so you have no way of automating Incremental Integration Testing. Thus, you will have to specify by hand the order in which your tests will run, and you will have to keep maintaining this order by hand.

    Sorry, but ill-considered architectural choices do come with consequences.
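When the order has to be specified by hand, it is at least possible to declare the dependencies rather than the order itself, and to let a topological sort derive the order. A minimal sketch using the graphlib module of the Python standard library follows; the suite names and their relationships are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical hand-maintained declaration: each test suite is mapped to the
# suites of the components that it depends on.
declared = {
    "test_web_frontend": {"test_order_service"},
    "test_order_service": {"test_inventory_client", "test_pricing"},
    "test_inventory_client": set(),
    "test_pricing": set(),
}

# static_order() yields every suite after all of its predecessors, and raises
# graphlib.CycleError if the declaration contains a cycle.
order = list(TopologicalSorter(declared).static_order())
```

The declaration still has to be maintained by hand, but a mistake in it at least cannot produce an order that contradicts the declared dependencies.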

    What about performance?

    One might argue that Incremental Integration Testing does not address one very important issue which is very well taken care of by Unit Testing with Mocks, and that issue is performance:

    • When collaborators are replaced with Mocks, the tests tend to be fast.
    • When actual collaborators are integrated, such as file systems, relational database management systems, messaging queues, and what not, the tests can become very slow.

    Is there anything we can do about this?

    To address the performance issue I recommend the use of Fakes, not Mocks. For a description of what Fakes are, how they work, and why they are incontestably preferable to Mocks, please read michael.gr - Software Testing with Fakes instead of Mocks.

    By supplying a component under test with a Fake instead of a Mock we benefit from great performance, while utilizing a collaborator which has already been tested by its creators and can be reasonably assumed to be free of defects. In doing so, we continue to avoid White-Box Testing and we keep defects localized.

    Furthermore, nothing prevents us from having our CI/CD server run the test of each component twice:

    • Once in integration with Fakes
    • Once in integration with the actual collaborators

    This will be slow, but CI/CD servers generally do not mind. The benefit of doing this is that it gives further guarantees that everything works as intended.
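Running the same test once with a Fake and once with the actual collaborator can be sketched as follows. All names here are hypothetical: FileStore plays the role of a slow, real collaborator, FakeStore is an in-memory Fake of it, and VisitCounter is the component under test.

```python
import json
import os
import tempfile


class FileStore:
    """Hypothetical 'actual' collaborator: persists each value as a JSON file."""

    def __init__(self, directory: str) -> None:
        self._directory = directory

    def put(self, key: str, value: int) -> None:
        with open(os.path.join(self._directory, key + ".json"), "w") as file:
            json.dump(value, file)

    def get(self, key: str, default: int = 0) -> int:
        try:
            with open(os.path.join(self._directory, key + ".json")) as file:
                return json.load(file)
        except FileNotFoundError:
            return default


class FakeStore:
    """Fake of the same collaborator: a working implementation kept in memory."""

    def __init__(self) -> None:
        self._values = {}

    def put(self, key: str, value: int) -> None:
        self._values[key] = value

    def get(self, key: str, default: int = 0) -> int:
        return self._values.get(key, default)


class VisitCounter:
    """Hypothetical component under test."""

    def __init__(self, store) -> None:
        self._store = store

    def visit(self, page: str) -> int:
        count = self._store.get(page) + 1
        self._store.put(page, count)
        return count


def check_visit_counter(store) -> None:
    # The same black-box test, regardless of which collaborator is supplied.
    counter = VisitCounter(store)
    assert counter.visit("home") == 1
    assert counter.visit("home") == 2
    assert counter.visit("about") == 1


check_visit_counter(FakeStore())  # fast run against the Fake
with tempfile.TemporaryDirectory() as directory:
    check_visit_counter(FileStore(directory))  # slower run against the real thing
```

Unlike a Mock, the Fake contains no expectations about how it will be called; it simply behaves like the real collaborator, which is why the test itself does not change between the two runs.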


    Incremental Integration Testing has the following benefits:

    • It greatly reduces the effort of writing and maintaining tests, by eliminating the need for mocking code in each test.
    • It allows our tests to engage in Black-Box Testing instead of White-Box Testing. For an in-depth discussion of what is wrong with White-Box Testing, please read michael.gr - White-Box vs. Black-Box Testing.
    • It makes tests more effective and accurate, by eliminating assumptions about the behavior of the real collaborators.
    • It simplifies our testing operations by eliminating the need for two separate testing phases, one for Unit Testing and one for Integration Testing.
    • It is unobtrusive, since it does not dictate how to construct the tests, it only dictates the order in which the tests should be executed.

    Disadvantages (and counter-arguments)

    • It assumes that a component which has been tested is free of defects.
      • Argument:
        • A well-known caveat of software testing is that it cannot actually prove that software is free from defects, because it necessarily only checks for defects that we have anticipated and tested for. As Edsger W. Dijkstra famously put it, "program testing can be used to show the presence of bugs, but never to show their absence!"
      • Counter-arguments:
        • I am not claiming that once a component has been tested, it has been proven to be free from defects; all I am saying is that it can reasonably be assumed to be free from defects. Incremental Integration Testing is not meant to be a perfect solution; it is meant to be a pragmatic solution.
        • The fact that testing cannot prove the absence of bugs does not mean that everything is futile in this vain world, and that we should abandon all hope in despair: testing might be imperfect, but it is what we can do, and it is in fact what we do, and practical, real-world observations show that it is quite effective.
        • Most importantly: Any defects in an insufficiently tested component will not magically disappear if we mock that component in the tests of its dependents.
          • In this sense, the practice of mocking collaborators can arguably be likened to Ostrich policy. (Ostrich Policy on Wikipedia).
          • On the contrary, continuing to integrate that component in subsequent tests gives us incrementally more opportunities to discover defects in it.
    • It fails to achieve complete defect localization.
      • Argument:
        • If a certain component has defects which were not detected when it was being tested, these defects may cause tests of dependents of that component to fail, in which case it will be unclear where the defect lies.
      • Counter-arguments:
        • It is true that Incremental Integration Testing may fall short of achieving defect localization when collaborators have defects despite having already been tested. It is also true that Unit Testing with Mocks does not suffer from that problem when collaborators have defects; but then again, neither does it detect those defects. For that, it is necessary to always follow a round of Unit Testing with a round of Integration Testing. However, when the malfunction is finally observed during Integration Testing, we are facing the exact same problem that we would have faced if we had done a single round of Incremental Integration Testing instead: a malfunction is being observed which is not due to a defect in the root component of the integration, but instead due to a defect in some unknown collaborator. The difference is that Incremental Integration Testing gets us there faster.
        • Let us not forget that the primary goal of software testing is to guarantee that software works as intended, and that defect localization is an important but nonetheless secondary goal. Incremental Integration Testing goes a long way towards achieving defect localization, but it may not achieve it perfectly, in favor of other conveniences, such as making it far easier to write and maintain tests. So, it all boils down to whether Unit Testing represents overall more or less convenience than Incremental Integration Testing. I assert that Incremental Integration Testing is unquestionably far more convenient than Unit Testing.
    • It only tests behavior; it does not check what is going on under the hood.
      • Argument:
        • With Unit Testing, you can ensure that a certain module not only produces the right results, but also that it follows an expected sequence of steps to produce those results. With Incremental Integration Testing you cannot observe the steps, you can only check the results. Thus, the internal workings of a component might be slightly wrong, or less than ideal, and you would never know.
      • Counter-argument:
        • This is true, and this is why Incremental Integration Testing might be unsuitable for high-criticality software, where White-Box Testing is the explicit intention, since it is necessary to ensure not only that the software produces correct results, but also that its internals are working exactly according to plan. However, Incremental Integration Testing is not being proposed as a perfect solution, it is being proposed as a pragmatic solution: the vast majority of software being developed in the whole world is regular, commercial-grade, non-high-criticality software, where Black-Box Testing is appropriate and sufficient, since all that matters is that the requirements are met. Essentially, Incremental Integration Testing represents the realization that in the general case, tests which worry not only about the behavior, but also about the inner workings of a component, constitute over-engineering. For a more in-depth discussion about this, please read michael.gr - White-Box vs. Black-Box Testing.
        • In order to make sure that everything is happening as expected under the hood, you do not have to stipulate in excruciating detail what should be happening, you do not have to fail the tests at the slightest sign of deviation from what was expected, and you do not have to go fixing tests each time the expectations change. Another way of ensuring the same thing is to simply:
          • Gain visibility into what is happening under the hood.
          • Be notified when something different starts happening.
          • Visually examine what is now different.
          • Vouch for the differences being expected.
    For more details about this, see michael.gr - Collaboration Monitoring.
    • It prevents us from picking a single test and running it.
      • Argument:
        • With Unit Testing, we can pick any individual test and run it. With Incremental Integration Testing, running an individual test of a certain component is meaningless unless we first run the tests of the collaborators of that component.
      • Counter-argument:
        • This is only true if the dependencies have changed.
          • If the dependencies have not changed, then you do not have to re-run their tests, you can simply go ahead and run only the tests of the component that has changed.
          • If the dependencies have changed, then you must in fact run their tests first, otherwise the individual test that you picked to run is meaningless.
        • If you are unsure as to exactly what has changed, or what the dependencies are, then consider using a tool like Testana, which figures all this out for you. See michael.gr - GitHub project: mikenakis-testana.
    • It requires additional tools.
      • Argument:
        • Incremental Integration Testing is not supported by any of the popular testing frameworks, which means that in order to start practicing it, new tools are necessary.
        • Since Incremental Integration Testing is brand new, obtaining such tools might be very difficult, if not impossible.
        • Furthermore, such tooling is going to be non-trivial to build, because it has to do advanced stuff like system-wide static analysis.
      • Counter-argument:
        • My intention is to show the way; if people see the way, the tools will come.
        • If you are using Java with Maven and JUnit, there is already a tool that you can use; see michael.gr - GitHub project: mikenakis-testana.
        • Even in the absence of tools, it is possible to start experimenting with Incremental Integration Testing by following the poor-man's approach, which consists of simply naming the tests, and the directories in which they reside, in such a way that your existing testing framework will run them in the right order. This approach is described in detail in the corresponding section of this paper.

    Are Mocks good for anything?

    Mocks can be useful in a few scenarios that I can think of:

    • If we want to start testing a component while one or more of its collaborators are not ready yet for integration because they are still in development, and no Fakes of them are available either. Having said that, I must add that once the collaborators (or Fakes thereof) become available, it is best to start using them, and to unceremoniously throw away the Mocks.
    • In high-criticality software, where the specifications of software components are detailed in official documents that very rarely change, every round of Unit Testing is followed by a round of Integration Testing, and the goal usually is to ensure not only that a component exhibits the correct behavior, but also that it interacts with its collaborators exactly as expected. In such cases, Mocks are useful for simulating the behavior of the collaborators strictly as described in their specification documents, regardless of the possibility that the tests of those collaborators may have failed to detect defects in them. (This is somewhat paranoid, but when testing high-criticality software, paranoia is the order of the day.) Having said that, I must add that even in the case of high-criticality software, the goal of ensuring that a component interacts with its collaborators exactly as expected does not require stipulating these interactions in testing code; it can be achieved in a much more cost-effective way by means of Collaboration Monitoring; see michael.gr - Collaboration Monitoring.
    • When the developers of a certain component do not want the quality and thoroughness of their work to depend on things that they have no control over, such as the time of delivery of collaborators, the quality of their implementation, and the quality of their testing. (In other words, when the developers of a certain component do not trust the developers of its collaborators.) With the use of Mocks we can claim that our module is complete and fully tested, based on nothing but the specification of its collaborators, and we can claim that it should work fine in integration with its collaborators when they happen to be delivered, and if they happen to work according to spec.
    • When the component-under-test produces results by means of invoking collaborators to supply them with the results, those collaborators can be mocked. In this case the use of Mocking does not constitute white-box testing, because any possible implementation of the component-under-test would have to produce those results, regardless of how it works internally.
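
    The last scenario can be illustrated with a small sketch. The `split_into_batches` function and its `on_batch` callback are hypothetical names invented for this example. The point is only that the stand-in collaborator captures output which any correct implementation would have to produce, so the test makes no assumptions about how the component works internally.

```python
def split_into_batches(items, batch_size, on_batch):
    """Hypothetical component that delivers its results by invoking a collaborator."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            on_batch(batch)
            batch = []  # start a fresh list so delivered batches are not mutated
    if batch:
        on_batch(batch)  # deliver the final, partially filled batch


def test_split_into_batches():
    received = []  # stand-in collaborator: it merely records what it is given
    split_into_batches([1, 2, 3, 4, 5], 2, received.append)
    # The assertion concerns the output alone, not how it was computed.
    assert received == [[1, 2], [3, 4], [5]]
```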


    Mocking has been such a great hit in the software industry because it accomplishes multiple things at once:

    • It allows components to be tested without the cost of instantiating and invoking their real collaborators.
    • It allows us to inspect the invocations made by the component-under-test to its collaborators, to ensure that every invocation is made exactly as expected.
    • It allows the component-under-test to receive the results that it needs from its collaborators, in order to continue functioning during the test.
    • It allows the component-under-test to receive results without the uncertainties that could be introduced due to bugs in the real collaborators.
    • In reactive programming, where the component-under-test produces output not by means of returning results, but instead by means of sending results to collaborators, it allows us to examine the results to make sure they are correct.

    However, these are distinctly different things, every single one of which can be addressed by other means:

    • The cost of instantiating and invoking the real collaborators is not always prohibitive, so in many cases the real collaborators can in fact be used. In those few cases where the cost is indeed prohibitive, it can be avoided with the use of Fakes instead of Mocks.
    • Inspecting the invocations made by the component-under-test to its collaborators is in fact bad practice in most cases, because it constitutes white-box testing. In those exceedingly rare cases where it is necessary, it can be achieved via Collaboration Monitoring, in a non-intrusive, non-overly-specified, and much less laborious way.
    • Supplying the component with the results that it needs in order to continue functioning during the tests can be accomplished by wiring the component with the real collaborators or with fakes thereof.
    • Uncertainties can be reduced by making sure that by the time the component-under-test is being tested, all of its collaborators have already been tested.
    • Reactive programming is probably the only case where the use of Mocks can be justified, but note how even in this case Collaboration Monitoring can be used again to verify the results without having to programmatically stipulate exactly how the results should be and without having to fail the tests if the results change.
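
    To give a rough idea of what verifying interactions without stipulating them in test code might look like, here is a minimal sketch. It is not the implementation described in the Collaboration Monitoring paper; the `RecordingProxy` class, the `diff_against_snapshot` function, and the snapshot-file convention are assumptions made for this illustration. The proxy forwards calls to the real collaborator while logging them as text, and the comparison reports what is different from the previous run instead of failing the test.

```python
import difflib
from pathlib import Path


class RecordingProxy:
    """Forwards every call to the real collaborator and logs it as a line of text."""

    def __init__(self, target, log):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attribute = getattr(self._target, name)
        if not callable(attribute):
            return attribute

        def recorded(*args, **kwargs):
            result = attribute(*args, **kwargs)
            shown = [repr(a) for a in args]
            shown += [f"{key}={value!r}" for key, value in kwargs.items()]
            self._log.append(f"{name}({', '.join(shown)}) -> {result!r}")
            return result

        return recorded


def diff_against_snapshot(log, snapshot_path):
    """Return a readable diff between the stored log and the current one.

    An empty list means nothing changed. On the first run the snapshot is
    created, so there is nothing to report.
    """
    path = Path(snapshot_path)
    current = [line + "\n" for line in log]
    if not path.exists():
        path.write_text("".join(current))
        return []
    previous = path.read_text().splitlines(keepends=True)
    return list(difflib.unified_diff(previous, current, "previous", "current"))
```

    In this sketch, vouching for a difference would amount to replacing the snapshot file with the new log once a developer has looked at the diff and confirmed that the change is expected.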

    Unit Testing was invented in order to achieve defect localization, but as we have shown, it is laborious, complicated, over-specified, presumptuous, and constitutes White-Box Testing. Incremental Integration Testing is a pragmatic approach for non-high-criticality software which achieves defect localization without the use of mocks, and in so doing it greatly reduces the effort of developing and maintaining tests.
