A colleague of mine, Rob Capretto, introduced the idea of a Test Pyramid. It’s like a Food Pyramid – everything has its place in moderation, but to stay healthy, there are certain types of tests you should have more of, and others you need only in small amounts. Here’s my pyramid:
- The majority of tests are mock-based unit tests. In a perfect world, these would be all you need. However, due to both gaps in coverage and subtle inter-object interactions, in practice it’s wise to sometimes put your pieces together in an integration test.
- We have some subsystem integration tests that test parts of the system.
- We have external integration tests that test the software & services created by other people/providers that we rely on, and our interfaces to them. How important these are depends on how much “surface area” your system has – if it’s largely glue, these might predominate.
- We have a few whole system functional tests that test the fully operational system as a user might use it. These should be very few in number, and just serve as sanity checks that when everything is combined, there aren’t any weird interactions or incompatibilities that prevent it working as intended.
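To make the bottom layer of the pyramid concrete, here is a minimal sketch of a mock-based unit test in Python, using the standard library’s unittest.mock. The class and method names (PriceConverter, fetch_rate, convert) are hypothetical, invented purely for illustration; the point is only that the collaborator is replaced by a mock so the test exercises one object in isolation.

```python
from unittest.mock import Mock


class PriceConverter:
    """Hypothetical class under test: converts an amount using a rate provider."""

    def __init__(self, rate_provider):
        self.rate_provider = rate_provider

    def convert(self, amount, currency):
        # The real provider might call a remote service; the unit test never does.
        rate = self.rate_provider.fetch_rate(currency)
        return round(amount * rate, 2)


def test_convert_uses_provider_rate():
    # Replace the collaborator with a mock that returns a canned rate.
    provider = Mock()
    provider.fetch_rate.return_value = 1.25

    converter = PriceConverter(provider)

    # Check the result, and that the collaborator was asked for the right currency.
    assert converter.convert(10, "EUR") == 12.5
    provider.fetch_rate.assert_called_once_with("EUR")


test_convert_uses_provider_rate()
```

A test like this is fast and precise, but it says nothing about whether a real rate provider behaves the way the mock does. That gap is what the integration layers higher up the pyramid are for.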
Two Tests Good, Four Tests Better… NOT!
A key point is that, just like in a Food Pyramid, you can easily have too many functional, integration and even unit tests if you’re not careful. Not only are they expensive to create, but too many tests quickly become a maintenance burden when you refactor software. The longer your project runs, the more pain you will feel from maintaining tests, because tests tend to multiply inevitably over the lifetime of software, and they become brittle, decayed and obtuse if uncared for, just like application code.
In fact, you have to be sparing in the creation of tests. More testing is not automatically better.
It’s really the same principle as application code: you want the most effective code in the fewest lines. For tests, “effective” means detecting bugs and undesired change, whilst minimally obstructing desired change.
(The title is a reference to a favorite line of mine from George Orwell’s novel Animal Farm, implying overly simplistic dogma: “four legs good, two legs bad”.)
Automated Tests Make Brittle Codebases
A provocative subtitle, but it’s literally true! If you take heavily automated-tested code and change almost anything, the test suite will break somewhere. Of course, the breakage is intentional, but don’t forget that automated testing is a special kind of deliberate & detectable brittleness added to code. That brittleness protects against entropy, but it can also obstruct you when you add desired change to the codebase. I’ve experienced cases where a small change applied to a large codebase required ten times more effort to fix the existing tests it broke than to add the new change and its unit test.
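Here is a small Python sketch of how that brittleness can obstruct a desired change. Everything in it is hypothetical (the Basket classes, tax_for, and the flat 25% tax rule are made up for the example). Two tests cover the same code: one checks the observable result, the other pins down how many times a collaborator is called. A refactoring that keeps the result identical still breaks the second test.

```python
from unittest.mock import Mock


class Basket:
    """Hypothetical original implementation: one tax call per item."""

    def __init__(self, tax_calc):
        self.tax_calc = tax_calc

    def total(self, prices):
        return sum(p + self.tax_calc.tax_for(p) for p in prices)


class BatchedBasket(Basket):
    """Hypothetical refactoring: same result, a single tax call on the subtotal."""

    def total(self, prices):
        subtotal = sum(prices)
        return subtotal + self.tax_calc.tax_for(subtotal)


def make_tax_calc():
    # Assumed flat 25% tax, so per-item and batched calculations agree.
    calc = Mock()
    calc.tax_for.side_effect = lambda amount: amount * 0.25
    return calc


def behaviour_test(basket_cls):
    # Checks only what the caller can observe: the total.
    return basket_cls(make_tax_calc()).total([10, 20]) == 37.5


def brittle_test(basket_cls):
    # Pins an implementation detail: how often the collaborator is called.
    calc = make_tax_calc()
    basket_cls(calc).total([10, 20])
    return calc.tax_for.call_count == 2


# Both implementations produce the same total...
assert behaviour_test(Basket) and behaviour_test(BatchedBasket)
# ...but only the original satisfies the test that pinned the call count.
assert brittle_test(Basket) and not brittle_test(BatchedBasket)
```

Neither test is wrong in itself. The call-count test would catch an accidental change to how the tax calculator is used, which is exactly the deliberate brittleness described above; the cost is that it also fires on a change you wanted.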