In search of something better than Test-Driven Development

My previous post was perhaps a bit symbolic (or plain smart-arse) and in need of some explanation.

In simple terms, it is a lament about the shortcomings of Test-Driven (aka Behavior-Driven) Development (TDD or BDD), a testing practice widely considered current best practice in software engineering, or at least about my experience of doing it.

In this method, software is constructed by first building executable tests, or specifications, that function somewhat like moulds in the casting of metals or plastics; they determine the form of the software by specifying very precisely how it must behave. This is where most of the time and effort goes; once the specifications are constructed, it is usually less effort to build the software that satisfies them.
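(To make the mould metaphor concrete, here is a deliberately tiny sketch of my own in Python, not taken from any real project: the test exists, and fails, before the code does, and the code is then shaped to fit it.)

```python
# Written first: the "mould". It fails until the function below exists
# and behaves exactly as specified.
def test_grams_to_ounces():
    assert round(grams_to_ounces(100), 2) == 3.53
    assert grams_to_ounces(0) == 0

# Written second: the "casting", shaped to satisfy the test.
def grams_to_ounces(grams):
    return grams / 28.3495
```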

I've found empirically, as have many others, that using many fine-grained, non-overlapping unit tests yields software and tests that are easier and cheaper to build and maintain. However, an uncanny thing happens to unit tests as they become fine-grained: they start to look eerily similar to, or perhaps like mirror images of, the code they specify.
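(Again, a made-up toy of my own to illustrate the point: three non-overlapping unit tests for a three-branch function end up restating the branches, one for one.)

```python
def shipping_cost(weight_kg):
    if weight_kg <= 1:
        return 5.00
    if weight_kg <= 5:
        return 9.00
    return 20.00

# Each fine-grained test is nearly a mirror image of the branch it covers:
# the same thresholds and the same constants, written down a second time.
def test_light_parcel():
    assert shipping_cost(1) == 5.00

def test_medium_parcel():
    assert shipping_cost(5) == 9.00

def test_heavy_parcel():
    assert shipping_cost(6) == 20.00
```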

So, the mirrored halves of the photo in the previous post represent the two sides of a system (in this case, the text on the sign): the code (on the left) and the tests (on the right). What I was seeking to highlight is that adding the tests has not increased the information content of the system. We have said the same thing twice: once as code, and again as tests over that code.

What we have added is redundancy, which serves to detect errors, just as redundancy does in error-correcting memory or RAID disks.

So, to me, the essence of TDD practice is actually to ensure redundancy in a system’s specification. When the two copies of the specification, the code and the tests, do not agree, we get a test failure that is easily detected.
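(The parallel can be made literal with a toy parity check, a sketch of mine rather than any particular RAID scheme: the check value is a redundant restatement of the data, and disagreement between the two copies is how corruption gets detected.)

```python
def parity(block: bytes) -> int:
    """A redundant check value: the XOR of every byte in the block."""
    p = 0
    for b in block:
        p ^= b
    return p

data = bytearray(b"specification")
check = parity(bytes(data))   # the redundant second copy, stored alongside

data[3] ^= 0x01               # corrupt a single bit

# The two copies no longer agree, so the error is detected,
# just as a failing test flags a disagreement between tests and code.
assert parity(bytes(data)) != check
```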

My problems with present-day TDD are that:

  1. This redundancy is currently achieved through enormous manual effort. In the most efficient cases I've seen, where system behavior is 100% specified by tests, the test code is typically 200% of the bulk of the code it specifies, and it can be far worse: 300-400% is not uncommon. In my experience, that consumes a great deal of effort, time and money.
  2. It seems very difficult to measure or control the level of redundancy that the TDD practices introduce into a system. There seems to be a tendency to prescribe a “one-size-fits-all” approach of 100% unit-test redundancy (i.e. every line, every branch, every data case in the code covered by a unit test) as appropriate for every project.
    In contrast, when we use redundancy to control errors in information storage or transmission, the amount of redundancy is a parameter we can freely vary, to achieve a trade-off between bulky robustness and lightweight delicateness. We need to find a similar, variable dial for software testing practices (one crude existing approximation is sketched just after this list).
  3. When we develop software in the TDD style, we are writing the same pieces of information down twice, in two ledgers, as we work: write a test, write the matching code, execute and verify that they match, rinse and repeat.
    For me, that feels tedious and unnatural. My instinct is to write it once and see it execute, to engage in a flowing conversation with the compiler and the runtime.
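(The crude approximation of a dial mentioned in point 2: coverage thresholds. The sketch below assumes pytest with the pytest-cov plugin, my choice of tooling for illustration, and "myproject" is a placeholder package name. The point is only that the redundancy target becomes an explicit, tunable number rather than an implicit 100%.)

```python
import pytest

# Fail the build only if the unit tests exercise less than 60% of the
# lines in the package; the threshold is the "dial".
exit_code = pytest.main(["--cov=myproject", "--cov-fail-under=60"])
```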

I've been thinking a lot lately about what could replace TDD, to give us reliable software at more lightweight redundancy levels, in the 10-50% of code size range rather than the 100%+ redundancy of textbook TDD. Some early ideas:

  • Static Typing
  • Dependent Types
  • Design by Contract (which seems partly isomorphic to the very powerful Property-Based Testing approaches; a small sketch follows this list)
  • Automatic type-driven generation of function inputs, so that functions can be exercised without explicitly writing tests (the other half of Property-Based Testing).
  • Inline executable examples, expressed as annotations, that describe valid sample function inputs and corresponding outputs, much as tests do, but much closer to the code (also sketched below).
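To make the contract/property idea concrete, here is a small sketch using the Hypothesis library (my choice of tool for illustration; nothing above commits to it). The property plays the role of a postcondition, and Hypothesis generates the inputs, so one short statement stands in for a pile of hand-written example tests:

```python
from hypothesis import given, strategies as st

def normalise_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())

# One property, checked against many generated inputs. It is effectively a
# postcondition: the result never contains doubled spaces, and applying the
# function a second time changes nothing (idempotence).
@given(st.text())
def test_normalise_whitespace_properties(text):
    result = normalise_whitespace(text)
    assert "  " not in result
    assert normalise_whitespace(result) == result
```

The generator is written explicitly as st.text() here, but Hypothesis can also derive it from the type annotation alone (st.from_type(str)), which is essentially the type-driven input generation in the second-to-last bullet.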
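And for the last bullet, Python's doctest module is one existing form of executable examples living right next to the code (again just an illustrative choice, with a made-up function and file name, not necessarily the shape the idea should finally take):

```python
def slugify(title: str) -> str:
    """Turn a title into a URL slug.

    The examples below are documentation and executable checks at once;
    running `python -m doctest this_module.py` verifies them.

    >>> slugify("In Search of Something Better")
    'in-search-of-something-better'
    >>> slugify("  Trim   me  ")
    'trim-me'
    """
    return "-".join(title.lower().split())
```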

Perhaps it should just have had the title “Man having serious doubts about whether saying everything twice is actually the best we software engineers can do”.