Making tests less brittle

I do most of my interesting programming at home currently. Limited time in the evenings means that development is very stop-starty: projects get periods of mad enthusiasm and then are dropped for a few months while I concentrate on something else.

In this context I've found that having a large suite of automated tests can be a double-edged sword: usually when I come back to a project after a few months I have slightly different goals and so want to change the codebase, often drastically.

If the tests are tightly coupled to the implementation this adds significant drag, and is sometimes enough to cause me to delete a ton of code and start again. Worse than that, occasionally I just stop running tests and hack.

So over the years I've come to spend an increasing amount of effort isolating tests to make them less brittle in the face of change. This is not an exact science, but there's often some way to rejig a test so that it achieves most of the same purpose without being quite so tightly coupled to the implementation. I guess for my purposes I'm advocating mostly blackbox over whitebox testing at some level. Here are a couple of examples:

System test data in 'native' formats

I like to code 'top-down' as it keeps me focused. I usually start with some data that I want presenting or transforming or storing or mining or whatever. I write some small system tests and code down from there.

I learnt this tip the hard way: it's very advantageous to have test data in a native external format that's not likely to change.

On my first data-aggregation project I had most of the tests using an internal triples format of my own design, which meant that when I changed my mind a few months later I had a ton of test data and tests to change. I ended up deleting a lot of the code and starting again.

The second time I picked up the project I converted all the inline test data to CSV and JSON and made the tests run an implicit import transform before invoking the top-level functions. The tests became slightly more complex but also less brittle, and I'm now much less likely to delete them.
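A minimal sketch of what that can look like, in Python. Everything here is made up for illustration: the `import_csv` transform, the `total_by_region` top-level function and the tuple-based internal format are assumptions, not code from the project.

```python
import csv
import io
from collections import defaultdict

# Inline test data kept in a stable external format (CSV),
# not in whatever the internal representation happens to be this month.
SALES_CSV = """\
region,amount
north,10
south,5
north,7
"""


def import_csv(text):
    """Hypothetical import transform: CSV text -> internal records.

    The internal format here is a list of (region, amount) tuples. If it
    changes, only this function changes; the test data stays as it is.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [(row["region"], int(row["amount"])) for row in reader]


def total_by_region(records):
    """Hypothetical top-level function under test."""
    totals = defaultdict(int)
    for region, amount in records:
        totals[region] += amount
    return dict(totals)


def test_total_by_region():
    # The test only knows about CSV in and plain dicts out.
    records = import_csv(SALES_CSV)
    assert total_by_region(records) == {"north": 17, "south": 5}
```

The extra `import_csv` step is the "slightly more complex" part; the payoff is that a change of internal format touches one function rather than every test.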

Inputs/outputs as language primitives

Inevitably as a codebase gets bigger I find that adding top-level blackbox tests isn't enough to drive development and that I need whitebox tests at a unit-level to help with algorithmically intense parts of the project. These tests increase motivation and speed up my coding but unfortunately are a lot more brittle during change and tend to be the ones that get deleted first when I come back to a project.

To combat this I often find it's worth refactoring important algorithms into functions that take language-primitive arguments (e.g. ints, lists etc.), separate from the object graph of the application.

A totally contrived illustration. Replace:

        Foo::do-something-clever-with-Bar-Objects( objects )

with:

        do-something-clever-with-id->name-pairs( id/name-pairs )

and have 'Foo' callers unpack the Bar objects into a list of id,name pairs before calling the function.

The tests checking 'do-something-clever' functionality are now less coupled to the internal object graph and are passing only the data required to fulfill the operation.
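The same contrived idea sketched in Python. The "something clever" is replaced by a stand-in (finding names shared by more than one id), and `Bar`, `Foo` and `duplicate_names` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """Hypothetical domain object carrying more state than the algorithm needs."""
    id: int
    name: str
    owner: object = None  # stands in for the rest of the object graph


def duplicate_names(id_name_pairs):
    """Return the names used by more than one id, sorted.

    Takes plain (id, name) tuples, so it knows nothing about Bar or Foo.
    """
    ids_by_name = {}
    for id_, name in id_name_pairs:
        ids_by_name.setdefault(name, set()).add(id_)
    return sorted(name for name, ids in ids_by_name.items() if len(ids) > 1)


class Foo:
    def __init__(self, bars):
        self.bars = bars

    def duplicate_bar_names(self):
        # The caller unpacks the objects into primitives before the call.
        return duplicate_names((bar.id, bar.name) for bar in self.bars)


def test_duplicate_names():
    # No Bar or Foo objects needed: just the data the algorithm uses.
    assert duplicate_names([(1, "a"), (2, "b"), (3, "a")]) == ["a"]
```

If `Bar` later grows fields or `Foo` is restructured, `test_duplicate_names` should not need to change.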

Now this is obviously a tradeoff: the additional unpacking may add overhead (sometimes not), and it might make the function interface unnecessarily complicated. Sometimes the tradeoff works well, sometimes it doesn't, but I always at least consider trying to separate out domain objects from an algorithmically intense function. Often the algorithm is central to the application but the layout and interaction of the object graph is contrived.

--

It might be that I'm missing some important piece of the testing puzzle - I've mostly coded test-first for as long as I can remember but I've always had a mixed relationship with the outcome. Hopefully there's a silver bullet somewhere that I just haven't been told about yet.

Whitebox unit tests slow you down

Is it just me or do whitebox unit tests really bog you down?

I do pretty much all my coding in a test-first stylee; it's the only way to code if you're snatching 20 minutes here and there for spare-time projects. Much of the time these tests serve as scaffolding to keep me on the straight and narrow while I bootstrap up some functionality. Unfortunately, after they've served this purpose they just sit there like a ball and chain round my leg, slowing any future change in direction.

These days I've got into the habit of converting these tests into more stable blackbox functional tests once there's enough actual functionality to support it. Or I just delete them. Life's too short to be worrying about breaking brittle old tests.
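For what it's worth, here is a small Python sketch of the kind of conversion I mean. `WordIndex` and both tests are invented for the example; the point is only the difference in what each test depends on.

```python
from collections import Counter


class WordIndex:
    """Hypothetical class; _counts is an implementation detail."""

    def __init__(self):
        self._counts = Counter()

    def add(self, text):
        self._counts.update(text.lower().split())

    def top(self, n):
        # Sort by descending count, then alphabetically, so ties are stable.
        ranked = sorted(self._counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [word for word, _ in ranked[:n]]


def test_whitebox_scaffolding():
    # Useful while bootstrapping, but breaks as soon as _counts is
    # renamed or swapped for a different structure.
    index = WordIndex()
    index.add("a b a")
    assert dict(index._counts) == {"a": 2, "b": 1}


def test_blackbox_replacement():
    # Depends only on the public behaviour, so it survives a rewrite
    # of the internals.
    index = WordIndex()
    index.add("a b a")
    assert index.top(1) == ["a"]
```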