I have recently started a major stream of work centered on a particular application in the LMAX stack. This application has had plenty of features added to it over the last few years, but nothing has really required an overhaul.
Our work, however, is somewhat more involved; even finishing the simplest of our requirements has been taking a week or so – that’s a long time, for us.
Hitting the buffers
Our method, to begin with, looked something like the following:
- Write acceptance tests for feature (we tend to batch these up – it helps us explore the story)
- Write integration tests for our application, supporting the feature (these usually resemble the ATs)
- Spike implementation within the application
- Use knowledge gained from spike to drive refactoring
- Repeat the last two steps until the ATs and ITs pass
We’re very much in the Kent Beck school of development here:
First refactor the program to make it easy to add the feature, then add the feature
Our problem was that refactoring the program was hard! We discovered that while making the ITs and ATs pass was easy, getting the unit tests to compile and pass was much harder.
This was frustrating; not least because the unit tests were of the overspecified mock interaction sort. If we moved even the smallest piece of validation, anywhere from one to a hundred unit tests would fail.
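To make the pain concrete, here is a minimal sketch of that style of test. Everything in it (`OrderHandler`, `RefactoredHandler`, the `validator` and `book` collaborators) is hypothetical and invented for illustration; it is not code from our application. The test pins down exactly which collaborator is called and how, so moving the validation breaks it even though the observable behaviour is unchanged.

```python
from unittest.mock import Mock, call


class OrderHandler:
    """Hypothetical handler: validates an order, then adds it to a book."""

    def __init__(self, validator, book):
        self.validator = validator
        self.book = book

    def handle(self, order):
        if self.validator.is_valid(order):
            self.book.add(order)


class RefactoredHandler:
    """Same intent, but the validation has moved behind the book."""

    def __init__(self, validator, book):
        self.validator = validator
        self.book = book

    def handle(self, order):
        self.book.add_if_valid(order, self.validator)


def overspecified_test(handler_cls):
    validator = Mock()
    validator.is_valid.return_value = True
    book = Mock()

    handler_cls(validator, book).handle({"qty": 1})

    # Asserts on the exact conversation between objects, not on any outcome.
    assert validator.mock_calls == [call.is_valid({"qty": 1})]
    assert book.mock_calls == [call.add({"qty": 1})]
```

Calling `overspecified_test(OrderHandler)` passes, while `overspecified_test(RefactoredHandler)` raises an `AssertionError`, because the test knows where the validation used to live.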
Symptom, not cause
We blamed the tests – they were stupid tests, we said; why had anyone bothered to write them? So, we tried to rewrite a couple of unit tests in a leaner style – just creating what we needed to test our new feature.
This felt a lot better right up until we finished, when we looked from our new tests to the old tests, and from the old tests to the new tests; but already it was impossible to say which was which.
These tests were a symptom that the code underneath was jumbled. Someone had attempted to break up large, core domain objects into separate responsibilities by pulling behaviour up into ‘processor’ objects, which had made the domain objects smaller but had also broken encapsulation. More on this another day.
This was novel – here was a case where the wrong refactoring had painted us into a corner. The problematic tests that this ill-judged refactoring had wrought besmirched every attempt to escape to a better place.
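One possible reading of that ‘processor’ extraction, sketched with entirely hypothetical names (`Account`, `ExposedAccount`, `WithdrawalProcessor`); the real domain objects were different and considerably larger. The point is only that once behaviour moves out, the state it guards has to be opened up.

```python
class Account:
    """Encapsulated: the balance rule lives with the data it protects."""

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


class ExposedAccount:
    """After the extraction: smaller, but its state is now public."""

    def __init__(self, balance):
        self.balance = balance


class WithdrawalProcessor:
    """Hypothetical 'processor' holding behaviour pulled out of the object."""

    def process(self, account, amount):
        if amount > account.balance:
            raise ValueError("insufficient funds")
        account.balance -= amount


exposed = ExposedAccount(100)
WithdrawalProcessor().process(exposed, 30)  # goes through the rule
exposed.balance -= 500                      # nothing stops this bypassing it
```

A unit test for such a processor has little choice but to set up and inspect the exposed state or mock the account, which is one way tests of the overspecified sort come about.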
Declaring unit test bankruptcy
We decided to remove these unit tests. They were creating a catch-22 situation: we couldn’t refactor the code without breaking the tests, and we couldn’t make the tests better without fixing the code.
We ended up working like this:
- Write acceptance tests for feature
- Write one integration test for the application (a deliberately smaller step)
- Spike implementation within the application
- Run unit tests with spike code to detect pain
- Rewrite those unit tests as integration tests
- Delete the painful unit tests
- Revert the spike, and use knowledge gained from spike to drive refactoring
- Make new integration tests pass with well-factored code
- Continue until all the ATs pass
This allows us to make swingeing refactorings safely, speeding our journey towards a place where we may one day be able to TDD all the way down.
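A minimal sketch of why integration tests of this shape tolerate refactoring, assuming a hypothetical application that consumes input events and emits output events. The event names, `Application`, `RefactoredApplication` and `LimitBook` are all made up for illustration and say nothing about our real framework. The check only looks at what goes in and what comes out, so it passes against either internal structure.

```python
class Application:
    """Hypothetical single-threaded application: events in, events out."""

    def __init__(self):
        self.outbound = []
        self._limits = {}

    def on_event(self, event):
        kind, account, value = event
        if kind == "set_limit":
            self._limits[account] = value
            self.outbound.append(("limit_set", account, value))
        elif kind == "order":
            ok = value <= self._limits.get(account, 0)
            self.outbound.append(("accepted" if ok else "rejected", account, value))


class LimitBook:
    """Hypothetical collaborator extracted during refactoring."""

    def __init__(self):
        self._limits = {}

    def set_limit(self, account, limit):
        self._limits[account] = limit

    def allows(self, account, quantity):
        return quantity <= self._limits.get(account, 0)


class RefactoredApplication:
    """Same I/O behaviour as Application, different internal factoring."""

    def __init__(self):
        self.outbound = []
        self._book = LimitBook()

    def on_event(self, event):
        kind, account, value = event
        if kind == "set_limit":
            self._book.set_limit(account, value)
            self.outbound.append(("limit_set", account, value))
        elif kind == "order":
            verdict = "accepted" if self._book.allows(account, value) else "rejected"
            self.outbound.append((verdict, account, value))


def run(app, inbound):
    """Feed events in one at a time and return everything the app emitted."""
    for event in inbound:
        app.on_event(event)
    return app.outbound


def check_order_over_limit_is_rejected(app_factory):
    outbound = run(app_factory(), [
        ("set_limit", "acc-1", 10),
        ("order", "acc-1", 5),
        ("order", "acc-1", 50),
    ])
    # Asserts only on emitted events, never on internal collaborators.
    assert outbound == [
        ("limit_set", "acc-1", 10),
        ("accepted", "acc-1", 5),
        ("rejected", "acc-1", 50),
    ]
```

Both `check_order_over_limit_is_rejected(Application)` and `check_order_over_limit_is_rejected(RefactoredApplication)` pass, which is the property that lets the internals be reshaped freely.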
We were lucky to have:
- A mature integration test framework. This made it easy to write new tests that assert only on the I/O events of our particular application.
- A single-threaded application (so integration tests are almost as quick as unit tests, and they don’t suffer from races)
- Extensive AT coverage over the system as a whole.
Beware though, for these are double-edged swords. Perhaps it is because our framework makes ATs and ITs so easy to create that we neglected the factoring of the code within.
It seems we have been guilty of declaring stories done when the acceptance tests all pass. If only life were that simple!
In TDD, at the unit level, the method is as follows:
- (write new test) Red
- (make test pass) Green
- (improve the design while the tests stay green) Refactor
Here, ‘refactor’ is usually removal of duplication, and separation of responsibilities into separate classes.
We need to execute the refactor step ‘all the way up’.
The refactor step for ITDD and ATDD
I wrote a sort of checklist of the things that I think about at this point, but its entries were:
- too specific
- probably wrong
Instead, I advise that all one needs to do at this point is stop and think. More specific advice is left as an exercise for the reader. (Hint: Think of the principles you apply at the unit level – can you scale them up to the level of systems and applications?)
- Listen to your tests! We could have avoided this whole affair if we had listened to the tests at the time of writing.
- Make sure your definition of done includes the ‘refactor’ step.