{ by david linsin }

September 15, 2008

How To Do Continuous Integration and TDD?

A couple of months ago I wrote about my problem with writing test cases before the actual implementation. Last week I came across a little problem which, once again, backed up my position that the test-first approach is broken.

In my current project there's a reasonable development infrastructure. We use Maven as our build tool and AnthillPro as our continuous integration server. It supports integrating the local changes of developers and makes sure all parts of the system work the way they should. By the way, this is my first time working with Anthill and I have to say I'm really disappointed. The only CI server that is probably worse is CruiseControl. I've used JetBrains' TeamCity on my previous project and it is way more sophisticated than AnthillPro, but I digress...

The problem that I faced concerning TDD and CI actually sounds very trivial: when should you check in code?

Let's say you are developing the TDD way. You think about the problem and come up with a neat JUnit test for your class, which by the way doesn't exist yet. The JUnit test makes a nice specification of what needs to be implemented. Unfortunately it's almost 5pm and you want to go home. The problem is that you will be out of the office for the next couple of days. One of your peers will have to code the implementation in order to keep the project going. So, do you check in your code?
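
To make the scenario concrete, such a test-first "specification" might look something like this. The class and its method are made up for illustration, and the snippet won't even compile yet, because the class under test doesn't exist:

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // Hypothetical example: DiscountCalculator does not exist yet, so this test
    // won't even compile - which is exactly the dilemma described above.
    public class DiscountCalculatorTest {

        @Test
        public void appliesTenPercentDiscount() {
            DiscountCalculator calculator = new DiscountCalculator();
            assertEquals(90.0, calculator.applyDiscount(100.0, 0.10), 0.001);
        }
    }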

The way a CI server usually works is that it checks your version control system every x minutes, gets the changes, triggers your configured build and runs your test cases. It can do a lot more, like fire up a server for integration testing or deploy your application to various test machines running different operating systems. In our case it checks SVN every hour, fetches the change-set and triggers the Maven build. That comprises compiling, packaging and running instrumented JUnit tests. Only after each step has succeeded does it show the comforting green bar on the status page - and that's where the problem begins.

Since Anthill relies on Maven to run the JUnit tests, you only see the green light when they all succeed. Maven simply indicates failures by returning a non-zero exit code. You can configure Maven to ignore failing JUnit tests, which means it returns 0 and Anthill will show you the nice little green status bar. That's probably not really what continuous integration was introduced for, because you would have to dig into the log files to find out whether all your JUnit tests really passed.

Let's get back to our example. Going down the TDD route, all you have right now is a broken JUnit test, which surely makes a nice specification, but it doesn't satisfy your CI server. If you check in your code you are gonna break the build. Basically you are left with the following options:

1. You don't check in your code. Instead you could e.g. email it to your fellow developers.
2. You quickly manage to extract the interface from the test case. Your peers could program against it.
3. You can configure Maven to ignore your particular JUnit test. For that you'd have to adapt the POM file.
4. You comment out your JUnit annotations or the content of the test methods. That way the test won't fail the build.

All of these options are flawed, not to say broken. The lesser evil would probably be a combination of options 1 and 2: extract an interface that you can check in without breaking the build, and email the JUnit test to your fellow developer so he doesn't have to write his own test case. It is not a very comforting solution, but I couldn't come up with anything better. Apparently there is a discrepancy between CI and TDD - or am I missing something?
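
For illustration, the extracted interface for the hypothetical DiscountCalculator above could be as small as this; it can go into version control right away, while the JUnit test travels by email until an implementation exists:

    // Hypothetical interface extracted from the test: it compiles, breaks nothing,
    // and gives your peer something to program against.
    public interface DiscountCalculator {

        double applyDiscount(double netPrice, double discountRate);
    }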

You take the blue pill - the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill - you stay in Wonderland and I show you how deep the rabbit-hole goes.

Call me a green bar fetishist, but I'm going for the blue pill. As I mentioned in the beginning, the TDD approach is in my opinion kinda broken, for various reasons. I'm writing my JUnit tests after, or sometimes even while, I'm working on the implementation and I think that works quite well.

17 comments:

Anonymous said...

...or you use an SCM which supports real branching and avoid the entire headache. :-) I try not to "fan boy" too many tools, especially really immature ones, but Git has the potential to dramatically ease this sort of situation. By branching you can avoid "change hog" syndrome while simultaneously keeping the blessed branch in perfect working (and test-passing) order. By the time you're ready to merge back into the main line, you can have your tests ready and polished.

Julian said...

How big is the feature that you're working on and how often are you committing? My experience suggests that you should be breaking your implementation down into very granular pieces, writing a test for a small piece, fixing it, and then checking in. Several times a day.

All the other alternatives have the effect of hiding your good work from your team, which isn't really continuous integration.

David Linsin said...

daniel > ...or you use an SCM which supports real branching

That's a very interesting idea; unfortunately I'm neither familiar with Git nor am I in a position to change the VCS.

julian > ...you should be breaking your implementation down into very granular pieces... All the other alternatives have the effect of hiding your good work from your team, which isn't really continuous integration.

You are right julian, my example is a little exaggerated. Usually you would be working on something for longer than a couple of hours, and usually you would plan upfront when something needs to be done.

On the other hand, I have been in situations where I simply checked in interfaces so people could work with them, because I couldn't finish my stuff in time. Since I'm not practicing TDD, I just thought about how painful it must be to do CI and TDD together.

Gregg Bolinger on DZone > The solution is so very obvious. You stub out the simplest implementation that will allow your test to pass. ... Get to green as fast as possible. That is part of TDD.

Call me petty, but stubbing out the implementation doesn't sound right... but if that's TDD, it's probably a viable solution.
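
For illustration, a stub in that spirit (reusing the hypothetical DiscountCalculator from the post) could be as dumb as this - just enough to turn the one existing test green:

    // Hypothetical stub: the simplest implementation that satisfies the test's
    // expectation. The hard-coded value gets replaced by real logic later on.
    public class StubDiscountCalculator implements DiscountCalculator {

        public double applyDiscount(double netPrice, double discountRate) {
            return 90.0; // fake it until the real implementation arrives
        }
    }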

Anonymous said...

We also leverage the concept of specifying unit tests. There is a little trick to keep those tests from breaking the build without dropping them from CI: make them abstract.

As you define those spec tests on an interface, the tests access the instance under test via an abstract method getInstance() that returns a concrete instance. This allows your fellow developer - who is responsible for implementing the interface - to create a subclass of the test that simply implements getInstance() and returns his implementation.
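
A minimal sketch of that trick, reusing the hypothetical DiscountCalculator interface from the post; the abstract class itself is never picked up as a test, so checking it in shouldn't break the build:

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // The abstract "spec test": it defines the expected behaviour of the
    // interface but is never run on its own, because it is abstract.
    public abstract class DiscountCalculatorSpec {

        // Supplied by whoever implements the interface.
        protected abstract DiscountCalculator getInstance();

        @Test
        public void appliesTenPercentDiscount() {
            assertEquals(90.0, getInstance().applyDiscount(100.0, 0.10), 0.001);
        }
    }

    // In a separate file, once an implementation exists; it inherits all spec tests.
    public class DefaultDiscountCalculatorTest extends DiscountCalculatorSpec {

        protected DiscountCalculator getInstance() {
            return new DefaultDiscountCalculator(); // hypothetical implementation
        }
    }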

From a higher point of view your problem boils down to "I'm not done but have to go". This is something that has to be handled at the level of working habits. You have to force yourself to plan time for "completing" your work: polish Javadocs, complete tests, check everything in. I also struggle sometimes to actually follow this heuristic, but to me it is the only way to get it done.

Anonymous said...

You don't seem to have any concept of "build promotion" in your process. It's easy to add in an ad-hoc way, and there are several ways to do it.

If you never want to change the CI server's config (we're using CruiseControl, for example), then each promoted environment would have its own svn repo location.

When a dev build has been vetted it's copied to the qa branch. The CI server pulls from the qa branch. When that build is vetted it's copied to a prod branch.

With CC you lose some version info on the dashboard, unless you change the config file (I've been unable to change the build or project name via JMX w/o something bad happening) but IMO that's been a small price to pay.

Anonymous said...

David,

I'm an Anthiller and am sorry to hear you're not digging it. I'd love to hear what you're finding frustrating - particularly if you are on one of the recent 3.x versions. If you're still on 2 dot something, well... I understand where it might not stand up as well compared to something released in the last year or two, and I'd love to see you get upgraded to a recent version.


Anyway, assuming 3.x, you should be able to get past the worst of Maven obscuring the failure or failing hard, by having Maven publish the JUnit results in XML and pointing AnthillPro's JUnit integration at it. Anthill will parse that out and store the results in the database. This should make it pretty easy to see that yes, the build passed, but 2 of 2000 tests are failing.

In a TDD environment, this isn't a scenario I would find troubling. When work is in progress TDD expects tests to be failing - it indicates that a feature is planned but not yet implemented. However, I would want to implement a report that showed me tests that were failing for more than a few days and perhaps a report of any newly failing tests in this build.

While we don't want to ship with failing tests, if we discover a bug, I'd much prefer a failing test that exposes the bug be added to the build immediately rather than waiting for the patch before the test goes in. It's just a more honest representation of the state of the code. The strongest counter-argument, I think, is the broken windows theory, which I more than respect.

That said, if you want to fail the build on test errors, that's quite doable, either by inspecting the test results at the end or by including a post-processor on the Maven execution that looks for output indicating failed tests. You really shouldn't have to be digging into the logs yourself.

Anyway, if you want to let us know why you think Anthill stinks and/or give our guys a chance to help you get the configuration to a place where you might think it's great, send me a mail: etm@urbancode.com.

Oh, and if you get a chance, swap off that hourly build and move to triggering the build straight from SVN with the post-commit hook. It's so much nicer for CI.

Jeffrey Fredrick said...

"Apparently, there is a discrepancy between CI and TDD or am I missing something?"

I think what you're missing is that what you describe is an exceptional case and you can handle it exceptionally.

I think your process and tools should be optimized for the normal every day case, and your scenario is far from the normal case when doing TDD.

David Linsin said...

ollie > There is a little trick to keep those tests from breaking the build without dropping them from CI: make them abstract

Awesome!! Thanks for the advice! That's exactly the comment I was hoping to get with this blog post!

eric minick > I'm an Anthiller and am sorry to hear you're not digging it. I'd love to hear what you're finding frustrating

I'll drop you an email and let you know my pain points.

eric minick > ...including a post-processor on the Maven execution that looks for output indicating failed tests. You really shouldn't have to be digging into the logs yourself.

Ahh I thought this should work with a decent CI tool! Thanks for pointing that out!

Jeffrey Fredrick > ...what you describe is an exceptional case and you can handle it exceptionally

See, I think it is not exceptional at all! In fact I think a lot of people are struggling with "practical" problems like this. Problems that might seem minor or quick to solve, but turn out to be real show stoppers in the long run.

"In theory, there is no difference between theory and practice, in practice there is!"

I just find software development full of little annoyances and I'm (still) hoping to find a solution for all of them!

Jeffrey Fredrick said...

"See, I think it is not exceptional at all!"

Really? You often come up with the perfect unit test at 5 pm just before you're out of the office for a few days?

I'll believe you, but at the same time I'll say that would be exceptional in my experience. Not exceptional as in never, just in the "rare enough we can special case it" sense.

Or perhaps you mean something else? Perhaps what you mean is that if you add up all the "exceptional" cases the chances of something exceptional happening almost every day are pretty good. Now that I believe... and have experienced first hand!

"I think a lot of people are struggling with "practical" problems like this."

No doubt! But that struggle is the effort of converting theoretical knowledge into experience, and it is common when learning any skill.

But there's something else that's sticking in my craw.

Reading your blog I get the feeling that you are doing three things simultaneously that aren't compatible. You are (1) taking 'practical' short-cuts with the strict TDD process, like having more than one failing test at a time, and (2) objecting to 'broken' options for solving a problem, like emailing someone a test class or having a test temporarily not run, and then (3) claiming a conflict between TDD and CI.

I think doing (1) and (2) should disqualify (3), but that's just IMHO. :)

David Linsin said...

Jeffrey Fredrick > ...just in the "rare enough we can special case it" sense. Or perhaps you mean something else?

Maybe my example was rare enough for you and me to call it a special case. I just think a lot of people run into these rare special cases every single day. Maybe they only experience it once or twice, but I claim it can be frustrating enough to blog about it here.

Jeffrey Fredrick > ...struggle is the effort of converting theoretical knowledge into experience, and it is common when learning any skill.

And if the discussion here can ease the struggle for the next person experiencing my rare special case, then it was worth blogging about it.

Jeffrey Fredrick > ...claiming a conflict between TDD and CI.

Maybe the conflict only exists in the rare special case that I was dealing with?

Jeffrey Fredrick said...

"...but I claim it can be frustrating enough to blog about it here."

That's something I can certainly understand! :)

"And if the discussion here can ease the struggle for the next person...

Another point on which I fully sympathize. I'm hoping to encourage that next person — and you — to not give up. Yes these things are annoying but I think in time the benefits outweigh the frustrations and more than repay the effort.

Of course I'm very publicly biased on this point. :)

Unknown said...

Hi all!

There's a subtle difference between TDD and test-first. In test-first you simply code your test first, and then you go to the implementation. In TDD you must have only one small red light at a time (and a todo list elsewhere...).
Putting this down in minutes, this means that you shouldn't waste (or risk) more than 5 minutes a day when committing. If the bell rings and you're in a red state... just commit tomorrow.

I'd be more careful in using branches; being on the trunk really enhances code integration (continuous or not) and the overall pace of the team.

Best regards

Rune said...

A late comment, I know, but if you are using JUnit 4, the @Ignore annotation exists for exactly this reason:

http://api.dpml.net/junit/4.2/org/junit/Ignore.html
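
For illustration, checking in the not-yet-green test with @Ignore might look like this (names made up); the runner skips it and reports it as ignored instead of failing the build:

    import org.junit.Ignore;
    import org.junit.Test;

    public class DiscountCalculatorTest {

        @Ignore("waiting for the DiscountCalculator implementation")
        @Test
        public void appliesTenPercentDiscount() {
            // ...
        }
    }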

David Linsin said...

Thx Rune, frankly I only got to know about @Ignore last week.

Jeffrey Fredrick said...

Just make sure you have a handle on how many tests have been @ignored. You want to make sure that number keeps dropping back to zero vs. growing over time...

David Linsin said...

@Jeffrey

Is there tool support for detecting the number of tests tagged with @Ignore?

Jeffrey Fredrick said...

I'm not sure, but I see that the javadoc for @Ignore says that JUnit runners should report the number of ignored tests:

http://api.dpml.net/junit/4.2/org/junit/Ignore.html

(I've not used @Ignore, but prior to JUnit 4, on one team we'd come up with our own equivalent and had a problem with the number of ignored tests growing, until we wrote a test that would fail if the number was greater than 'X', where 'X' would trend to zero.)
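
A rough sketch of such a guard test with JUnit 4's @Ignore, assuming you maintain a list of test classes somewhere; the class names and the threshold are made up, and a real setup might scan the test classpath instead:

    import static org.junit.Assert.assertTrue;

    import java.lang.reflect.Method;

    import org.junit.Ignore;
    import org.junit.Test;

    // Fails once the number of @Ignore'd test methods exceeds a budget, so that
    // ignored tests cannot silently pile up over time.
    public class IgnoredTestBudgetTest {

        private static final int MAX_IGNORED = 3;

        private static final Class<?>[] TEST_CLASSES = {
                DiscountCalculatorTest.class /* , ... */
        };

        @Test
        public void ignoredTestsStayWithinBudget() {
            int ignored = 0;
            for (Class<?> testClass : TEST_CLASSES) {
                for (Method method : testClass.getMethods()) {
                    if (method.isAnnotationPresent(Ignore.class)) {
                        ignored++;
                    }
                }
            }
            assertTrue("Too many @Ignore'd tests: " + ignored, ignored <= MAX_IGNORED);
        }
    }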


