
Controlling Time

Disclaimer: This technique does not work outside of programming. Do not try this on your neighbours, kids or pets…

What’s wrong with time dependent tests?
It’s easy to write time dependent tests that end up flaky and intermittent. Worse yet, the quick fix is often to put a pause into the test, making the suite drag on longer and longer. Alternatively, unneeded complexity is added in an attempt to do smart things like polling until a timeout expires.

What can we do about it?
I’m all about solutions, so I’m about to outline the secret to controlling time (at least in unit tests). Here’s a situation you might recognise. Or not. Sorry to any vegetarians reading this entry.

We have a class called Beef that knows when it’s past its prime using the Joda Time libraries.

package com.thekua.examples;

import org.joda.time.DateTime;

public class Beef {
	private final DateTime expiryDate;

	public Beef(DateTime expiryDate) {
		this.expiryDate = expiryDate;
	}
	
	public boolean isPastItsPrime() {
		DateTime now = new DateTime(); // Notice this line?
		return now.isAfter(expiryDate); 
	}
}

Surprise, surprise. We also have a unit test for it:

package com.thekua.examples;

import static org.junit.Assert.assertTrue;
import org.joda.time.DateTime;
import org.junit.Test;

public class BeefTest {
	@Test
	public void shouldBePastItsPrimeWhenExpiryDateIsPast() throws Exception {
		int timeToPassForExpiry = 100;
		
		Beef beef = new Beef(new DateTime().plus(timeToPassForExpiry));
		
		Thread.sleep(timeToPassForExpiry * 2); // Sleep? Bleh...
		
		assertTrue(beef.isPastItsPrime());
	}
}

Step 1: Contain time (in an object of course)
The first step is to contain all use of time concepts behind an object. Don’t even try to call this class a TimeProvider. It’s a Clock, okay? (I’m sure I used to call it that in the past as well, so don’t worry!) The responsibility of the Clock is to tell us the time. Here’s what it looks like:

package com.thekua.examples;

import org.joda.time.DateTime;

public interface Clock {
	DateTime now();
}

To keep the system working as normal, we are going to introduce the SystemClock. I sometimes call this a RealClock. It looks a bit like this:

package com.thekua.examples;

import org.joda.time.DateTime;

public class SystemClock implements Clock {
	public DateTime now() {
		return new DateTime();
	}
}

We are now going to let our Beef depend on our Clock concept. It should now look like this:

package com.thekua.examples;

import org.joda.time.DateTime;

public class Beef {
	private final DateTime expiryDate;
	private final Clock clock;

	public Beef(DateTime expiryDate, Clock clock) {
		this.expiryDate = expiryDate;
		this.clock = clock;
	}
	
	public boolean isPastItsPrime() {
		DateTime now = clock.now();
		return now.isAfter(expiryDate); 
	}
}

If you wanted to do it step by step, the refactoring would look like this (with a sketch of the intermediate state just below):

  1. Replace new DateTime() with new SystemClock().now()
  2. Replace new instance with field
  3. Instantiate new field in constructor
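
As a rough sketch of the intermediate state after step 1, before the clock becomes a field, isPastItsPrime would briefly look like this:

public boolean isPastItsPrime() {
	// Step 1: behaviour unchanged, but time now flows through the Clock abstraction
	DateTime now = new SystemClock().now();
	return now.isAfter(expiryDate);
}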

We’d use the SystemClock both in the code that creates the Beef and in our test. Our test should look like…

package com.thekua.examples;

import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class BeefTest {
	@Test
	public void shouldBePastItsPrimeWhenExpiryDateIsPast() throws Exception {
		int timeToPassForExpiry = 100;
		SystemClock clock = new SystemClock();
		
		Beef beef = new Beef(clock.now().plus(timeToPassForExpiry), clock);
		
		Thread.sleep(timeToPassForExpiry * 2);
		
		assertTrue(beef.isPastItsPrime());
	}
}

Step 2: Change the flow of time in tests
Now that we have the production code depending on an abstract notion of time, and our test still working, we want to substitute the SystemClock with another object that allows us to shift time for tests. I’m going to call it the ControlledClock. Its responsibility is to control the flow of time. For the purposes of this example, we’re only going to allow time to flow forward (and ensure tests use relative times instead of absolute ones). You might vary it if you needed very precise dates and times. Note the new method forwardTimeInMillis.

package com.thekua.examples;

import org.joda.time.DateTime;

public class ControlledClock implements Clock {
	private DateTime now = new DateTime();

	public DateTime now() {
		return now;
	}	
	
	public void forwardTimeInMillis(long milliseconds) {
		now = now.plus(milliseconds);
	}
}

Now we can use this new concept in our tests, and replace the way that we previously forwarded time (with the Thread.sleep) with our new class. Here’s what our final test looks like now:

package com.thekua.examples;

import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class BeefTest {
	@Test
	public void shouldBePastItsPrimeWhenExpiryDateIsPast() throws Exception {
		int timeToPassForExpiry = 100;
		ControlledClock clock = new ControlledClock();
		
		Beef beef = new Beef(clock.now().plus(timeToPassForExpiry), clock);
		
		clock.forwardTimeInMillis(timeToPassForExpiry * 2);
		
		assertTrue(beef.isPastItsPrime());
	}
}

We can make this test even more precise by forwarding time just one millisecond past the expiry, rather than doubling it.
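
For example, keeping the rest of the test the same, the forwarding line becomes:

clock.forwardTimeInMillis(timeToPassForExpiry + 1);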

Step 3: Save time (and get some real sleep)
Although this is a pretty trivial example with a single use of time, it shouldn’t take too much effort to introduce this concept to any classes that depend on time. Not only will you save yourself the heartache of flaky, broken tests, you should also save the waiting time you’d otherwise need to introduce, leading to faster test execution and that wonderful thing, fast feedback.

Enjoy! Let me know what you thought of this by leaving a comment.

The world mocks too much

One of my biggest annoyances when it comes to testing is when anyone reaches for a mock out of habit. The “purists”, whom I prefer to call “zealots”, are overwhelming in number, particularly in the TDD community (you do realise you can test drive your code without mocks?). Often I see teams use excuses like, “I only want to test a single responsibility at a time.” It sounds reasonable, yet it’s generally the start of something more insidious, where suddenly mocks become the hammer and people want to apply them to everything. I sense it through signals like people saying “I need to mock a class”, or “I need to mock a call to a superclass”. Alternatively it’s quite obvious when the number of “stub” or “ignored” calls outnumbers the “mocked” calls by an order of magnitude.

Please don’t confuse my stance with that of “classical state based testing purists” who don’t believe in mocking. Though Martin Fowler describes two apparently opposing styles of testing, Classical and Mockist Testing, I don’t see them as mutually exclusive options, rather as two ends of a sliding scale. I typically gauge risk as a factor for determining which approach to take. I believe in using the right tool for the right job (easy to say, harder to do right), and in using mocking to give me the best balance of feedback when things break, enough confidence to refactor with safety, and a tool for driving out a better design.

[Image: broken glass, from Bern@t’s flickr stream, under the Creative Commons licence]

Even though I’ve never worked with Steve or Nat, the maintainers of JMock, I believe my views align quite strongly with theirs. When I was on the JMock mailing list, it fascinated me to see how many of their responses focused on better programming techniques rather than caving in to demands for new features. JMock is highly opinionated software, and I agree with them that you don’t want to make some things too easy, particularly those tasks that lend themselves to poor design.

Tests that are difficult to write, maintain, or understand are often a huge symptom of code that is equally poorly designed. That’s why, even though Mockito is a welcome addition to the testing toolkit, I’m equally frightened by its ability to silence the screams of test code exercising poorly designed production code. The dogma of testing a single aspect of every class in pure isolation often leads to excessively brittle, noisy and hard to maintain test suites. Worse yet, because the interactions between objects have been effectively declared frozen by handfuls of micro-tests, any redesign incurs the additional effort of rewriting all the “mock” tests. Thank goodness we sometimes have acceptance tests. Since the first attempt at something is rarely the best possible design, writing excessively fine grained tests puts a much larger burden on the developers who need to refine it in the future.
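
To illustrate, here’s a hypothetical sketch using Mockito (all the class names are made up for this example, not from any real project) of the kind of micro-test that freezes every interaction:

package com.thekua.examples;

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class OverMockedOrderTest {
	// All names below are hypothetical, purely for illustration
	interface OrderValidator { boolean isValid(String order); }
	interface OrderRepository { void save(String order); }
	interface AuditLog { void recordPlacement(String order); }

	static class OrderService {
		private final OrderValidator validator;
		private final OrderRepository repository;
		private final AuditLog audit;

		OrderService(OrderValidator validator, OrderRepository repository, AuditLog audit) {
			this.validator = validator;
			this.repository = repository;
			this.audit = audit;
		}

		void place(String order) {
			if (validator.isValid(order)) {
				repository.save(order);
				audit.recordPlacement(order);
			}
		}
	}

	@Test
	public void shouldSaveValidOrder() {
		OrderValidator validator = mock(OrderValidator.class);
		OrderRepository repository = mock(OrderRepository.class);
		AuditLog audit = mock(AuditLog.class);
		when(validator.isValid("beef")).thenReturn(true);

		new OrderService(validator, repository, audit).place("beef");

		// The internal choreography is now frozen: merge the audit into the
		// repository, or split the repository in two, and this test must be
		// rewritten even though the observable behaviour hasn't changed.
		verify(validator).isValid("beef");
		verify(repository).save("beef");
		verify(audit).recordPlacement("beef");
	}
}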

So what is a better approach?
Interaction testing is a great technique, and mocks are a welcome addition to a developer’s toolkit. Unfortunately it’s difficult to master all its aspects: the syntax of a mocking framework, listening to the tests, and responding appropriately with refactoring or redesign. I reserve the use of mocks for three distinct purposes:

  1. Exploring potential interactions – I like to use mocks to help me understand what my consumers are trying to do, and what their ideal interactions are. I try to experiment with a couple of different approaches, different names, different signatures to understand what this object should be. Mocks don’t prevent me from doing this, though it’s my current preferred tool for this task.
  2. Isolating well known boundaries – If I need to depend on external systems, it’s best to isolate them from the rest of the application under a well defined contract. For some dependencies, this may take some time to develop, establish, stabilise and verify (for which I prefer to use classical state based testing). Once I am confident that this interface is unlikely to change, then I’m happy to move to interaction testing for these external systems.
  3. Testing the boundaries of a group of collaborators – It often takes a group of closely collaborating objects to provide any useful behaviour to a system. I err towards classical testing for the objects within that closely collaborating set, and defer to mocking at the boundary of these collaborators (see the sketch after this list).
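
To make that third purpose concrete, here’s a minimal sketch (again with hypothetical names, and again assuming Mockito): the closely collaborating PricingService and TaxCalculator are real objects tested classically on state, and only the external exchange rate boundary is mocked.

package com.thekua.examples;

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class PricingServiceTest {
	// Hypothetical boundary: an external system supplying exchange rates
	interface ExchangeRateProvider {
		double rateFor(String from, String to);
	}

	// Hypothetical collaborators: real objects in the test, not mocks
	static class TaxCalculator {
		private final double rate;

		TaxCalculator(double rate) {
			this.rate = rate;
		}

		double withTax(double amount) {
			return amount * (1 + rate);
		}
	}

	static class PricingService {
		private final TaxCalculator tax;
		private final ExchangeRateProvider rates;

		PricingService(TaxCalculator tax, ExchangeRateProvider rates) {
			this.tax = tax;
			this.rates = rates;
		}

		double priceInCurrency(double amount, String from, String to) {
			return tax.withTax(amount) * rates.rateFor(from, to);
		}
	}

	@Test
	public void shouldPriceOrderInLocalCurrencyIncludingTax() {
		// Mock only the well-known boundary...
		ExchangeRateProvider rates = mock(ExchangeRateProvider.class);
		when(rates.rateFor("USD", "GBP")).thenReturn(0.5);

		// ...and test the collaborating group classically, on state
		PricingService pricing = new PricingService(new TaxCalculator(0.1), rates);

		assertEquals(55.0, pricing.priceInCurrency(100.0, "USD", "GBP"), 0.001);
	}
}

The internals of the collaborating group remain free to be redesigned without rewriting the test; only the boundary contract is pinned down.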

Automated Acceptance Tests: What are they good for?

A long time ago, I wrote about the differences between the types of tests I see, yet a lot of people don’t appreciate where acceptance tests fit in. I won’t be the first to admit that, at first glance, automated acceptance tests seem to have lots of problems. Teams constantly complain about the build time, the maintenance burden they bring, and the difficulty of writing them in the first place. Developers specifically complain about duplicating their effort at different levels, needing to write assertions more than once (at a unit or integration level), and that they don’t get much value from them.

I wish everyone would realise that tests are an investment, a form of insurance against risk (the system not working), and most people don’t even know what level of risk they are willing to take. That’s why, on experimental code (i.e. a spike), I don’t believe in doing test driven development. I’ve been fortunate enough (or is that, learned some lessons from?) to see both extremes.

Maintaining a system only with acceptance tests (end to end)
I worked on one project where the architect banned unit tests, and we were only allowed integration and acceptance tests. He believed (rightly so) that acceptance tests let us change the design of the code without breaking functionality. His (implicit) choice of insurance was a long term one – ensure the state of the system constantly works over time, even with major architectural redesign. During my time on this project, we even redesigned some major functionality to make the system more understandable and maintainable. I doubt that without the acceptance tests, we would have had the confidence to move so quickly. The downside to this style of testing is that the code-test feedback was extremely slow. It was difficult to get any confidence that a small change was going to work. It felt so frustrating to move at such a slow pace without any faster levels of feedback.

Scenario driven acceptance tests (as opposed to the less maintainable, story-based acceptance tests) also provide better communication for end users of the system. I’ve often used them as a tool to aid communication with end users or customer stakeholders, to get a better understanding of what they think the system should be doing. It’s rare that you achieve the same with unit or integration tests, because they tell you more about how a particular aspect is implemented, and lack the system context that acceptance tests have.

Maintaining a system only with unit tests
On another project, I saw heavy use of mocks and unit tests. All the developers moved really fast and enjoyed refactoring their code, yet I saw more and more issues where basic problems meant that starting up the application failed, because all those tiny, well refactored objects just didn’t play well together. Some integration tests caught some of these, but I felt this project could have benefited from at least a small set of acceptance tests, to stop the tester continuously deploying a broken application despite a set of passing unit tests.

What is your risk profile when it comes to testing?
I think everyone on a team must understand that different types of tests give us different levels of feedback (see the testing aspect of the V-Model), and each has a different cost determined by the constraints of technology and tools. You’re completely wrong if you declare all automated acceptance tests bad, or all unit tests awful. Instead, you want to choose the right balance of tests (the insurance) that matches the system’s constraints over its entire lifetime. For some projects, it may make sense to invest more in acceptance tests because the risk of repeated mistakes is significantly costly. For others, the cost of manual testing mixed with the right set of unit and integration tests may make more sense. Appreciate the different levels of feedback tests bring, and understand the different levels of confidence over the lifetime of the system, not just the time you spend on the project.

Automated story-based acceptance tests lead to unmaintainable systems

Projects where the team directly translates story-level acceptance criteria into new automated test cases set themselves up for a maintenance nightmare. It seems like an old approach (I’m thinking WinRunner-like record-playback scripts), although at least these teams probably feel the pain faster. Unfortunately not many teams seem to know what to do about it. It sounds exactly like the scenarios my colleagues Phillip and Sarah have experienced recently.

Diagnosing this style of testing is easy. If I see a story number or reference in the title of a test class or in a test case name, chances are the team is writing automated story-based acceptance tests.

Unfortunately the downfall of this approach has more to do with the nature of stories than with the nature of acceptance tests (something I’ll get to later). As I like to say, stories represent the system at a certain snapshot in time. The very quality that lets us deliver incremental value in small chunks just doesn’t scale if you don’t consolidate the new behaviour of the system with its already existing behaviour. For developers, the best analogy is having two test classes for a particular class: one reflecting the behaviours and responsibilities of the system at the start, and one representing the behaviours and responsibilities of the system right now. You wouldn’t do this at a class level, so why should you do it at the system level?

Avoid temporal coupling in the design of your tests. The same poor design principle of relating chunks of code simply because someone asked for them at the same time also applies to how you manage your test cases. In terms of automated story-based acceptance tests, avoid spreading similar tests around the system just because they were needed at different times.

What is a better way? A suggested antidote…

On my current project, I have been lucky enough to apply these concepts early to our acceptance test suites. Our standard is to group tests not based on time, but on similar sets of functionality. When picking up new stories, we see if any existing tests need to change before adding new ones. The groupings in our system are based on system level features, allowing us to reflect the current state of the system as succinctly as possible.
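
As a quick sketch of the difference (all class names here are hypothetical):

// Story-based grouping: one test class per story, named for when it was asked for
class Story143DiscountTest { }
class Story217DiscountTest { } // quietly overlaps the behaviour frozen in Story143

// Feature-based grouping: one consolidated home for the feature's current behaviour
class CustomerDiscountTest { }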

The Benefits of Should and Following BDD

On one project I was on, all the tests began with the word “should”. Though we didn’t take on full-on BDD using anything like NBehave or JBehave, I think the simple effect of reframing tests using the word “should” worked wonders on this particular project.
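
A minimal sketch of what that looks like, reusing the Beef and ControlledClock classes from the Controlling Time post earlier on this page (the test itself is new, purely for illustration); contrast the name with a traditional one like testIsPastItsPrime:

package com.thekua.examples;

import static org.junit.Assert.assertFalse;

import org.junit.Test;

public class BeefFreshnessTest {
	// The name states an expectation a reader can agree with or challenge
	@Test
	public void shouldNotBePastItsPrimeBeforeExpiryDateHasPassed() {
		ControlledClock clock = new ControlledClock();
		Beef beef = new Beef(clock.now().plus(100), clock);

		clock.forwardTimeInMillis(99); // still one millisecond short of expiry

		assertFalse(beef.isPastItsPrime());
	}
}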

We had quite a lot of people pass through the project over a year and a half, so I found it interesting to see what impact it had on tests. I observed that:

  • People focused on understanding intent first, and followed through with how it was implemented.
  • People had better conversations around differences (the test name said it should be doing this, and it’s actually doing that). They asked questions like, “Is the name of this test wrong, or is the test incorrectly implemented?”
  • The statement is not as strong as assert or must, so I felt people could challenge it more, leading to better quality conversations. “I thought the domain was supposed to be like this, yet the test and the code do this – am I misunderstanding something?”

I think this subtle focus brought many qualitative benefits that a lot of other projects could gain from.

Getting NUnit ASP to work with .Net 2.0

On my current project we’ve been having a pretty good experience leveraging the NUnitASP library for automated testing of our website. In our second week of development, we noticed, like many other people, that it is not yet fully compatible with .Net 2.0 because of the way that ASP.Net now generates its javascript postback functions.

In the previous .Net version, ASP.Net seemed to generate a very simple function of the form __doPostBack([targetId], [targetArguments]) that would effectively set two hidden variables (__EVENTTARGET and __EVENTARGUMENT respectively). In the current version, ASP.Net generates a much more complex javascript function (called WebForm_DoPostBackWithOptions), which I think is triggered by the use of any of the ASP validator components.

One workaround, which Lance Ahlberg found, is to turn the “CausesValidation” property off for controls, but this may or may not suit the way you are developing your website. Looking at what the generated javascript does, I thought there must be a better solution, so I spent some time delving into the depths of NUnitASP to find one.

The result is a patch to ControlTester.cs that still allows __EVENTTARGET and __EVENTARGUMENT to be set, by extracting the appropriate values from the new WebForm_DoPostBackWithOptions javascript function. You can download the patch here, but you will have to build your own NUnitASP, or wait until this is integrated into the next release. The ticket for my submission can be found here.
