Sprouting inner classes

Since returning to a more hands-on technical role, I’ve noticed a few habits that I didn’t realise I had, or have more recently acquired. One of these habits (credit to Michael Feathers for the pattern name) is my tendency to sprout tiny classes.

Perhaps it’s my aversion to writing too much procedural code in an object-oriented language or it’s my preference to write small, well encapsulated objects. So far, my general approach seems to be:

I first notice that a particular class’ responsibility has grown too much. An easy temptation is to move it to a function and, if I need to share it, I might even be tempted to make it static. Instead, I sprout a static inner class that now owns that responsibility. I move any state needed for that responsibility to the class, keeping the tell, don’t ask principle intact as much as possible. As I interact with the object more, I refine the class name, seeking to understand its responsibilities in different contexts. I might push more responsibilities into it, or move some responsibilities away from it.

I like to try to keep the class private until I’m happy that the class makes sense and all the responsibilities relate to each other in some logical manner. When I’m confident that the class is mature enough, I elevate it to a top level class.
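As a minimal sketch of the steps above: the Invoice and LateFee names, the grace period and the fee percentage are all hypothetical, invented purely for illustration. The late fee logic started life inside Invoice; here it has been sprouted into a private static nested class that owns its own state.

```java
import java.util.List;

// Hypothetical example: Invoice used to calculate late fees itself.
class Invoice {
    private final List<Long> lineItemsInCents;
    private final LateFee lateFee;

    Invoice(List<Long> lineItemsInCents, int daysOverdue) {
        this.lineItemsInCents = List.copyOf(lineItemsInCents);
        // The state the responsibility needs moves into the sprouted class.
        this.lateFee = new LateFee(daysOverdue);
    }

    long totalInCents() {
        long subtotal = 0;
        for (long item : lineItemsInCents) {
            subtotal += item;
        }
        // Tell, don't ask: Invoice never inspects daysOverdue itself.
        return lateFee.addTo(subtotal);
    }

    // Sprouted class: private and static until its name and
    // responsibilities settle, then a candidate for a top level class.
    private static final class LateFee {
        private static final int GRACE_PERIOD_DAYS = 30; // assumed value
        private static final long FEE_PERCENT = 5;       // assumed value

        private final int daysOverdue;

        LateFee(int daysOverdue) {
            this.daysOverdue = daysOverdue;
        }

        long addTo(long amountInCents) {
            if (daysOverdue <= GRACE_PERIOD_DAYS) {
                return amountInCents;
            }
            return amountInCents + amountInCents * FEE_PERCENT / 100;
        }
    }
}
```

Because LateFee is static, it cannot reach into the fields of Invoice by accident, which makes the later move to a top level class mostly a cut and paste.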

What approaches do other people tend to favour? What can be improved? What doesn’t make sense?

Wordpress 2.6.2 interoperability with Apache 2 Mod Proxy

I’ve had some issues upgrading to Wordpress 2.6.2. Until recently I had been running 2.5 because I continuously received the error message, “Firefox has detected that the server is redirecting the request for this address in a way that will never complete.” I tried all the suggestions on the forums, yet nothing seemed to help with that. Both Wordpress 2.6 and 2.6.1 exhibited this behaviour. Fortunately 2.6.2 never gave me this problem, instead giving me other problems. I became suspicious when I viewed the page source of install.php and saw references to localhost:11008.

<link rel="stylesheet"
href="http://localhost:11008/trial/wp-admin/css/install.css?ver=20080708"
type="text/css" media="all"/>

When I ran the installer, it also put that value in for the siteurl option. I looked at the source code and became suspicious of the code putting it in there, specifically in the function wp_guess_url():

$_SERVER["HTTP_HOST"]

I ran a test php page, and found that it was returning the localhost reference. I did some more searching, and found the following comments,

I believe my server is running Apache Mod Proxy, where my virtual host is forwarded on from the main Apache server. I asked my system administrator to play around with the ProxyPreserveHost configuration described here but unfortunately it didn’t seem to have much effect on the $_SERVER['HTTP_HOST'] value.

I did a little bit more reading, and tried using a site-wide fix described here, effectively executing some code to reset the HTTP_HOST to the HTTP_X_FORWARDED_HOST value, yet that also didn’t seem to work.

In my .htaccess file, I had added a line:

php_value auto_prepend_file wordpress_fixes.php

and in a wordpress_fixes.php file, I had:

<?php
if(isset($_SERVER["HTTP_X_FORWARDED_HOST"]))
        $_SERVER["HTTP_HOST"] = $_SERVER["HTTP_X_FORWARDED_HOST"];
?>

yet it didn’t seem to do anything. Instead, I resorted to changing all the $_SERVER['HTTP_HOST'] values into $_SERVER['HTTP_X_FORWARDED_HOST'].

For everyone’s reference, here are all the files and the line numbers I had to change:

  • wp-includes/canonical.php (line 48)
  • wp-includes/cron.php (line 103)
  • wp-includes/feed.php (line 500)
  • wp-includes/functions.php (line 2327)
  • wp-includes/pluggable.php (line 669, line 685)
  • wp-login.php (line 20, line 274, line 275)

It also looks like it fixed the cron issue of publishing future posts, and trackbacks, so apologies if anyone got flooded with those recently.

Appreciating language features

Developing systems takes a very different outlook from developing libraries, and a different one again from designing languages. Even on projects where we have specific coding standards, there’s often a benefit to the language supporting much more. I appreciate that language designers need to think at a much broader scale about the realm of possibilities than I normally need to for the systems I develop.

My example of this appreciation is when I had to debug some Java code via a remote terminal on a box where the unwanted behaviour emerged. I was actually thankful for being able to do an import java.util.* rather than having to specify every single class that I wanted to use.
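For illustration only, here is the kind of throwaway class that is pleasant to type into a bare terminal editor. The WordCount name and what it does are hypothetical; the point is that a single wildcard import covers Map, TreeMap, List and Arrays without listing each one.

```java
import java.util.*;

// Hypothetical throwaway helper, typed by hand without IDE support.
class WordCount {
    // Counts how often each word occurs, sorted by word.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList(args)));
    }
}
```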

Generating a single fat jar artifact from maven

When using the jar-with-dependencies descriptorRef for the maven-assembly-plugin, it creates two files by default: the normal jar with any library dependencies excluded, and a second jar with all the libraries included, whose name has “jar-with-dependencies” appended. Since I find the Maven help guides unintuitive, it took a while before we found the appendAssemblyId option to turn it off. Here’s the snippet of the pom.xml to create just a single fat jar.

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <appendAssemblyId>false</appendAssemblyId>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>assembly</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Did I miss something in the Does Agile Scale survey?

It’s been a long time since I’ve followed TheRegister, so I was interested when someone sent around a link to a survey about whether or not agile scales. I had to re-read Question 5: What are the most important facilities to have in place? Here are the options:

  • Tools to enable management of project artefacts within sub-projects
  • Tools to enable management of project artefacts across the entire project or programme
  • Tools to support re-use and/or sharing of artefacts between teams
  • Tools to enable collaboration and communication between and within teams
  • Tools to support quality and integrity testing prior to integration
  • Tools to support performance and scalability testing across the integrated project.
  • Other (please state below)

Is it just me, or did the people who created this survey miss one of the values emphasised in the Agile Manifesto: individuals and interactions over processes and tools?

Now I’m not saying that you don’t bother with tools in agile projects, but if you’re talking about most important facilities, how about activities and demonstrated leadership that establish an open culture of teamwork, trust and honest communication?

Maven-assembly-plugin ignoring manifestEntries?

We’re using this Maven plugin to generate a fat jar for a utility, effectively including all library dependencies un-jarred and re-jarred into a single distribution. The first part was easy: hooking the assembly goal of the maven-assembly-plugin onto the package phase of the Maven build lifecycle. Our pom.xml had this entry in it:

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest>
        <mainClass>com.thekua.maven.ExampleProgram</mainClass>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>assembly</goal></goals>
    </execution>
  </executions>
</plugin>

We tried adding in our own entry into the manifest, the CruisePipelineLabel, with a value that should be set by Cruise. We added the new section so our pom.xml now looked like this:

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest><mainClass>com.thekua.maven.ExampleProgram</mainClass></manifest>
      <manifestEntries>
        <CruisePipelineLabel>
          ${env.CRUISE_PIPELINE_LABEL}
        </CruisePipelineLabel>
      </manifestEntries>
    </archive>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>assembly</goal></goals>
    </execution>
  </executions>
</plugin>

After running the target and inspecting the manifest.mf, I couldn’t see the additional property set. I did some searching and found this bug, apparently fixed in the 2.2-beta-2 version. After some debugging, I found out that the plugin apparently does not include these additional entries if the value is not set. I tested this out by changing the line to:

<manifestEntries>
  <CruisePipelineLabel>aTestValue</CruisePipelineLabel>
</manifestEntries>

So the answer to whether or not maven-assembly-plugin ignores an element in the manifestEntries is to ensure the value is set before testing it. It looks like a null value is interpreted as “don’t include”.
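One way to check the result without eyeballing the file is to read the attribute back with the JDK’s java.util.jar.Manifest class. This is only a sketch: the ManifestCheck class and its mainAttribute helper are hypothetical names, and the manifest text is supplied inline rather than read from the real jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical helper for inspecting a manifest's main section.
class ManifestCheck {
    // Returns the value of the named main attribute, or null when the
    // manifest has no such entry.
    static String mainAttribute(String manifestText, String name) {
        try {
            Manifest manifest = new Manifest(new ByteArrayInputStream(
                    manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Inline stand-in for the generated manifest.
        String text = "Manifest-Version: 1.0\n"
                + "Main-Class: com.thekua.maven.ExampleProgram\n"
                + "CruisePipelineLabel: aTestValue\n";
        System.out.println(mainAttribute(text, "CruisePipelineLabel"));
    }
}
```

Against a built jar, the same Manifest object can be obtained from java.util.jar.JarFile via getManifest(). A null return for CruisePipelineLabel would correspond to the entry having been left out.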

The world mocks too much

One of my biggest annoyances when it comes to testing is when anyone reaches for a mock out of habit. The “purists”, whom I prefer to call “zealots”, are overwhelming in number, particularly in the TDD community (you do realise you can test drive your code without mocks?). Often I see teams use excuses like, “I only want to test a single responsibility at a time.” It sounds reasonable, yet it’s generally the start of something more insidious, where suddenly mocks become the hammer and people want to apply it to everything. I sense it through signals like when people say “I need to mock a class”, or “I need to mock a call to a superclass”. Alternatively it’s quite obvious when the number of “stub” or “ignored” calls outnumbers the “mocked” calls by an order of magnitude.

Please don’t confuse my stance with “classical state based testing purists” who don’t believe in mocking. Though Martin Fowler describes two, apparently opposing, styles of testing, Classical and Mockist Testing, I don’t see them as mutually exclusive options, rather as two ends of a sliding scale. I typically gauge risk as a factor for determining which approach to take. I believe in using the right tool for the right job (easy to say, harder to do right), and believe in using mocking to give me the best balance of feedback when things break, enough confidence to refactor with safety, and as a tool for driving out a better design.

Broken Glass

Image of broken glass taken from Bern@t’s flickr stream under the creative commons licence

Even though I’ve never worked with Steve or Nat, the maintainers of JMock, I believe my views align quite strongly with theirs. When I used to be on the JMock mailing list, it fascinated me to see how many of their responses focused on better programming techniques rather than caving into demands for new features. JMock is highly opinionated software and I agree with them that you don’t want to make some things too easy, particularly those tasks that lend themselves to poor design.

Tests that are difficult to write, maintain, or understand are often a symptom of code that is equally poorly designed. That’s why, even though Mockito is a welcome addition to the testing toolkit, I’m equally frightened by its ability to silence the screams of test code executing poorly designed production code. The dogma to test a single aspect of every class in pure isolation often leads to excessively brittle, noisy and hard to maintain test suites. Worse yet, because the interactions between objects have been effectively declared frozen by handfuls of micro-tests, any redesign incurs the additional effort of rewriting all the “mock” tests. Thank goodness sometimes we have acceptance tests. Since the first time something is written is often not the best possible design, writing excessively fine grained tests puts a much larger burden on developers who need to refine it in the future.

So what is a better approach?
Interaction testing is a great technique, with mocks a welcome addition to a developer’s toolkit. Unfortunately it’s difficult to master all aspects, including the syntax of a mocking framework, listening to the tests, and responding appropriately with refactoring or redesign. I reserve the use of mocks for three distinct purposes:

  1. Exploring potential interactions – I like to use mocks to help me understand what my consumers are trying to do, and what their ideal interactions are. I try to experiment with a couple of different approaches, different names, different signatures to understand what this object should be. I could do this without mocks, though they are my current preferred tool for this task.
  2. Isolating well known boundaries – If I need to depend on external systems, it’s best to isolate them from the rest of the application under a well defined contract. For some dependencies, this may take some time to develop, establish, stabilise and verify (for which I prefer to use classical state based testing). Once I am confident that this interface is unlikely to change, then I’m happy to move to interaction testing for these external systems.
  3. Testing the boundaries of a group of collaborators – It often takes a group of closely collaborating objects to provide any useful behaviour to a system. I err towards using classical testing to test the objects in that closely collaborating set, and defer to mocking at the boundary of these collaborators.
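The second and third purposes can be sketched roughly as follows. All of the names here (PaymentGateway, Basket, Checkout, RecordingGateway) are hypothetical, and the test double is hand-rolled rather than created with JMock or Mockito so that the example stays self-contained. Basket and Checkout are the close collaborators and are exercised for real; only the external boundary is replaced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical boundary to an external system, kept behind a small contract.
interface PaymentGateway {
    void charge(String customerId, long amountInCents);
}

// Close collaborator: used for real in tests and checked by its state.
class Basket {
    private final List<Long> pricesInCents = new ArrayList<>();

    void add(long priceInCents) {
        pricesInCents.add(priceInCents);
    }

    long totalInCents() {
        long total = 0;
        for (long price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

// Close collaborator that talks to the boundary.
class Checkout {
    private final PaymentGateway gateway;

    Checkout(PaymentGateway gateway) {
        this.gateway = gateway;
    }

    void pay(String customerId, Basket basket) {
        long total = basket.totalInCents();
        if (total > 0) {
            gateway.charge(customerId, total);
        }
    }
}

// Hand-rolled stand-in for a mock: records interactions at the boundary.
class RecordingGateway implements PaymentGateway {
    final List<String> charges = new ArrayList<>();

    @Override
    public void charge(String customerId, long amountInCents) {
        charges.add(customerId + ":" + amountInCents);
    }
}
```

A test then fills a real Basket, calls pay on a Checkout built with a RecordingGateway, and asserts on the recorded charges. Redesigning how Basket and Checkout split their work does not break that test as long as the conversation with the gateway stays the same.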

Automated Acceptance Tests: What are they good for?

A long time ago, I wrote about the differences between the types of tests I see, yet a lot of people don’t appreciate where acceptance tests fit in. I won’t be the first to admit that, at first glance, automated acceptance tests seem to have lots of problems. Teams constantly complain about the build time, the maintenance burden they bring, and the difficulty of writing them in the first place. Developers specifically complain about duplicating their effort at different levels, needing to write assertions more than once (at a unit or integration level), and not getting much value from them.

I wish everyone would realise that tests are an investment, a form of insurance against risk (the system not working), and most people don’t even know what risk level they are willing to take. That’s why, on experimental code (i.e. a spike), I don’t believe in doing test driven development. I’ve been fortunate enough to see, or is that to learn some lessons from, both extremes.

Maintaining a system only with acceptance tests (end to end)
I worked on one project where the architect banned unit tests, and we were only allowed integration and acceptance tests. He believed (rightly so) that acceptance tests let us change the design of the code without breaking functionality. His (implicit) choice of insurance was a long term one – ensure the state of the system constantly works over time, even with major architectural redesign. During my time on this project, we even redesigned some major functionality to make the system more understandable and maintainable. I doubt that without the acceptance tests, we would have had the confidence to move so quickly. The downside to this style of testing is that the code-test feedback was extremely slow. It was difficult to get any confidence that a small change was going to work. It felt so frustrating to move at such a slow pace without any faster levels of feedback.

Scenario driven acceptance tests (as opposed to the less maintainable, story-based acceptance tests) also provide better communication for end users of the system. I’ve often used them as a tool for aiding communication with end users or customer stakeholders to get a better understanding of what it is they think the system should be doing. It’s rare that you achieve the same with unit or integration tests because they tell you more about how a particular aspect is implemented, and rarely have the system context that acceptance tests have.

Maintaining a system only with unit tests
On another project, I saw a heavy use of mocks, and unit tests. All the developers moved really fast, enjoyed refactoring their code, yet on this particular project, I saw more and more issues where basic problems meant that starting up the application failed because all those tiny, well refactored objects just didn’t play well together. Some integration tests caught some of these, but I felt like this project could have benefited from at least a small set of acceptance tests to prevent the tester continuously deploying a broken application despite a set of passing unit tests.

What is your risk profile when it comes to testing?
I think every developer on a team must understand that different types of test give us different levels of feedback (see the testing aspect of the V-Model), and each has a different level of cost determined by constraints of technology and tools. You’re completely wrong if you declare that all automated acceptance tests are bad, or that all unit tests are awful. Instead you want to choose the right balance of tests (the insurance) that matches the system’s constraints for its entire lifetime. For some projects, it may make more sense to invest more time in acceptance tests because the risk of repeated mistakes is significantly costly. For others, the cost of manual testing mixed with the right set of unit and integration tests may make more sense. Appreciate the different levels of feedback tests bring, and understand the different levels of confidence over the lifetime of the system, not just the time you spend on the project.