Fixing Grails Tests That Pass in Isolation but Fail When Run as a Suite

| Comments

If you’ve got a test that passes when run by itself but fails when run with the rest of the tests in your test suite, you’ve almost definitely got a test pollution problem. Test pollution happens when the results of one test live past the tearDown of a test and impact the results of subsequent tests.

This can happen in any kind of environment that has side-effects, but in Groovy and Grails programming, is most often caused by making changes to a metaClass or singleton instance that you forget to remove when the test is done.

One other place where test pollution often rears it’s head is when a set of tests run fine in development, but when pushed to an integration server it fails. Often, this is because the developer is working on OSX, but the build server is on Linux and those platforms run the tests in a different order.

These issues can be a real pain to track down and the most reliable approach is to figure out what tests run before the failing test (so tests that could be causing the failure) and run them in smaller and smaller batches till you figure out what the minimal set of tests are that, when run with your test, causes the failure to happen. Remove any test and the tests all pass.

In my recent upgrade of an application from grails 1.3.7 to grails 2.0.3 I hit quite a few tests that weren’t properly cleaning up after themselves that needed to be identified. Running the test sets by hand became very tedious and time-consuming, so I automated it with a script.

To use the script, you need to have a file that contains a list of all of your tests in the order you want them to run (the order that is currently failing). The most reliable way to get this is to parse the test reports that grails test-app generates. To make it easy, just stick this in your .bashrc/.zshrc:

function grailsTestOrder() {
    grep testsuite target/test-reports/TESTS-TestSuites.xml | grep -v testsuites | cut -d\  -f8-9 | sed -E 's/name="(.*)" package="(.*)"/\2.\1/' | grep .

alias testorder=grailsTestOrder

It parses the generated TESTS-TestSuites.xml file and emits test class names with their full package, one per line. Now you can run these commands to generate a file called testorder.txt that shows the order that your tests run in.

    grails test-app    # expecting a failure to happen, we just want the test order so we can replicate the issue
    testorder > testorder.txt

Using that file, and the full package and class of the failing test, we can use this script to have it automatically identify the minimal set of failing tests:


#!/usr/bin/env groovy

// this script finds the test class that's causing test pollution in other tests.  Just pass it
// a file name containing the test order that is failing and the particular test that is failing
// it will start doing a binary search by running combinations of tests that occur before the failing test
// and will eventually identify the smallest set of tests that are failing when run together
class BisectGrailsTests {

    String testOrderFilePath
    File testOrderFile
    String pollutedTest
    String testPhase = null
    List lastFailingTestList

    public BisectGrailsTests(args) {

    def parseArgs(args) {
        def cli = new CliBuilder(usage: "bisectGrailsTests.groovy -f testFile.txt -t com.example.FailingTest")

        cli.with {
            f longOpt: 'testOrderFilePath', args: 1, argName: 'filePath', "A file containing the test class names in the order you'd like to run them"
            t longOpt: 'pollutedTest', args: 1, argName: 'pollutedTest', "The test class name that passes when run by itself but fails when the other tests are run"
            p longOpt: 'testPhase', args: 1, argName: 'testPhase', "Only run with a particular test phase (ex: 'unit', 'integration', etc)"

        def options = cli.parse(args)

        if(options.'testOrderFilePath') {
            testOrderFilePath = options.'testOrderFilePath'
            testOrderFile = new File(testOrderFilePath)

        if (options.'pollutedTest') pollutedTest = options.'pollutedTest'

        if (options.'testPhase') {
            testPhase = options.'testPhase'
            println "Only running test phase $testPhase"

        if (!pollutedTest || !testOrderFilePath) {
        } else if (!testOrderFile.exists()) {
            println "Unable to find test order file from given path: $testOrderFilePath"

    void bisectTests() {
        // run test by itself to ensure it passes
        if (!testsPass("solo", [pollutedTest])) {
            println "The passed test ($pollutedTest) fails when run by itself, there isn't test pollution we can detect if it's failing on it's own"
            println "Here's the command to execute to see if you've fixed things:\n" + createCommand(withPollutedTest([pollutedTest]))

        // now recursively run test with other tests and bisect till it fails

    List bisectTests(List testList) {

        // if the test list size is > 1, split into left and right hand sides
        if (testList.size() > 1) {
            List lhs = leftHandSide(testList)
            List rhs = rightHandSide(testList)

            if (!testsPass("left", withPollutedTest(lhs)))  return bisectTests(lhs)

            if (!testsPass("right", withPollutedTest(rhs)))  return bisectTests(rhs)

            if (testsPass("full", withPollutedTest(testList))) {
                println "Unable to find any failing tests with this list, running them all appears to run without issue: \n" + testList.join(" ")
                return null

            println "Unable to find just one test that's causing problems.  Running just with the left hand or right hand side of this narrowed list passes, but the full list fails"
            println "full list (fails): " + testList.join(" ")
            println "left hand side (passes): " + lhs.join(" ")
            println "right hand side (passes): " + rhs.join(" ")
        } else if (!testsPass("suspected", withPollutedTest(testList))){
            println "The test that's causing pollution: " + testList.join(" ") // should only be 1
            println "Here's the command to execute to see if you've fixed things:\n" + createCommand(withPollutedTest(testList))
            return testList
        println "Not sure what's happening, got to this list of tests, but everything passes with this list: \n" + testList.join(" ")

    List leftHandSide(List testList) {
        return testList[0..(testList.size()/2 - 1)]

    List rightHandSide(List testList) {
        return testList[(testList.size()/2)..-1]

    List withPollutedTest(List testList) {
        return [testList, pollutedTest].flatten()

    List getTruncatedTestList() {
        List testList = []

        testOrderFile.eachLine { String line -> testList << line.trim() }

        Integer badTestAt = testList.indexOf(pollutedTest)

        if (badTestAt < 0) {
            // it's not in the list, don't truncate it and we'll just append it on the end when we need it
            return testList
        } else if (badTestAt == 0) {
            println "The bad test is the first test in the file list, it cannot be test pollution that is affecting this test"
        return testList[0..<badTestAt]

    Boolean testsPass(String runName, List testList) {
        String command = createCommand(testList)
        return runCommand(runName, command) == 0

    String createCommand(List testList) {
        String command = "grails test-app "
        if (testPhase) command += "$testPhase: "
        return command + testList.join(" ")

    def out(prefix, message) {
        println("${prefix.padLeft(12, ' ')}: $message")

    Integer runCommand(runName, command) {
        out runName, command
        return command.execute().with { proc ->
   { line -> out runName, line }

new BisectGrailsTests(args).bisectTests()

Save that file in your path and make it executable with chmod +x bisectGrailsTests.groovy. Now, if you run it without any parameters, it will show you the usage instructions:

% bisectGrailsTests.groovy                                                                                                                                                     
usage: bisectGrailsTests.groovy -f testFile.txt -t com.example.FailingTest
 -f,--testOrderFilePath    A file containing the test class names in the order you'd like to run them
 -p,--testPhase           Only run with a particular test phase (ex: 'unit', 'integration', etc)
 -t,--pollutedTest     The test class name that passes when run by itself but fails when the other tests are run

To run it for real, just specify the path to the testorder.txt file and the class (with package) that’s failing:

bisectGrailsTests.groovy -f testorder.txt -t com.yourcompany.TheFailingTests

Optionally, you can also pass it a testPhase (“unit”, “integration”, “functional”) if you only want to run tests from that phase and don’t think the pollution is happening across test phases (this is the most useful when the failure is happening in unit tests).

When the script runs, it goes through the following algorithm to identify the polluting test:

  1. Assert that the pollutedTest does pass when run by itself
  2. Now, identify where the test occurs within the full list of tests and make a list of all test names before that test
  3. Then take the first half of the test list and run them in order followed by the polluted test
  4. If there is no failure in the first half, take the 2nd half of the tests and run those followed by the polluted test
  5. Take whichever half failed (so it contains the polluter) and recursively go back to step 3 with that half of tests, repeat until we find the test that’s causing our test to fail

Once it’s narrowed down the smallest list that’s still causing failures, it will spit out the grails test-app command with the failing tests so it’s easy for you to reproduce.

Now that you know what test is causing the issue, you need to figure out what in that test is the problem. The best approach here is to just start commenting out all of the tests in the suspected polluter and see if it still causes a failure, slowly start putting code back in until you figure out what’s causing the issue. We’re back to manual steps now, but at least you’re working within a single test rather than your entire test suite.