Unit Testing Grails Services That Use Redis Without Stomping on Data


The Grails Redis plugin lets you use Redis as a store for all kinds of data. I find it especially useful as a complement to a primary datastore: I use Redis for its caching and set-operation strengths, and let relational (and other NoSQL) databases leverage theirs.

One limitation of Redis is that there isn’t an embedded version of it that’s easy to spin up and shut down for tests (like H2 is for databases). This can make testing more difficult, as your unit tests might stomp on the data that you want to keep in Redis when you’re doing a grails run-app. Your CI server might also have multiple sets of tests running concurrently that could hit race conditions if they’re all making assertions against the same Redis data.

The solution that I’ve come up with is to leverage Redis Database Numbers. Unlike a SQL database, Redis doesn’t have schemas to partition data. Instead it buckets data into various database numbers. By default, all operations are against database “0”. The default Redis configuration comes with 16 database buckets (numbered 0..15), but most people only use the first one (database 0). We can use those other buckets to hold our test data.

The factory class below creates a new Redis service that’s configured to point to a database number falling within testDatabaseRange (here 8..15). When you ask it for a new instance, it increments a key kept in database 0 and gives you back the modulus value that falls within your specified range. This lets multiple test servers operate against the same Redis instance and makes it so that they don’t use the same database at the same time.

package grails.plugin.redis.test

import grails.plugin.redis.RedisService
import redis.clients.jedis.JedisPool
import redis.clients.jedis.Protocol
import redis.clients.jedis.JedisPoolConfig
import redis.clients.jedis.Jedis

class TestRedisServiceFactory {

    public static Range testDatabaseRange = 8..15

    public static String DEFAULT_TEST_HOST = "localhost"
    public static int DEFAULT_TEST_PORT = Protocol.DEFAULT_PORT

    public static RedisService getInstance(Map configMap = [:]) {
        Map defaultedConfigMap = [
                host: DEFAULT_TEST_HOST,
                port: DEFAULT_TEST_PORT,
                timeout: Protocol.DEFAULT_TIMEOUT,
                password: null,
                poolConfig: [:]
        ] + configMap

        if (!configMap.database) defaultedConfigMap.database = nextTestDatabase(defaultedConfigMap)

        RedisService redisService = new RedisService()
        JedisPoolConfig poolConfig = new JedisPoolConfig()

        defaultedConfigMap.poolConfig.each { configKey, value ->
            poolConfig."$configKey" = value
        }

        JedisPool jedisPool = new JedisPool(
                poolConfig,
                defaultedConfigMap.host as String,
                defaultedConfigMap.port as Integer,
                defaultedConfigMap.timeout as Integer,
                defaultedConfigMap.password as String,
                defaultedConfigMap.database as Integer)

        redisService.redisPool = jedisPool
        return redisService
    }

    public static int nextTestDatabase(Map configMap) {
        Jedis jedis = new Jedis(configMap.host as String, configMap.port as int)
        assert jedis.ping() == "PONG", "Unable to ping redis server at ${configMap.host}:${configMap.port}, make sure you've got redis running"

        // cycle through redis databases so that we won't have collisions where redis tests for 2 CI jobs use the same database #
        Integer counter = jedis.incr("test:nextRedisDatabase")

        return (counter % (testDatabaseRange.max() + 1 - testDatabaseRange.min())) + testDatabaseRange.min()
    }

    private TestRedisServiceFactory() {}
}
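The rotation arithmetic at the heart of nextTestDatabase is easy to sanity-check outside of Groovy. Here is the same (counter % rangeSize) + min computation in plain shell, using the default 8..15 range from above:

```shell
# map an ever-increasing counter into the test database range 8..15,
# the same computation nextTestDatabase performs on the incremented key
min=8
max=15
for counter in 1 2 7 8 15 16; do
    db=$(( (counter % (max + 1 - min)) + min ))
    echo "counter=$counter -> database $db"
done
```

Counters 1 through 8 land on databases 9, 10, ... 15, 8 (all distinct), and counter 16 wraps back around to database 8, so eight concurrent test runs each get their own database before any reuse.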

Now you can create a unit test, and get full interaction with Redis, without worrying about deleting data you might want to keep:

import org.junit.After
import org.junit.Before
import org.junit.Test

class MyServiceUnitTests {

    RedisService redisService = TestRedisServiceFactory.getInstance()
    MyService myService

    @Before public void setUp() {
        redisService.flushDB()  // ensure that we're clean on setUp so no one else pollutes us
        myService = new MyService(redisService: redisService)
    }

    @After public void tearDown() {
        redisService.flushDB() // ensure that we don't leave anything around to pollute anyone else
    }

    @Test void testRedisServiceUnitTest() {
        redisService.foo = "bar"
        assert redisService.foo == "bar"
    }

    @Test void testMyServiceWithRedis() {
        // setup, possibly using redisService to insert data into redis that we want as part of test setup

        myService.someMethod()  // some method that calls into myService's redisService

        // test assertions, possibly using redisService to assert that expected modifications have been made
    }
}

The only case where you’d have a collision is if you had more concurrent test runs than testDatabaseRange holds, or if one server ran through many quick tests while another had a very slow one. The solution is to increase the number of databases that Redis allocates at startup to something much larger (like 1024) and change testDatabaseRange to cover that new range.
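That startup allocation is controlled by a single directive in redis.conf (the 1024 below is just an illustration; any value works):

```
# redis.conf -- Redis creates this many numbered databases at startup (default is 16)
databases 1024
```

After restarting Redis you’d widen the factory’s range to match, e.g. testDatabaseRange = 8..1023.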

Fixing Grails Tests That Pass in Isolation but Fail When Run as a Suite


If you’ve got a test that passes when run by itself but fails when run with the rest of the tests in your test suite, you’ve almost definitely got a test pollution problem. Test pollution happens when state from one test lives past its tearDown and impacts the results of subsequent tests.

This can happen in any kind of environment that has side-effects, but in Groovy and Grails programming it is most often caused by changes to a metaClass or a singleton instance that you forget to remove when the test is done.

One other place where test pollution often rears its head is when a set of tests runs fine in development but fails when pushed to an integration server. Often this is because the developer is working on OSX while the build server is on Linux, and those platforms run the tests in a different order.

These issues can be a real pain to track down. The most reliable approach is to figure out which tests run before the failing test (the tests that could be causing the failure) and run them in smaller and smaller batches until you find the minimal set of tests that, when run with your failing test, causes the failure; remove any one of them and everything passes.
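The batching idea can be sketched abstractly. In the toy script below, run_suite is a stand-in for running a batch of candidate tests together with the failing test; here it "fails" whenever a hypothetical polluting test, TestC, is included. Real pollution can involve a combination of tests, so treat this as the single-polluter special case:

```shell
# binary-search for the one earlier test that makes the failing test fail
polluter="TestC"   # unknown in real life; used here only to fake suite results

run_suite() {      # stand-in for: run these tests, then the failing test
    for t in "$@"; do
        [ "$t" = "$polluter" ] && return 1   # suite fails
    done
    return 0                                 # suite passes
}

set -- TestA TestB TestC TestD TestE TestF TestG TestH
while [ $# -gt 1 ]; do
    half=$(( $# / 2 ))
    first=$(printf '%s\n' "$@" | head -n $half | tr '\n' ' ')
    if run_suite $first; then
        shift $half        # first half is clean, polluter is in the rest
    else
        set -- $first      # failure reproduced, narrow to the first half
    fi
done
echo "polluting test: $1"
```

Each round halves the candidate list, so even a large suite narrows to the culprit in a handful of runs.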

Upgrading to Grails 2 Unit Testing


Grails 2 has a lot of great new unit testing features that make many test scenarios easier.

The Grails documentation does an OK job of describing some of the new features, but I couldn’t find anywhere with a comprehensive list of the changes you should make to your code when migrating from Grails 1.3.x to 2.0.x.

This blog post is the list of changes that I wished I had when I started to migrate our code.

Quick Shell Function to Bootstrap a Gradle Groovy Project


Gradle is a great build tool that’s easy to download and install. If you’re on a Mac with Homebrew, it’s as easy as:

brew install gradle

It’s very easy to use once you have a little experience, and I find that a good starting place to base your work on helps.

Here’s a quick function that I’ve got in my .zshrc to bootstrap a new groovy gradle project in the current directory (it should also work in a .profile/.bash_profile/.bashrc).

function newgradle() {
    echo "Creating files for new gradle project"

    cat > .gitignore << 'EOF'
# starting ignore list, tweak to fit your needs
build/
.gradle/
*.iml
*.ipr
*.iws
EOF

    cat > build.gradle << 'EOF'
apply plugin: 'groovy'
apply plugin: 'idea'

repositories {
    mavenCentral()
}

dependencies {
    groovy 'org.codehaus.groovy:groovy-all:1.8.6'
    groovy 'org.apache.ivy:ivy:2.2.0'
    testCompile 'junit:junit:4.10'
}

task createSourceDirs(description : 'Create empty source directories for all defined sourceSets') << {
    sourceSets*.allSource.srcDirs.flatten().each { File sourceDirectory ->
        if (!sourceDirectory.exists()) {
            println "Making $sourceDirectory"
            sourceDirectory.mkdirs()
        }
    }
}

idea {
    project {
        jdkName = '1.6'
        ipr {
            withXml { provider ->
                provider.node.component.find { it.@name == 'VcsDirectoryMappings' }.mapping.@vcs = 'Git'
            }
        }
    }
    module {
        downloadSources = true
        downloadJavadoc = true
    }
}
EOF

    gradle createSourceDirs

    git init
    ls -a1 && find src    # list all created assets
}

It creates a build.gradle file ready to work with java and groovy projects, including IDEA integration (just execute gradle idea).

This gives you all of the tasks necessary to compile, jar, test, and distribute your code. For more information, check out the gradle docs on the java, groovy, and idea tasks.

It also creates all the necessary source directories for you and initializes a new git repository (with starting .gitignore file) for you to save your work.

You can easily tweak the build.gradle or .gitignore files to fit your needs. If you don’t use git, you can either delete those lines, or substitute the equivalent lines for the source control tool you use. These are just a good starting place for me.

Here’s the sample output from the script above:

% mkdir testapp
% cd testapp
% newgradle                                                                   
Creating files for new gradle project
Making /Users/tnaleid/Documents/workspace/testapp/src/main/resources
Making /Users/tnaleid/Documents/workspace/testapp/src/main/java
Making /Users/tnaleid/Documents/workspace/testapp/src/main/groovy
Making /Users/tnaleid/Documents/workspace/testapp/src/test/resources
Making /Users/tnaleid/Documents/workspace/testapp/src/test/java
Making /Users/tnaleid/Documents/workspace/testapp/src/test/groovy


Total time: 2.344 secs
Initialized empty Git repository in /Users/tnaleid/Documents/workspace/testapp/.git/

Then you’ve got all this gradle functionality ready to use:

% gradle tasks                                                           

All tasks runnable from root project

Build tasks
assemble - Assembles all Jar, War, Zip, and Tar archives.
build - Assembles and tests this project.
buildDependents - Assembles and tests this project and all projects that depend on it.
buildNeeded - Assembles and tests this project and all projects it depends on.
classes - Assembles the main classes.
clean - Deletes the build directory.
jar - Assembles a jar archive containing the main classes.
testClasses - Assembles the test classes.

Documentation tasks
groovydoc - Generates Groovydoc API documentation for the main source code.
javadoc - Generates Javadoc API documentation for the main source code.

Help tasks
dependencies - Displays the dependencies of root project 'test'.
help - Displays a help message
projects - Displays the sub-projects of root project 'test'.
properties - Displays the properties of root project 'test'.
tasks - Displays the tasks runnable from root project 'test' (some of the displayed tasks may belong to subprojects).

IDE tasks
cleanIdea - Cleans IDEA project files (IML, IPR)
idea - Generates IDEA project files (IML, IPR, IWS)

Verification tasks
check - Runs all checks.
test - Runs the unit tests.

Other tasks
createSourceDirs - Create empty source directories for all defined sourceSets

Pattern: build: Assembles the artifacts of a configuration.
Pattern: upload: Assembles and uploads the artifacts belonging to a configuration.
Pattern: clean: Cleans the output files of a task.

To see all tasks and more detail, run with --all.


Total time: 4.871 secs

Finding and Purging Big Files From Git History


On a recent Grails project, we’re using a git repo that was originally converted from an SVN repo with a ton of large binary objects in it (lots of jar files that really should come from an ivy/maven repo). The .git directory was over a gigabyte in size, which made it very cumbersome to clone and manipulate.

We decided to leverage git’s history rewriting capabilities to make a much smaller repository (and kept our previous repo as a backup just in case).

Here are a few questions/answers that I figured out how to answer with git and some shell commands:

What object SHA is associated with each file in the Repo?

Git has a unique SHA that it associates with each object (such as files, which it calls blobs) throughout its history.

This helps us find that object and decide whether it’s worth deleting later on:

git rev-list --objects --all | sort -k 2 > allfileshas.txt

Take a look at the resulting allfileshas.txt file for the full list.

What Unique Files Exist Throughout The History of My Git Repo?

If you want to see the unique files throughout the history of your git repo (such as to grep for .jar files that you might have committed a while ago):

    git rev-list --objects --all | sort -k 2 | cut -f 2 -d\  | uniq

How Big Are The Files In My Repo?

We can find the big files in our repo by doing a git gc, which makes git compact the archive and store an index file that we can analyze.

Get the last object SHA for all committed files and sort them in biggest to smallest order:

git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt

Take that result and iterate through each line of it to find the SHA, file size in bytes, and real file name (you also need the allfileshas.txt output file from above):

for SHA in `cut -f 1 -d\  < bigobjects.txt`; do
    echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
done

(there’s probably a more efficient way to do this, but this was fast enough for my purposes with ~50k files in our repo)
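One more efficient variant (my own, not from the original commands) is a single join on the SHA column instead of two greps per line; it assumes bigobjects.txt and allfileshas.txt have the formats produced above. The first two printf lines just fabricate tiny sample listings so the sketch runs anywhere:

```shell
# sample data standing in for the real verify-pack / rev-list listings
printf '%s\n' 'aaaa111 blob 100 90 10' 'bbbb222 blob 500 400 20' > bigobjects.txt
printf '%s\n' 'aaaa111 docs/small.txt' 'bbbb222 lib/big.jar'     > allfileshas.txt

# sort both listings by SHA, join on the SHA, emit "SHA size path" biggest-first
sort -k 1,1 bigobjects.txt  > bigobjects.by-sha.txt
sort -k 1,1 allfileshas.txt > allfileshas.by-sha.txt
join bigobjects.by-sha.txt allfileshas.by-sha.txt \
    | awk '{print $1, $3, $6}' \
    | sort -k 2 -n -r > bigtosmall.txt

cat bigtosmall.txt
# bbbb222 500 lib/big.jar
# aaaa111 100 docs/small.txt
```

With ~50k files this does one pass over each file rather than 50k greps.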

Then, just take a look at the bigtosmall.txt file to see your biggest file culprits.

Purging a file or directory from history

Use filter-branch to remove the file/directory (replace MY-BIG-DIRECTORY-OR-FILE with the path that you’d like to delete relative to the root of the git repo):

git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all

Then clone the repo and make sure to not leave any hard links with:

git clone --no-hardlinks file:///Users/yourUser/your/full/repo/path repo-clone-name

You can use this command from the parent directory that contains your git repository and its clone to see how much space each of them takes, and how much you’ve shrunk the repo:

du -s *(/)     # *(/) is a zsh glob qualifier selecting directories; add the -h flag to see the output in human readable size formats, just like ls -lah vs ls -la

With these commands, I was able to reduce the file size of our repo with a few thousand commits to below the size of the checked-out repository (more than an order of magnitude smaller). I only removed old binary files; we still have full history for all code files.

How to Use Kdiff3 as a 3-way Merge Tool With Mercurial, Git, and Tower.app


There are a few very nice looking, mac-like diff tools for OSX (Kaleidoscope and Changes come to mind), but none for doing “real” merges. By this, I mean real, 3-way merges with all of the information you need in front of you.

There are no good-looking, “mac-like” merge tools, but if you swallow your pride there are a few different options for 3-way merges, including Araxis Merge ($$$!), DiffMerge, DeltaWalker, and FileMerge which comes free with XCode.

I’ve tried them all, and find them all confusing. They all tend to use a 3-pane display to do the merging, with your file in the left pane, the file you’re merging in the right pane, and the messy half-merged file in the middle.

That’s not enough information.

A 3-way merge actually has four important sources of information: your version, their version, the common ancestor both started from, and the merged result you’re building.

Speed Up Your Grails / Spring Security Development With an Auto Login Bookmarklet


When you’re doing dev on your website, how often do you log in with the same username and password? I bet it’s 20+ times a day when you’re actively developing.

Having to log in manually impedes development speed.

If you watch what your browser is doing when it’s interacting with a Spring Security application, you’ll see that (by default) it’s POSTing 2 parameters (j_username and j_password) to http://localhost:8080/YOURAPP/j_spring_security_check.

It’s easy to automate the login process with a little bit of vanilla javascript. Edit this javascript url to replace YOURAPP, YOURUSERNAME, and YOURPASSWORD, then make a bookmark out of it in your browser:
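The exact bookmarklet isn’t reproduced above, but one workable shape for it (my sketch, not necessarily the original) creates a form with those two fields and submits it. This snippet just prints such a url with placeholder values filled in; paste the printed line into a bookmark:

```shell
# print an auto-login bookmarklet url; YOURAPP/YOURUSERNAME/YOURPASSWORD are placeholders
APP=YOURAPP
USERNAME=YOURUSERNAME
PASSWORD=YOURPASSWORD
BOOKMARKLET="javascript:(function(){var f=document.createElement('form');f.method='POST';f.action='http://localhost:8080/$APP/j_spring_security_check';f.innerHTML='<input name=\"j_username\" value=\"$USERNAME\"><input name=\"j_password\" value=\"$PASSWORD\">';document.body.appendChild(f);f.submit();})()"
echo "$BOOKMARKLET"
```

The injected form POSTs the two fields Spring Security expects and navigates you into the app when it submits.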


Any time you want to log in, just click that bookmark. You’re now fully authenticated and in the app without having to interact with the login page.

Alternatively, if you’re using Google Chrome (or Firefox), you can create a “search engine” associated with a user-defined keyword. Type the keyword in the address bar to launch it.

You can even parameterize it to log in as a variety of users.

Say that you’ve got a number of different test users in your app: “admin”, “joeuser”, “sales”, “finance”, etc. All of the test users have the same password, but different usernames with different roles. If you make the username in the javascript url a “%s”, Chrome will replace that “%s” with your “search term”.

So if your app is “superapp” and all passwords are “password”, you can create a Chrome search engine that lets you log in as whatever test user you want:


To set it up, go into your preferences (cmd-,) and press the “Manage Search Engines” button.

Then under “Other Search Engines” click in the box to “Add a new search engine”

Name it with your app’s name (“superapp login”), set the keyword to an abbreviation of your app’s name (“sa”), and set the url to the edited javascript command to log in with your app’s url/username/password (potentially with the username as “%s” to parameterize it).

Once you save it, you can then go to your browser’s address bar (cmd-L) and type your abbreviation (“sa”) to get a new “search engine”. Then enter the username you want to log in as.

Hit enter and you’ll automatically be logged in to your app, without having to interact with your normal login page.

Automating this can help to keep you in the zone, especially if you’re using a security framework that allows deep linking.

If deep linking is enabled, the quickest way to get back to the page you’re iterating on after your session has expired (or you’ve bounced the app) is to reload the page. As it’s redirecting you to the login page, go to your address bar (cmd-L), type your keyword (ex: “sa”) and any associated username (ex: “admin”) and hit enter. You’ll be logged in before the login page displays and Spring Security will redirect you back to the page you originally requested.

Better Grails Batch Import Performance With Redis and Jesque


A couple of years ago, I put up a well-received blog post on tuning Batch Import Performance with Grails and MySQL.

I’ve recently needed to revisit some batch importing procedures and have acquired a few extra tools in my Grails utility belt since writing that post: Grails Redis and Grails Jesque.

Redis is a very fast key/value store, where the values are not just strings, but are data structures like lists, sets, and hash maps. I’m the main author of the grails redis plugin, and it’s my favorite pragmatic technology of the past few years. If you’re new to Redis, check out the presentation slides I gave at this year’s gr8conf.

Jesque is a Java implementation of Resque, a Redis-backed message queueing system for creating background jobs. The Jesque plugin is fully integrated with Grails and allows you to create worker jobs that are spring injected and have an active hibernate session. Resque was written in Ruby by the folks at GitHub.

This combination makes parallelizing work very easy, as most of the pain of trying to spin off threads in Grails is handled for you by Jesque. Yes, there’s GPars, but the threads it creates aren’t spring injected and don’t have hibernate sessions.

Using Jesque is as simple as:

  1. create a Job class that implements a perform method.
  2. tell Jesque to start up 1..n worker threads that monitor a queue and use your Job to process work
  3. enqueue work on the queue so workers can pick it up

I’ve created a bitbucket repository with all of the source code from the original Batch Import post, as well as with the enhancements below.

The example problem is that there is a Library class that produces metadata for 100,000 books that we want to persist in the database as Book domain objects.

package com.naleid.example

class Book {
    String title
    String isbn
    Integer edition

    static constraints = {
    }

    static mapping = {
        isbn column: 'isbn', index: 'book_isbn_idx'
    }
}
The naive way of doing this takes Grails ~3 hours to do the inserts. The original batch performance post showed how to improve this time from 3 hours to 3 minutes with a few Grails and MySQL tweaks.

Using Redis + Jesque to parallelize the task, I’m able to cut that time in half again to a little over 90 seconds on my MacBook Air.

On real-world imports, where there is quite a bit more data and potentially other linked domain objects that can be memoized with the redis-plugin, I’ve seen a >100x speed improvement over the original serial import, even with the tuning tips from my original post.

Install redis and clone the test project from bitbucket to try it yourself. Just grails run-app, go to the running app on localhost and click on the link to the SerialBookController to see the original version, or the ParallelBookController to see the faster Redis+Jesque version. Each will display the length of time it took to do the insert after it’s done.

The ParallelBookController calls bookService.parallelImportBooksInLibrary(). That method spins up a number of worker threads, iterates through the books in the Library and enqueues each one on a Jesque queue. When it’s done iterating through the Library, it tells all the threads to end when they’re done processing all the work:

    def parallelImportBooksInLibrary(library) {
        Integer workerCount = 10
        String queueName = "import:book"
        withWorkers(queueName, BookConsumerJob, workerCount) {
            library.each { Map bookValueMap ->
                String bookValueMapJson = (bookValueMap as JSON).toString()
                jesqueService.enqueue(queueName, BookConsumerJob.simpleName, bookValueMapJson)
            }
        }
    }

    void withWorkers(String queueName, Class jobClass, Integer workerCount = 5, Closure closure) {
        def workers = []
        def fullQueueName = "resque:queue:$queueName"
        try {
            workers = (1..workerCount).collect { jesqueService.startWorker(queueName, jobClass.simpleName, jobClass) }
            closure()
            // wait for all the work we've generated to be pulled off the queue
            while (redisService.exists(fullQueueName)) sleep(500)
        } finally {
            // all work is off the queue, tell each worker to kill itself when it's finished
            workers.each { it.end(false) }
        }
    }
The workers that persist the Book domain objects to the database use very simple Jesque Job artefacts that are spring injected and have an active hibernate session. Jobs can be of any class type; the only requirement is a method named perform that is called with an item of work from the queue.

Here’s the example BookConsumerJob class that persists a Book to the database:

package com.naleid.example

import grails.converters.JSON

class BookConsumerJob {
    def bookService

    void perform(String bookJson) {
        // pass the parsed map to the same bookService import method the serial version uses
        // (importBook is an illustrative name; see the bitbucket repo for the real one)
        bookService.importBook(JSON.parse(bookJson) as Map)
    }
}
You can see how simple the BookConsumerJob class is. It also calls out to the same bookService method that the serial batch import calls to import a Book.

One other neat thing about using Jesque is that it adheres to the Resque conventions for what gets stored in Redis. This means that you can gem install resque-web and then launch resque-web to get a nice monitoring platform for your Jobs and to see errors, or how much work is left in the queue.

Using Dropbox to Share (Most of) Your Home Directory Across Multiple Computers


I’m a very happy customer of Dropbox. It allows painless syncing of files across multiple computers without extra features to complicate it. The top rated answer on Quora to the question “Why is Dropbox more popular than other programs with similar functionality?” sums things up perfectly.

One of my favorite uses of Dropbox is to sync almost all of the non-machine specific configuration files and directories in my home directory across all my OSX computers (currently my iMac, MacBook Air, and my work laptop).

Doing this lets me make a configuration change to one computer and have it almost instantly available on any other computer without any manual steps.

This is especially important for my zshell and Vim configurations as I’m always tweaking those, but it’s also helpful to have my Documents, Downloads and Pictures shared.

I have a folder in my Dropbox directory called home, and I use a script called link.sh to automatically create symlinks in my home directory to the things I’ve got stored in Dropbox.

Among other dotfiles and directories, Dropbox/home currently has these machine-specific Desktop folders in it:

Desktop-starling.local/   # unique Desktop for my MacBook Air
Desktop-kestrel.local/    # unique Desktop for my iMac
Desktop-thrush.local/     # unique Desktop for my work MacBook Pro

My Dropbox/home directory also has a shell script in it called link.sh:

#! /usr/bin/env bash
cd $(dirname $0)

function linkFile() {
    LINK_TO_NAME=$2
    if [ -z $LINK_TO_NAME ]; then
        LINK_TO_NAME=$1
    fi

    if [ -h $HOME/$LINK_TO_NAME ]; then
        echo "Already symlinked $LINK_TO_NAME, skipping..."
    elif [ -a $HOME/$LINK_TO_NAME ]; then
        echo "**** Found existing $LINK_TO_NAME, skipping..."
    else
        echo "Linking $1 to $LINK_TO_NAME"
        ln -s $PWD/$1 $HOME/$LINK_TO_NAME
    fi
}

for F in $(ls -a1 | grep -v link.sh | grep -v Desktop | egrep -v "^..?$" | egrep -v "^.*un~$" | grep -v .DS_Store); do
    linkFile $F
done

export HOSTNAME=$(hostname)

if [ -d "Desktop-$HOSTNAME" ]; then
    linkFile "Desktop-$HOSTNAME" "Desktop"
else
    echo "Unable to find Desktop-$HOSTNAME to link to Desktop"
fi

What the script does is:

  1. cd into the directory that the script is located in (it only symlinks files in the same directory)
  2. list out all of the files and directories in the same directory as the script
  3. filter out the things we don’t want to link (like ., .., the link.sh script itself, etc)
  4. For all of the files/directories that pass the filter, call linkFile to create a symlink in the current user’s home directory as long as there isn’t already a file or a symlink there
  5. Then look for a directory called Desktop-$HOSTNAME, where $HOSTNAME is the name of the current machine, and create a ~/Desktop symlink to it if it’s found.

It should be safe and non-destructive and only create symlinks when there isn’t anything else there with the same name.

I didn’t have my Pictures, Documents, and Downloads in my Dropbox for quite a while and was able to get away with the free 2GB plan. I recently upgraded to a paid Dropbox plan as I wanted those directories shared as well (though I exclude a couple of them from my work MacBook Pro).

For “special” directories like Desktop, Pictures, Documents, and Downloads, I needed to use sudo rm -r [dirname] to remove it before I could create the symlink (BACKUP THE DIRECTORY FIRST).

I’ve been using this for over a year, and haven’t noticed any apps that care that those directories are symlinks.

Also? I have used this shell script many times on my systems, and I think it’s safe, but PLEASE back up before using it or deleting any directories. An adult crying is not a pretty sight :).