Archive of articles classified as "redis"

Better Grails Batch Import Performance with Redis and Jesque

2011/10/13

A couple of years ago, I put up a well-received blog post on tuning Batch Import Performance with Grails and MySQL.

I’ve recently needed to revisit some batch importing procedures and have acquired a few extra tools in my Grails utility belt since writing that post: Grails Redis and Grails Jesque.

Redis is a very fast key/value store, where the values are not just strings, but are data structures like lists, sets, and hash maps. I’m the main author of the grails redis plugin, and it’s my favorite pragmatic technology of the past few years. If you’re new to Redis, check out the presentation slides I gave at this year’s gr8conf.

Jesque is a Java implementation of Resque, a Redis-backed message queueing system for creating background jobs. Resque was written in Ruby by the folks at GitHub. The Jesque plugin is fully integrated with Grails and allows you to create worker jobs that are spring injected and have an active hibernate session.

This combination makes parallelizing work very easy, as most of the pain of trying to spin off threads in Grails is handled for you by Jesque. Yes, there’s GPars, but the threads that it creates aren’t spring injected and don’t have hibernate sessions.

Using Jesque is as simple as:

  1. create a Job class that implements a perform method.
  2. tell Jesque to start up 1..n worker threads that monitor a queue and use your Job to process work
  3. enqueue work on the queue so workers can pick it up
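The three steps above can be sketched without Grails at all. This is only an illustration of the pattern, not Jesque's API: an in-memory LinkedBlockingQueue stands in for the Redis-backed queue, plain threads stand in for Jesque workers, and UppercaseJob and the STOP marker are names invented for this example.

```groovy
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.LinkedBlockingQueue

// Step 1: a job class with a perform method (hypothetical example job)
class UppercaseJob {
    static final Queue<String> results = new ConcurrentLinkedQueue<String>()

    void perform(String item) {
        results.add(item.toUpperCase())
    }
}

// In-memory stand-in for the Redis-backed queue (assumption: Jesque uses Redis here)
def queue = new LinkedBlockingQueue<String>()
def STOP = '__stop__'   // marker telling a worker thread to exit

// Step 2: start worker threads that monitor the queue and hand work to the job
int workerCount = 3
def workers = (1..workerCount).collect {
    Thread.start {
        def job = new UppercaseJob()
        while (true) {
            String item = queue.take()
            if (item == STOP) break
            job.perform(item)
        }
    }
}

// Step 3: enqueue work, then one stop marker per worker, and wait for them to finish
['a', 'b', 'c', 'd'].each { queue.put(it) }
workerCount.times { queue.put(STOP) }
workers*.join()

def sortedResults = new ArrayList<String>(UppercaseJob.results).sort()
```

With Jesque, the queue lives in Redis, so the code that enqueues work and the workers that consume it do not have to share a JVM the way they do in this sketch.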

I’ve created a bitbucket repository with all of the source code from the original Batch Import post, as well as with the enhancements below.

The example problem is that there is a Library class that produces metadata for 100,000 books that we want to persist in the database as Book domain objects.

package com.naleid.example
 
class Book {
    String title
    String isbn
    Integer edition
 
    static constraints = {
    }
 
    static mapping = {
        isbn column:'isbn', index:'book_isbn_idx'
    }
}

The naive way of doing this takes Grails ~3 hours to do the inserts. The original batch performance post showed how to improve this time from 3 hours to 3 minutes with a few Grails and MySQL tweaks.

Using Redis + Jesque to parallelize the task, I’m able to cut that time in half again to a little over 90 seconds on my MacBook Air.

On real-world imports, where there is quite a bit more data and potentially other linked domain objects that can be memoized with the redis-plugin, I’ve seen a >100x speed improvement over the original serial import, even with the tuning tips from my original post.

Install Redis and clone the test project from bitbucket to try it yourself. Run grails run-app, go to the running app on localhost, and click the link to the SerialBookController to see the original version, or the ParallelBookController to see the faster Redis+Jesque version. Each displays how long the insert took once it finishes.

The ParallelBookController calls bookService.parallelImportBooksInLibrary(). That method spins up a number of worker threads, iterates through the books in the Library and enqueues each one on a Jesque queue. When it’s done iterating through the Library, it tells all the threads to end when they’re done processing all the work:

    def parallelImportBooksInLibrary(library) {
        Integer workerCount = 10
        String queueName = "import:book"
        withWorkers(queueName, BookConsumerJob, workerCount) {
            library.each { Map bookValueMap ->
                String bookValueMapJson = (bookValueMap as JSON).toString()
                jesqueService.enqueue(queueName, BookConsumerJob.simpleName, bookValueMapJson)
            }
        }
    }
 
    void withWorkers(String queueName, Class jobClass, Integer workerCount = 5, Closure closure) {
        def workers = []
        def fullQueueName = "resque:queue:$queueName"
        try {
            workers = (1..workerCount).collect { jesqueService.startWorker(queueName, jobClass.simpleName, jobClass) }
            closure()
            // wait for all the work we've generated to be pulled off the queue
            while (redisService.exists(fullQueueName)) sleep(500)
        } finally {
            // all work is off the queue, tell each worker to kill themselves when they're finished
            workers*.end(false)
        }
    }

The workers that persist the Book domain objects in the database are very simple Jesque Job artefacts that are spring injected and have an active hibernate session. They can be of any class type. The only requirement is that they have a method named perform, which is called and passed an item of work from the queue.

Here’s the example BookConsumerJob class that persists a Book to the database:

package com.naleid.example
 
import grails.converters.JSON
 
class BookConsumerJob {
    def bookService
 
    void perform(String bookJson) {
        bookService.updateOrInsertBook(JSON.parse(bookJson))
    }
}

You can see how simple the BookConsumerJob class is. It also calls out to the same bookService method that the serial batch import calls to import a Book.

One other neat thing about using Jesque is that it adheres to the Resque conventions for what gets stored in Redis. This means that you can gem install resque-web and then launch resque-web to get a nice monitoring platform for your Jobs and to see errors, or how much work is left in the queue.

New Grails Redis Plugin Released

2011/08/8

I released the first version of the Grails Redis Plugin over the weekend. It’s a brand new plugin that takes the place of the previous redis plugin (which is being renamed to “redis-gorm” and has been refactored to use this plugin as a dependency). Its version is 1.0.0.M7 just so it’s “higher” than the plugin it’s replacing, though I’d probably make it a 0.9 release if I were releasing it under a new name, at least until I get a little more community feedback.

Quick description of what Redis is from the README:

The best definition of Redis that I’ve heard is that it is a “collection of data structures exposed over the network”.

Redis is an insanely fast key/value store, in some ways similar to memcached, but the values it stores aren’t just dumb blobs of data. Redis values are data structures like strings, lists, hash maps, sets, and sorted sets. Redis also can act as a lightweight pub/sub or message queueing system.

Redis is used in production today by a number of very popular websites including Craigslist, StackOverflow, GitHub, The Guardian, and Digg.

The Grails Redis plugin makes a Redis connection pool available (and injectable as a spring bean) to your Grails application.

To install the plugin, just execute this inside your application’s directory:

grails install-plugin redis

It allows you to transparently interact with your Redis instance by automatically handling retrieving a connection from the pool and ensuring that the connection is returned to the pool as it delegates to redis.

// overrides propertyMissing and methodMissing to delegate to redis
def redisService
 
redisService.foo = "bar"   
assert "bar" == redisService.foo   
 
redisService.sadd("months", "february")
assert true == redisService.sismember("months", "february")

One of the plugin’s greatest strengths is in the memoization methods and tag libraries that it adds. It’s a write-through cache (with optional TTL expiration). Before executing the closure/tag, it will check Redis to see if we’ve already calculated that value. If we have, we’ll just return the answer from Redis, otherwise, we’ll calculate it, and save it in Redis for future calls.

service method:

redisService.memoize("user:$userId:helloMessage") {
    // expensive to calculate method that returns a String
    "Hello ${security.currentLoggedInUser().firstName}"
}

taglib:

<redis:memoize key="mykey" expire="3600">
    <!-- 
        insert expensive to generate GSP content here 
 
        taglib body will be executed once, subsequent calls 
        will pull from redis till the key expires
    -->
    <div id='header'>
        ... expensive header stuff here that can be cached ...
    </div>
</redis:memoize>
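The check-then-compute flow behind memoization can be sketched in plain Groovy. This is not the plugin's implementation: a local map stands in for Redis, TTL expiration is left out, and the memoize closure and the calls counter are names made up for this illustration.

```groovy
// In-memory stand-in for Redis (assumption: the real plugin stores values in Redis)
def cache = [:]
int calls = 0   // counts how often the expensive closure actually runs

def memoize = { String key, Closure compute ->
    String cached = cache[key]
    if (cached != null) {
        return cached            // cache hit: skip the expensive work
    }
    String value = compute()     // cache miss: calculate the value
    cache[key] = value           // and save it for future calls
    value
}

def expensive = {
    calls++
    'Hello Ted'
}

def first = memoize('user:1:helloMessage', expensive)
def second = memoize('user:1:helloMessage', expensive)
```

The second call returns the stored value without running the closure again, which is the behavior that redisService.memoize and the redis:memoize tag provide, with Redis as the store.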

Check out the full documentation on the github repository.

If you’re new to using Redis with Groovy, I created an introductory post and gave a presentation at gr8conf that are good starting places.

If you use OSX for development, you might also find these instructions for automatically launching Redis on startup with launchd useful.

Groovy Script Using Redis to Pick Conference Lottery Winners

2011/06/28

At the end of gr8conf today there were quite a few door prize giveaways. Winners were picked using a printout with attendees listed in (I’m assuming) random order. The guys running the lottery were going down the list and calling off names.

This was right after my talk on using Redis with Groovy and I thought to myself, “this is a perfect example of where a quick redis script could automate this and make it a bit more groovy”. So I threw together this script in about 15 minutes:
Read the rest of this article »

Redis, Groovy and Grails presentation at gr8conf 2011 and GUM

2011/06/27

A couple of weeks ago, I gave a talk on Redis, Groovy and Grails at the Groovy Users of Minnesota meeting. I’m giving that presentation again tomorrow at gr8conf 2011 and I wanted to post the slides so that people had access to them on the web.

An original version of the keynote file that I presented can be found on my bitbucket account. You can download the repo or grab the raw versions off bitbucket.

Running Redis as a User Daemon on OSX with launchd

2011/03/5

If you’re developing on the Mac using Redis and want it to start automatically on boot, you’ll want to leverage the OSX launchd system to run it as a User Daemon. A User Daemon is a non-GUI program that runs in the background as part of the system. It isn’t associated with your user account. If you only want Redis to launch when a particular user logs in, you’ll want to make a User Agent instead.

From the command line, create a plist file as root in the /Library/LaunchDaemons directory with your favorite text editor:

sudo vim /Library/LaunchDaemons/io.redis.redis-server.plist

Paste in the following contents and modify it to point it to wherever you’ve got redis-server installed and optionally pass the location of a config file to it (delete the redis.conf line if you’re not using one):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>io.redis.redis-server</string>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/local/bin/redis-server</string>
		<string>/optional/path/to/redis.conf</string>
	</array>
	<key>RunAtLoad</key>
	<true/>
</dict>
</plist>

You’ll then need to load the file (one time) into launchd with launchctl:

sudo launchctl load /Library/LaunchDaemons/io.redis.redis-server.plist

Redis will now automatically be started after every boot. You can manually start it without rebooting with:

sudo launchctl start io.redis.redis-server

You can also shut down the server with:

sudo launchctl stop io.redis.redis-server

Or you could add these aliases to your bash/zsh rc file:

alias redisstart='sudo launchctl start io.redis.redis-server'
alias redisstop='sudo launchctl stop io.redis.redis-server'

If you’re having some sort of error (or just want to watch the logs), you can just fire up Console.app to watch the redis logs to see what’s going on.

Using the Grails BeanBuilder to Set Arbitrary Properties From an External Config

2011/02/25

I’m working with an existing library (Jedis, a Redis client library) that has a fairly complicated connection pool config with a large variety of properties that could be worth setting, depending on the environment that my Grails app is running in.

I wanted the ability to define the set of properties that I wanted to override in the config file without having to call them all out explicitly in the Spring resources.groovy file. If I missed one, or if the client library that I’m using added a new one that I don’t notice, I don’t want to have to release a new version of the code just to set it.

Grails allows you to load external config files simply by defining a reference to them in Config.groovy (this code is even commented out in the default Config.groovy file that gets generated automatically with a new grails app):

grails.config.locations = ["file:${userHome}/.grails/${appName}-config.groovy"]

After a little playing around with the BeanBuilder syntax, I was able to come up with a solution that lets me set whatever values I want in the Config file and have them set on the bean that I have Spring/Grails build.

If you have a config like this:

foo {
    foo = "bar"
    baz = 4
}

You can populate your resources.groovy with something like this to set whatever values are set in your config file:

beans = {
    def fooMap = application.config?.foo
 
    fooBean(Foo) {
        fooMap?.each { key, value ->
            delegate.setProperty(key, value)
        }
    }
}

This will make a bean that has its `foo` set to “bar” and its `baz` set to 4.

Later, if I find that I need to set a different property on the fooBean in production, I just add it to my config file and everything works without any code changes.
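The same idea, copying whatever keys a map holds onto an object through setProperty, can be tried in plain Groovy outside of Spring. This is only a sketch: the Foo class and the literal map are stand-ins for the real bean class and the parsed config, and in resources.groovy the delegate is the BeanBuilder's bean definition rather than a live instance.

```groovy
// Hypothetical bean class with the two properties used in the config example
class Foo {
    String foo
    Integer baz
}

// Stand-in for the foo block of the external config file
def fooMap = [foo: 'bar', baz: 4]

def bean = new Foo()
fooMap.each { key, value ->
    // every Groovy object exposes setProperty(String, Object)
    bean.setProperty(key, value)
}
```

A key that has no matching property on the class makes setProperty throw a MissingPropertyException in this plain Groovy version, so a typo in the map fails loudly here.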

Introduction to Using Redis with Groovy

2010/12/28

I’m more excited about Redis than just about any other technology right now.

Redis is an insanely fast key/value store, in some ways similar to memcached, but the values it stores aren’t just dumb blobs of data. They can also be hashes, lists, sets and sorted sets. It provides a number of atomic operations on each of those data types (ex: union and intersection methods on sets), and it has been called a “data structure server”.

It’s used in production today by a number of very popular websites including Craigslist, GitHub, The Guardian, and Digg.

Redis is of particular note for Groovy/Grails developers because its development is financially supported by VMware/SpringSource. Redis was also the first non-relational data store to be officially supported by core Grails developers.

It has excellent documentation and a super simple wire protocol that has made it easy for a ton of client libraries for just about every language to pop up. You could probably write a simple client library in a day. Once you know the commands, you can even interact with it through telnet (though you don’t have to :).
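To give a sense of how small the wire protocol is, here is a minimal sketch of how a client might encode a command in the request format Redis accepts: an array header followed by one length-prefixed bulk string per argument. The encodeCommand helper is a name made up for this illustration and is not part of any client library; opening a socket and parsing replies are left out.

```groovy
// Encode a command such as ['SET', 'foo', 'bar'] as an array of bulk strings:
//   *<number of arguments>\r\n
//   $<byte length of argument>\r\n<argument>\r\n   (repeated per argument)
byte[] encodeCommand(List<String> args) {
    def out = new ByteArrayOutputStream()
    out.write(('*' + args.size() + '\r\n').getBytes('UTF-8'))
    args.each { String arg ->
        byte[] bytes = arg.getBytes('UTF-8')
        // the length prefix counts bytes, not characters
        out.write(('$' + bytes.length + '\r\n').getBytes('UTF-8'))
        out.write(bytes)
        out.write('\r\n'.getBytes('UTF-8'))
    }
    out.toByteArray()
}

def encoded = new String(encodeCommand(['SET', 'foo', 'bar']), 'UTF-8')
```

Writing those bytes to a socket connected to a Redis server is all it takes to send the command, which is why so many small client libraries exist.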

Installing Redis

Read the rest of this article »
