Finding and Purging Big Files From Git History

2012/01/17

On a recent grails project, we’re using a git repo that was originally converted from an SVN repo with a ton of large binary objects in it (lots of jar files that really should come from an ivy/maven repo). The .git directory was over a gigabyte in size and this made it very cumbersome to clone and manipulate.

We decided to leverage git’s history rewriting capabilities to make a much smaller repository (and kept our previous repo as a backup just in case).

Here are a few questions/answers that I figured out how to answer with git and some shell commands:

What object SHA is associated with each file in the Repo?

Git has a unique SHA that it associates with each object (such as files, which it calls blobs) throughout its history.

This helps us find that object and decide whether it’s worth deleting later on:

git rev-list --objects --all | sort -k 2 > allfileshas.txt

Take a look at the resulting allfileshas.txt file for the full list.

What Unique Files Exist Throughout The History of My Git Repo?

If you want to see the unique files throughout the history of your git repo (such as to grep for .jar files that you might have committed a while ago):

    git rev-list --objects --all | sort -k 2 | cut -f 2 -d\  | uniq

How Big Are The Files In My Repo?

We can find the big files in our repo by doing a git gc which makes git compact the archive and stores an index file that we can analyse.

Get the last object SHA for all committed files and sort them in biggest to smallest order:

git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt

Take that result and iterate through each line of it to find the SHA, file size in bytes, and real file name (you also need the allfileshas.txt output file from above):

for SHA in `cut -f 1 -d\  < bigobjects.txt`; do
echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
done;

(there’s probably a more efficient way to do this, but this was fast enough for my purposes with ~50k files in our repo)

Then, just take a look at the bigtosmall.txt file to see your biggest file culprits.
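If the loop above gets too slow on a bigger repo, a single awk pass can do the same join. This is only a sketch, not what I originally ran: join_sizes_with_names is a name I made up, and it assumes the two files have the formats produced by the commands above (allfileshas.txt as "SHA path" and bigobjects.txt as "SHA blob size packed-size offset"):

```shell
# Join object sizes with file names in one pass instead of grepping per SHA.
#   $1 = allfileshas.txt  (lines of "SHA path")
#   $2 = bigobjects.txt   (lines of "SHA blob size packed-size offset")
# Prints "SHA size path" in the same order as bigobjects.txt.
join_sizes_with_names() {
    awk '
        # first file: remember the path for each SHA (paths may contain spaces)
        NR == FNR { sha = $1; sub(/^[^ ]+ ?/, ""); name[sha] = $0; next }
        # second file: print SHA, size in bytes, and the remembered path
        ($1 in name) { print $1, $3, name[$1] }
    ' "$1" "$2"
}
```

Usage would be something like: join_sizes_with_names allfileshas.txt bigobjects.txt > bigtosmall.txt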

Purging a file or directory from history

Use filter-branch to remove the file/directory (replace MY-BIG-DIRECTORY-OR-FILE with the path that you’d like to delete relative to the root of the git repo):

git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all

Then clone the repo and make sure to not leave any hard links with:

git clone --no-hardlinks file:///Users/yourUser/your/full/repo/path repo-clone-name

You can use this command from the parent directory that contains your git repository and its clone to see how much space each of them takes, and how much you’ve shrunk the repo in size:

du -s *(/)     # *(/) is a zsh glob that matches only directories (in bash, use */); add the -h flag to see the output in human readable size formats, just like ls -lah vs ls -la

With these commands, I was able to reduce the file size of our repo with a few thousand commits down below the size of the checked out repository (more than an order of magnitude smaller). I only removed old binary files; we still have full history for all code files.


How to use kdiff3 as a 3-way merge tool with mercurial, git, and Tower.app

2012/01/12

There are a few very nice looking, mac-like diff tools for OSX (Kaleidoscope and Changes come to mind), but none for doing “real” merges. By this, I mean real, 3-way merges with all of the information you need in front of you.

There are no good-looking, “mac-like” merge tools, but if you swallow your pride there are a few different options for 3-way merges, including Araxis Merge ($$$!), DiffMerge, DeltaWalker, and FileMerge which comes free with XCode.

I’ve tried them all, and find them all confusing. They all tend to use a 3-pane display to do the merging with your file in the left pane, the file you’re merging in the right pane, and the messy half-merged file in the middle.

That’s not enough information.

A 3-way merge actually has four important sources of information:

  • LOCAL – your file with the changes you’ve made to it
  • REMOTE – the file you’re merging in, possibly authored by someone else
  • BASE – the common ancestor file that LOCAL and REMOTE came from
  • MERGE_RESULT – the file resulting from the merge where you resolve conflicts

You often need to see all four of these pieces of information to make intelligent choices: where you came from (LOCAL), where the other person’s changes came from (REMOTE), where you both started (BASE), and where you are now (MERGE_RESULT).

Most other 3-way merge tools either conflate or omit the BASE and that can make it harder to see what the right thing to do is.
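To make the role of BASE concrete, here is a toy sketch of the decision a 3-way merge makes for a single line. It is not how kdiff3 or any SCM actually implements merging (real tools work on whole hunks), and merge_line is a made-up name, but it shows why the result can’t be decided from LOCAL and REMOTE alone:

```shell
# Toy 3-way merge of one line: merge_line BASE LOCAL REMOTE
# Prints the merged line, or prints CONFLICT and returns 1.
merge_line() {
    base=$1
    local_side=$2
    remote_side=$3
    if [ "$local_side" = "$remote_side" ]; then
        printf '%s\n' "$local_side"     # both sides agree, nothing to decide
    elif [ "$local_side" = "$base" ]; then
        printf '%s\n' "$remote_side"    # only REMOTE changed the line
    elif [ "$remote_side" = "$base" ]; then
        printf '%s\n' "$local_side"     # only LOCAL changed the line
    else
        printf 'CONFLICT\n'             # both changed it differently
        return 1
    fi
}
```

Without BASE, the second and third cases look identical to the last one: all you can tell is that LOCAL and REMOTE differ.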

Kdiff3 is my merge tool of choice. It’s not pretty; it’s cross-platform Qt based so it has a very old-school linux GUI feel to it. But like linux, it’s functional and can help you quickly be productive once you get over the learning curve.

If you’ve got 2 heads that you need to merge in your current repository:

We made this modification to file.txt:

diff --git a/file.txt b/file.txt
--- a/file.txt
+++ b/file.txt
@@ -1,1 +1,1 @@
-BASE: common_ancestor
+LOCAL: file in current working branch

and they made this change to the same file and line:

diff --git a/file.txt b/file.txt
--- a/file.txt
+++ b/file.txt
@@ -1,1 +1,1 @@
-BASE: common_ancestor
+REMOTE: other branch we're merging in

If you run your SCM’s merge command (here, in mercurial: hg merge -r 2), and have kdiff3 configured as your merge tool, you’ll get a pop-up window like this:

As you can see, it shows you all 4 pieces of information, BASE, LOCAL, and REMOTE on top, and the MERGE_RESULT file on the bottom. It currently has a Merge Conflict that you need to fix.

You can move from one unresolved conflict to the next using the triple up and triple-down colored arrows in the middle of the tool bar. When a conflict is highlighted, you can press any combination of the A, B, and C buttons in the toolbar. Pressing one of those buttons will resolve the conflict with the code from pane A, B, or C on top. So if the LOCAL file (your file) had the right changes in it, you’d press B.

It’s possible to press more than one button if code from multiple panes is valid. You can also directly edit the file in the MERGE_RESULT pane to make manual changes if the correct merge is not the exact text in A/B/C.

Another option, if you want to take all of the changes from one file and discard any changes from the others, is to go to the “Merge” menu and pick one of “Choose A Everywhere”, “Choose B Everywhere”, or “Choose C Everywhere”.

Once you’ve resolved your file, simply save it (cmd-S) and quit out of kdiff3. Your SCM should see the MERGE_RESULT no longer has any merge conflicts and will mark it as resolved, ready for you to commit it. If there are other files with merge conflicts, you can repeat the process with those files.

Installing kdiff3

Installing kdiff3 is as easy as downloading the latest version from sourceforge and copying it to your /Applications directory.

The app is a simple wrapper around a Qt-based application. It can be run from the command line at /Applications/kdiff3.app/Contents/MacOS/kdiff3. You can make a symlink of that into your path, but I assume in the instructions below only that you’ve got kdiff3 in your /Applications folder.

Mercurial Integration

Mercurial command line integration is pretty easy. Just open up your ~/.hgrc file (or create one if you don’t have it already), and add this to it:

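# the extdiff extension has to be enabled for the "hg kdiff3" command to exist
[extensions]
extdiff =

[extdiff]
cmd.kdiff3 = /Applications/kdiff3.app/Contents/MacOS/kdiff3

[merge-tools]
# needed when kdiff3 is not on your PATH
kdiff3.executable = /Applications/kdiff3.app/Contents/MacOS/kdiff3
kdiff3.args = $base $local $other -o $output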

That configures kdiff3 as your merge tool of choice, so it should pop up automatically when you hit a merge conflict. You can also use it as a diff tool:

hg kdiff3 -r revision_to_compare_to

Git Command-Line Integration

To configure the git command line to use kdiff3 as a diff and merge tool, add this to your ~/.gitconfig:

[difftool "kdiff3"]
    path = /Applications/kdiff3.app/Contents/MacOS/kdiff3
    trustExitCode = false
[difftool]
    prompt = false
[diff]
    tool = kdiff3
[mergetool "kdiff3"]
    path = /Applications/kdiff3.app/Contents/MacOS/kdiff3
    trustExitCode = false
[mergetool]
    keepBackup = false
[merge]
    tool = kdiff3

Now you can use the external tool commands to look at diffs:

git difftool [revision_sha]

and fix unresolved merge conflicts:

git mergetool

Git Tower Integration

I’m normally a pretty hard-core command line user, but I still sometimes find the places I get myself in git confusing enough that I’m willing to use a GUI to get myself out. I think that git-tower is the best git GUI available right now.

It can show you diff files and lets you browse through history in its UI. It also lets you do a number of basic and mid-level commands (adding, committing, rebasing, squashing commits, cherry-picking, etc), but it doesn’t have a built-in diff or merge tool.

It does come with the ability to integrate with a few different merge tools out of the box, but not kdiff3.

After a little digging around, I found a generic post showing how to configure Tower with a plist file and a shim shell script, but nothing specific to kdiff3.

I figured out how to get Tower and kdiff3 to play nice with each other, and have the instructions and files in a GitHub repository. To get it working, just clone the repository to a temporary directory, and run the install.sh script:

cd /tmp
git clone git@github.com:tednaleid/git-tower-kdiff3-shim.git 
cd git-tower-kdiff3-shim
./install.sh

It simply copies the plist file and the shim script into the appropriate directories (and Tower is very picky about those locations).

Then you need to go into Tower’s Preferences->Git Config control panel, and choose kdiff3 as the diff and merge tool:

For a quick rundown of how Tower works with kdiff3 as a merge tool, here’s an example where we’ve got one commit with changes that we made locally:

And another commit on origin/master where someone else made changes to the same piece of code:

We’ve pulled it down and now have 2 heads within our repository. To fix this, we need to merge and resolve the conflicts. Hit the merge button and we’ll get an error message because there are conflicts that git can’t resolve automatically:

If you highlight a file with merge conflicts, and then hit the “Merge Tool” button, it will bring up kdiff3 and let us use it to resolve the issue:

Fix the merge conflicts in kdiff3 (ex: press “C” to change it to “Goodbye Cruel World”), save it, and quit out of kdiff3. Tower should see that the file has had its conflicts resolved and let you commit the merge and carry on.

You can also use Kdiff3 as a regular diff tool if you don’t like looking at diff files. Just choose a file that you’ve modified to diff:

Press the “Diff Tool” button, and it’ll pop up in kdiff3:

Kdiff3 might be homely, but it’s easy to use once you understand how it works.


Getting a Clojure REPL in Vim with VimClojure, Nailgun, and Leiningen

2011/12/19

Having a Clojure REPL (Read Eval Print Loop) right inside Vim makes it easier to test ideas, get documentation, and explore your code. There are a few hoops that you need to jump through to enable it, but the payoff is worth it.


Speed up your Grails / Spring Security Development with an Auto Login Bookmarklet

2011/11/29

When you’re doing dev on your website, how often do you log in with the same username and password? I bet it’s 20+ times a day when you’re actively developing.

Having to log in manually impedes development speed.

If you watch what your browser is doing when it’s interacting with a Spring Security application, you’ll see that (by default) it’s POSTing 2 parameters (j_username and j_password) to http://localhost:8080/YOURAPP/j_spring_security_check.

It’s easy to automate the login process with a little bit of vanilla javascript. Edit this javascript url to replace YOURAPP, YOURUSERNAME, and YOURPASSWORD, then make a bookmark out of it in your browser:

javascript:(function(){var%20path='http://localhost:8080/YOURAPP/j_spring_security_check';var%20params={'j_username':'YOURUSERNAME','j_password':'YOURPASSWORD'};var%20form=document.createElement("form");form.setAttribute("method","POST");form.setAttribute("action",path);for(var%20key%20in%20params){var%20hiddenField=document.createElement("input");hiddenField.setAttribute("type","hidden");hiddenField.setAttribute("name",key);hiddenField.setAttribute("value",params[key]);form.appendChild(hiddenField);}document.body.appendChild(form);form.submit();}());

Any time you want to log in, just click that bookmark. You’re now fully authenticated and in the app without having to interact with the login page.
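The same POST also works outside the browser, which can be handy for quick scripted checks against a dev instance. This is a rough sketch that assumes curl is installed and that your app uses the default Spring Security parameter names described above; login_url and spring_login are names I made up for illustration:

```shell
# Build the login URL for an app running locally.
#   $1 = app name, $2 = port (optional, defaults to 8080)
login_url() {
    printf 'http://localhost:%s/%s/j_spring_security_check\n' "${2:-8080}" "$1"
}

# POST the two default Spring Security form parameters and keep the
# session cookie in a cookie jar file for later requests.
#   $1 = app name, $2 = username, $3 = password, $4 = cookie jar file
spring_login() {
    curl --silent --output /dev/null --cookie-jar "$4" \
         --data-urlencode "j_username=$2" \
         --data-urlencode "j_password=$3" \
         "$(login_url "$1")"
}
```

After something like spring_login superapp admin password /tmp/superapp.cookies, later curl calls can pass --cookie /tmp/superapp.cookies to reuse the session.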

Alternatively, if you’re using Google Chrome (or Firefox), you can create a “search engine” associated with a user-defined keyword. Type the keyword in the address bar to launch it.

You can even parameterize it to log in as a variety of users.

Say that you’ve got a number of different test users in your app: “admin”, “joeuser”, “sales”, “finance”, etc. All of the test users have the same password, but different usernames with different roles. If you make the username in the javascript url a “%s”, Chrome will replace that “%s” with your “search term”.

So if your app is “superapp” and all passwords are “password”, you can use this to create a Chrome search engine that lets you log in with whatever test user you want:

javascript:(function(){var%20path='http://localhost:8080/superapp/j_spring_security_check';var%20params={'j_username':'%s','j_password':'password'};var%20form=document.createElement("form");form.setAttribute("method","POST");form.setAttribute("action",path);for(var%20key%20in%20params){var%20hiddenField=document.createElement("input");hiddenField.setAttribute("type","hidden");hiddenField.setAttribute("name",key);hiddenField.setAttribute("value",params[key]);form.appendChild(hiddenField);}document.body.appendChild(form);form.submit();}());

To set it up, go into your preferences (cmd-,) and press the “Manage Search Engines” button.

Then under “Other Search Engines” click in the box to “Add a new search engine”

Name it with your app’s name (“superapp login”), set the keyword to an abbreviation of your app’s name (“sa”), and set the url to the edited javascript command to log in with your app’s url/username/password (potentially with the username as “%s” to parameterize it).

Once you save it, you can then go to your browser’s address bar (cmd-L) and type your abbreviation (“sa”) to get a new “search engine”. Then enter the username you want to log in as.

Hit enter and you’ll automatically be logged in to your app, without having to interact with your normal login page.

Automating this can help to keep you in the zone, especially if you’re using a security framework that allows deep linking.

If deep linking is enabled, the quickest way to get back to the page you’re iterating on after your session has expired (or you’ve bounced the app) is to reload the page. As it’s redirecting you to the login page, go to your address bar (cmd-L), type your keyword (ex: “sa”) and any associated username (ex: “admin”) and hit enter. You’ll be logged in before the login page displays and Spring Security will redirect you back to the page you originally requested.


Better Grails Batch Import Performance with Redis and Jesque

2011/10/13

A couple of years ago, I put up a well-received blog post on tuning Batch Import Performance with Grails and MySQL.

I’ve recently needed to revisit some batch importing procedures and have acquired a few extra tools in my Grails utility belt since writing that post: Grails Redis and Grails Jesque.

Redis is a very fast key/value store, where the values are not just strings, but are data structures like lists, sets, and hash maps. I’m the main author of the grails redis plugin, and it’s my favorite pragmatic technology of the past few years. If you’re new to Redis, check out the presentation slides I gave at this year’s gr8conf.

Jesque is a Java implementation of Resque, a Redis-backed message queueing system for creating background jobs. The Jesque plugin is fully integrated with Grails and allows you to create worker jobs that are spring injected and have an active hibernate session. Resque was written in Ruby by the folks at GitHub.

This combination makes parallelizing work very easy, as most of the pain of trying to spin off threads in grails is handled for you by Jesque. Yes, there’s GPars, but the threads that it creates aren’t spring injected and don’t have hibernate sessions.

Using Jesque is as simple as:

  1. create a Job class that implements a perform method.
  2. tell Jesque to start up 1..n worker threads that monitor a queue and use your Job to process work
  3. enqueue work on the queue so workers can pick it up

I’ve created a bitbucket repository with all of the source code from the original Batch Import post, as well as with the enhancements below.

The example problem is that there is a Library class that produces metadata for 100,000 books that we want to persist in the database as Book domain objects.

package com.naleid.example
 
class Book {
    String title
    String isbn
    Integer edition
 
    static constraints = {
    }
 
    static mapping = {
        isbn column:'isbn', index:'book_isbn_idx'
    }
}

The naive way of doing this takes Grails ~3 hours to do the inserts. The original batch performance post showed how to improve this time from 3 hours to 3 minutes with a few Grails and MySQL tweaks.

Using Redis + Jesque to parallelize the task, I’m able to cut that time in half again to a little over 90 seconds on my MacBook Air.

On real-world imports, where there is quite a bit more data and potentially other linked domain objects that can be memoized with the redis-plugin, I’ve seen a >100x speed improvement over the original serial import, even with the tuning tips from my original post.

Install redis and clone the test project from bitbucket to try it yourself. Just grails run-app, go to the running app on localhost and click on the link to the SerialBookController to see the original version, or the ParallelBookController to see the faster Redis+Jesque version. Each will display the length of time it took to do the insert after it’s done.

The ParallelBookController calls bookService.parallelImportBooksInLibrary(). That method spins up a number of worker threads, iterates through the books in the Library and enqueues each one on a Jesque queue. When it’s done iterating through the Library, it tells all the threads to end when they’re done processing all the work:

    def parallelImportBooksInLibrary(library) {
        Integer workerCount = 10
        String queueName = "import:book"
        withWorkers(queueName, BookConsumerJob, workerCount) {
            library.each { Map bookValueMap ->
                String bookValueMapJson = (bookValueMap as JSON).toString()
                jesqueService.enqueue(queueName, BookConsumerJob.simpleName, bookValueMapJson)
            }
        }
    }
 
    void withWorkers(String queueName, Class jobClass, Integer workerCount = 5, Closure closure) {
        def workers = []
        def fullQueueName = "resque:queue:$queueName"
        try {
            workers = (1..workerCount).collect { jesqueService.startWorker(queueName, jobClass.simpleName, jobClass) }
            closure()
            // wait for all the work we've generated to be pulled off the queue
            while (redisService.exists(fullQueueName)) sleep(500)
        } finally {
            // all work is off the queue, tell each worker to kill themselves when they're finished
            workers*.end(false)
        }
    }

The workers that persist the Book domain objects in the database are very simple Jesque Job artefacts that are spring injected and have an active hibernate session. They can be of any class type. The only requirement is that they have a method named perform, which is called with an item of work from the queue.

Here’s the example BookConsumerJob class that persists a Book to the database:

package com.naleid.example
 
import grails.converters.JSON
 
class BookConsumerJob {
    def bookService
 
    void perform(String bookJson) {
        bookService.updateOrInsertBook(JSON.parse(bookJson))
    }
}

You can see how simple the BookConsumerJob class is. It also calls out to the same bookService method that the serial batch import calls to import a Book.

One other neat thing about using Jesque is that it adheres to the Resque conventions for what gets stored in Redis. This means that you can gem install resque-web and then launch resque-web to get a nice monitoring platform for your Jobs and to see errors, or how much work is left in the queue.
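If you don’t want to install resque-web, you can also peek at the queue with redis-cli. A small sketch, assuming redis-cli is on your PATH, Redis is running on the default host and port, and the queue is stored as a Redis list under the same resque:queue: prefix that withWorkers checks above (queue_key and queue_depth are made-up helper names):

```shell
# Build the Redis key for a queue name, matching the
# "resque:queue:$queueName" key used in withWorkers above.
queue_key() {
    printf 'resque:queue:%s\n' "$1"
}

# Print how many jobs are still waiting on the queue.
# Assumes the queue is a Redis list, so LLEN gives its length.
queue_depth() {
    redis-cli LLEN "$(queue_key "$1")"
}
```

Running queue_depth import:book while the import is going should show the number of books still waiting for a worker.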


Using Dropbox to Share (most of) Your Home Directory Across Multiple Computers

2011/10/3

I’m a very happy customer of Dropbox. It allows painless syncing of files across multiple computers without extra features to complicate it. The top rated answer on Quora to the question “Why is Dropbox more popular than other programs with similar functionality?” sums things up perfectly.

One of my favorite uses of Dropbox is to sync almost all of the non-machine specific configuration files and directories in my home directory across all my OSX computers (currently my iMac, MacBook Air, and my work laptop).

Doing this lets me make a configuration change to one computer and have it almost instantly available on any other computer without any manual steps.

This is especially important for my zshell and Vim configurations as I’m always tweaking those, but it’s also helpful to have my Documents, Downloads and Pictures shared.

I have a folder in my Dropbox directory called home, and I use a script called link.sh to automatically create symlinks in my home directory to the things I’ve got stored in Dropbox.

Dropbox/home currently has these files and directories in it:

.ackrc
.dbvis
.groovy
.gvimrc
.hg
.hgignore_global
.ssh
.subversion
.vim
.viminfo
.vimrc
.zshenv
Desktop-starling.local/   # unique Desktop for my MacBook Air
Desktop-kestrel.local/    # unique Desktop for my iMac
Desktop-thrush.local/     # unique Desktop for my work MacBook Pro
Documents/
Downloads/
Pictures/
bin/

My Dropbox/home directory also has a shell script in it called link.sh:

#! /usr/bin/env bash
cd $(dirname $0)
 
function linkFile() {
    LINK_TO_NAME=$2
    if [ -z "$LINK_TO_NAME" ]; then
        LINK_TO_NAME=$1
    fi
    # test for a symlink first: -e follows symlinks, so it would also match a valid link
    if [ -h "$HOME/$LINK_TO_NAME" ]; then
        echo "Already symlinked $LINK_TO_NAME, skipping..."
    elif [ -e "$HOME/$LINK_TO_NAME" ]; then
        echo "**** Found existing $LINK_TO_NAME, skipping..."
    else
        echo "Linking $1 to $LINK_TO_NAME"
        ln -s "$PWD/$1" "$HOME/$LINK_TO_NAME"
    fi
}
 
 
for F in $(ls -a1 | grep -v link.sh | grep -v Desktop | egrep -v "^..?$" | egrep -v "^.*un~$" | grep -v .DS_Store); do
    linkFile $F
done
 
export HOSTNAME=$(hostname)
 
if [ -d "Desktop-$HOSTNAME" ]; then
    linkFile "Desktop-$HOSTNAME" "Desktop"
else 
    echo "Unable to find Desktop-$HOSTNAME to link to Desktop"
fi

What the script does is:

  1. cd into the directory that the script is located in (it only symlinks files in the same directory)
  2. list out all of the files and directories in the same directory as the script
  3. filter out the things we don’t want to link (like ., .., the link.sh script itself, etc)
  4. For all of the files/directories that pass the filter, call linkFile to create a symlink in the current user’s home directory as long as there isn’t already a file or a symlink there
  5. Then look for a directory called Desktop-$HOSTNAME where $HOSTNAME is the name of the current machine and create a ~/Desktop symlink to it if it’s found.

It should be safe and non-destructive and only create symlinks when there isn’t anything else there with the same name.
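If you want to check what the script would see for a given path before running it, the existence tests can be tried on their own. This is a small sketch (link_state is a made-up helper, not part of link.sh) that reports whether a path is a symlink, some other existing file or directory, or missing:

```shell
# Report what is currently at a path: "symlink", "exists" or "missing".
# -h is tested first because -e follows symlinks: it is true for a link
# with a valid target and false for a broken one.
link_state() {
    if [ -h "$1" ]; then
        echo symlink
    elif [ -e "$1" ]; then
        echo exists
    else
        echo missing
    fi
}
```

For example, link_state "$HOME/.vimrc" tells you whether link.sh would skip .vimrc or create a new symlink for it.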

I didn’t have my Pictures, Documents, and Downloads in my Dropbox for quite a while and was able to get away with the free 2GB plan. I recently upgraded to a paid Dropbox plan as I wanted those directories shared as well (though I exclude a couple of them from my work MacBook Pro).

For “special” directories like Desktop, Pictures, Documents, and Downloads, I needed to use sudo rm -r [dirname] to remove it before I could create the symlink (BACKUP THE DIRECTORY FIRST).

I’ve been using this for over a year, and haven’t noticed any apps that care that those directories are symlinks.

Also? I have used this shell script many times on my systems, and I think it’s safe, but PLEASE back up before using it, or deleting any directories. An adult crying is not a pretty sight :).


Smart Bash/Zsh Aliases to Run Appropriate Grails Version

2011/09/26

I’m currently on a project that has a couple of different apps that are using different versions of grails that need to be run concurrently. Switching a symlink no longer fit the way I needed to work so I came up with a couple bash/zsh aliases that are smart about the version of grails for the current directory.

These aliases work for both the grails as well as the grails-debug commands (for attaching a remote debugger).

If there is an application.properties file in the current directory, we can find the current version of grails for the app.

If there isn’t an application.properties file in the current directory, the script just defaults to whatever version of grails you’ve already set up as your default through the standard $GRAILS_HOME environment variable. You can use the grails symlink switching aliases that I created previously to easily move this between versions.

alias grails="execute_grails_version grails"
alias grails-debug="execute_grails_version grails-debug"
 
function execute_grails_version() {
    GRAILS_CMD=$1
    shift
    if [ -f application.properties ]; then
        export GRAILS_VERSION=`grep app.grails.version application.properties | sed -E 's/.*=(.*)/\1/'`
        export GRAILS_HOME="/usr/local/grails-$GRAILS_VERSION"
        echo "application.properties found, using \$GRAILS_HOME of $GRAILS_HOME"
    else 
        echo "application.properties NOT found, leaving \$GRAILS_HOME as $GRAILS_HOME"
    fi
 
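    if [ ! -d "$GRAILS_HOME" ]; then
        echo "ERROR: Unable to find \$GRAILS_HOME directory at $GRAILS_HOME"
        # return, not exit: this function runs inside your interactive shell
        return 1
    fi
 
    # "$@" keeps arguments that contain spaces intact
    echo "$GRAILS_HOME/bin/$GRAILS_CMD" "$@"
    "$GRAILS_HOME/bin/$GRAILS_CMD" "$@"
}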
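The version lookup in the function above can be tried on its own. Here is a slightly stricter sketch of it as a standalone helper (grails_version_from is a made-up name, and it assumes the property is written as app.grails.version=X with no spaces around the equals sign):

```shell
# Print the Grails version declared in a properties file.
#   $1 = path to application.properties
# Prints nothing if the file has no app.grails.version line.
grails_version_from() {
    sed -n -E 's/^app\.grails\.version=(.*)$/\1/p' "$1"
}
```

Running grails_version_from application.properties inside a project directory prints the version that would be appended to /usr/local/grails- to build $GRAILS_HOME.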

UPDATE: There are a few situations where aliases aren’t available (or are a pain to get available) such as when code is being executed as part of another application rather than from the command line. To get around this, these scripts (created by a co-worker of mine, @sjurgemeyer) could be put in your PATH, ahead of your $GRAILS_HOME/bin and used instead of the aliases above:

grails:

grails-version grails "$@"

grails-debug:

grails-version grails-debug "$@"

grails-version:

GRAILS_CMD=$1
shift
if [ -f application.properties ]; then
    export GRAILS_VERSION=`grep app.grails.version application.properties | sed -E 's/.*=(.*)/\1/'`
    export GRAILS_HOME="/usr/local/grails-$GRAILS_VERSION"
    echo "application.properties found, using \$GRAILS_HOME of $GRAILS_HOME"
else 
    echo "application.properties NOT found, leaving \$GRAILS_HOME as $GRAILS_HOME"
fi
 
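# quoted so that an unset or empty $GRAILS_HOME fails the check
if [ ! -d "$GRAILS_HOME" ]; then
    echo "ERROR: Unable to find \$GRAILS_HOME directory at $GRAILS_HOME"
    exit 1
fi
 
# "$@" keeps arguments that contain spaces intact
echo "$GRAILS_HOME/bin/$GRAILS_CMD" "$@"
"$GRAILS_HOME/bin/$GRAILS_CMD" "$@"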

Dynamically setting Grails Log4J levels with the Console Plugin

2011/09/23

If you’ve got Burt Beckwith’s great Grails Console Plugin installed, it’s easy to tweak the logging levels dynamically in your grails application.

The quick and dirty way to switch your logging level dynamically, if you know the name of the logger, is just to do this in your console window:

import org.apache.log4j.*
Logger.getLogger("org.springframework").level = Level.DEBUG

Sometimes, a few helper methods can help you see what the current config is (especially if you’ve changed some things), as well as figure out what the right loggers are to tweak. This sample script makes it easy to view and change the logging level to whatever you want. Just cut and paste it into your application’s console window (in dev it defaults to: http://localhost:8080/yourAppName/console):

import org.apache.log4j.Logger
import org.apache.log4j.Level
import static org.apache.log4j.Level.*
 
def getRootLogger() { Logger.rootLogger }
def getAllLoggers() { rootLogger.loggerRepository.currentLoggers.toList().sort { it.name } }
def getActiveLoggers() { allLoggers.findAll { it.level } }
def getLogger(String logName) { rootLogger.getLogger(logName) }
def setLevel(String logName, Level level) { rootLogger.getLogger(logName).level = level }
 
def printLogger(logger) { println "${logger.name} -> ${logger.level}" }
def printAllLoggers() { allLoggers.each { printLogger(it) } }
def printActiveLoggers() { activeLoggers.each { printLogger(it) } }

This makes it easy to see what logs are currently active (those with a log level set):

printActiveLoggers()

prints something like:

grails.app.filters.LoggingFilters -> DEBUG
grails.app.filters.SecurityFilters -> DEBUG
grails.app.service.grails.plugin.redis.RedisService -> WARN
grails.app.task -> DEBUG
org.apache.cxf -> DEBUG
...

You can also list all loggers, which also adds in those loggers whose log level is currently `null`:

printAllLoggers()

prints:

grails.app -> DEBUG
grails.app.bootstrap.BootStrap -> null
grails.app.bootstrap.QuartzBootStrap -> null
grails.app.codec.org.codehaus.groovy.grails.plugins.codecs.Base64Codec -> null
grails.app.codec.org.codehaus.groovy.grails.plugins.codecs.HTMLCodec -> null
grails.app.codec.org.codehaus.groovy.grails.plugins.codecs.HexCodec -> null
...

You can also dynamically grab/create a logger and set its logging level to something more or less verbose than its current value:

def logger = getLogger("grails.app.service.grails.plugin.redis.RedisService")
printLogger(logger)   // initially WARN
logger.level = INFO  
printLogger(logger)   // prints INFO

prints:

grails.app.service.grails.plugin.redis.RedisService -> WARN
grails.app.service.grails.plugin.redis.RedisService -> INFO

It’d be easy to turn this into a simple gsp/controller that accepts changes and can list things out. There are also a number of other plugins out there that let you view/change logging levels (including another one of Burt’s plugins, app info), but if you don’t have those installed, this is a quick way to see what’s going on with your application.


New Grails Redis Plugin Released

2011/08/8

I released the first version of the Grails Redis Plugin over the weekend. It’s a brand new plugin that takes the place of the previous redis plugin (which is being renamed to “redis-gorm” and has been refactored to use this plugin as a dependency). Its version is 1.0.0.M7 just so it’s “higher” than the plugin it’s replacing, though I’d probably make it a 0.9 release if I were releasing it under a new name till I get a little more community feedback.

Quick description of what Redis is from the README:

The best definition of Redis that I’ve heard is that it is a “collection of data structures exposed over the network”.

Redis is an insanely fast key/value store, in some ways similar to memcached, but the values it stores aren’t just dumb blobs of data. Redis values are data structures like strings, lists, hash maps, sets, and sorted sets. Redis also can act as a lightweight pub/sub or message queueing system.

Redis is used in production today by a number of very popular websites including Craigslist, StackOverflow, GitHub, The Guardian, and Digg.

The Grails Redis plugin makes a Redis connection pool available (and injectable as a spring bean) to your Grails application.

To install the plugin, just execute this inside your application’s directory:

grails install-plugin redis

It allows you to transparently interact with your Redis instance by automatically handling retrieving a connection from the pool and ensuring that the connection is returned to the pool as it delegates to redis.

// overrides propertyMissing and methodMissing to delegate to redis
def redisService
 
redisService.foo = "bar"   
assert "bar" == redisService.foo   
 
redisService.sadd("months", "february")
assert true == redisService.sismember("months", "february")

One of the plugin’s greatest strengths is in the memoization methods and tag libraries that it adds. It’s a write-through cache (with optional TTL expiration). Before executing the closure/tag, it will check Redis to see if we’ve already calculated that value. If we have, we’ll just return the answer from Redis, otherwise, we’ll calculate it, and save it in Redis for future calls.

service method:

redisService.memoize("user:$userId:helloMessage") {
    // expensive to calculate method that returns a String
    "Hello ${security.currentLoggedInUser().firstName}"
}

taglib:

<redis:memoize key="mykey" expire="3600">
    <!-- 
        insert expensive to generate GSP content here 
 
        taglib body will be executed once, subsequent calls 
        will pull from redis till the key expires
    -->
    <div id='header'>
        ... expensive header stuff here that can be cached ...
    </div>
</redis:memoize>

Check out the full documentation on the github repository.

If you’re new to using Redis with Groovy, I created an introductory post and gave a presentation at gr8conf that are good starting places.

If you use OSX for development, you might also find these instructions for automatically launching Redis on startup with launchd useful.


Groovy Script Using Redis to Pick Conference Lottery Winners

2011/06/28

At the end of gr8conf today there were quite a few door prize giveaways. Winners were picked using a printout with attendees listed in (I’m assuming) random order. The guys running the lottery were going down the list and calling off names.

This was right after my talk on using Redis with Groovy and I thought to myself, “this is a perfect example of where a quick redis script could automate this and make it a bit more groovy”. So I threw together this script in about 15 minutes: