Logging to Splunk in Key/Value Pairs

| Comments

Splunk is a log aggregation tool with a very powerful query language that lets you easily mine the data from your logs. Taking an hour or two to learn the query language (which has some similarity to SQL) will greatly increase the usefulness of Splunk.

One of the Splunk logging best practices is to write out your logs into comma delimited key/value pairs that let Splunk interpret your data as queryable fields.

key1="value", key2="other value"...

Doing that will make key1 and key2 fields that can be queried and reported on.

If you don’t do this, you can still create fields out of unstructured data, but you have to use a relatively ugly regular expression syntax to create fields, ex if the log format is “Here is Key1 value and Key2 other value”

... | rex field=_raw "Here is Key1 (?<key1>.*) and Key2 (?<key2>.*)"

Here is a simple static method that will write out data given to it in vararg format as an appropriately escaped and delimited String:

package com.naleid.utils;

import org.apache.commons.lang3.StringEscapeUtils;
import java.util.Map;

public class LogUtil {

    public static String toSplunkString(Object... mapInfo) {
        return toSplunkString(varArgMap(mapInfo));
    }

    public static Map varArgMap(Object... mapInfo) {
        final Map result = new LinkedHashMap<>();
        if (mapInfo.length % 2 == 1) {
            throw new IllegalArgumentException("arguments must be even in number");
        }
        for (int i = 0; i < mapInfo.length; i += 2) {
            final Object o1 = mapInfo[i];
            if (!(o1 instanceof String)) {
                throw new IllegalArgumentException("odd arguments must be String values so they can be keys");
            }
            final Object o2 = mapInfo[i + 1];
            result.put((String) o1, o2);
        }
        return result;
    }

    public static  String toSplunkString(Map map) {
        StringBuilder buffer = new StringBuilder()

        for (Map.Entry entry : map.entrySet()) {
            if (buffer.length() > 1) {
                buffer.append(", ");
            }
            buffer.append(entry.getKey()==null?null:entry.getKey().toString())
                    .append("=\"")
                    .append(StringEscapeUtils.escapeJava(entry.getValue()==null?null:entry.getValue().toString()))
                    .append("\"");
        }
        return buffer.toString();
    }

}

Using that, when you write a log statement, you can now pass in a list of varargs (which will be transformed into a Map) and have it logged out in Splunk standard key/value format.

import static com.naleid.util.LogUtil;

...

log.info(toSplunkString("tag", "userPurchase", "purchaseId", 23, "userId", 123, "productId", 456, "price", 45.34));

Would output something like:

06:48:04,081 INFO [com.naleid.service.MyPurchaseService] tag="userPurchase", purchaseId="23", userId="123", productId="456", price="45.34"

This would let us write a Splunk query to show us things like number of purchases and average purchase price by user:

userPurchase | stats count, avg(price) by userId

splunk_user_purchases_with_total

or the number of purchases per product and that products total amount sold, sorted from highest to lowest

userPurchase | stats count as "# purchases", sum(price) as "total $" by productId | sort -"total $" 

splunk_purchase_count_with_total

You can also have alerts set up to automatically generate reports that are e-mailed periodically or create graphical dashboards.

The Splunk Search Reference and the Quick Reference Guide PDF (which is slightly outdated but still useful) are great references while you’re learning the query syntax. You can do much more powerful things than what I’m showing here.

The Splunk Search Examples shows a number of queries (and screenshots of results) along these lines.

In my next post (coming soon), I’ll show how to create a StopWatchAspect that automatically logs timing information for all of your service methods using the Splunk formatting. Then you can report and show timing statistics for your code and see what methods are taking the most time and how a method’s performance has changed over time.

Comments