Using top to show common field values

A common question is, "Which values are most common?" When looking for errors, you probably want to know which piece of code produces the most of them. The top command provides a simple way to answer this question.

Let's step through a few examples.

First, run a search for errors:

sourcetype="tm1*" error

The preceding example searches for the word error in all sourcetypes whose names start with the character string tm1 (the asterisk is the wildcard character).

In my data, we find events containing the word error, a sample of which is listed in the following screenshot:

Since I happen to know that the data I am searching is made up of application log files generated throughout the year, it might be interesting to see the month that had the most errors logged. To do that, we can simply add | top date_month to our search, like so:

sourcetype="tm1*" error | top date_month

The results are transformed by top into a table like the following one:

From these results, we see that october is logging significantly more errors than any other month. We should probably take a closer look at the activity that occurred during that month.
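
To see what top is doing here, the same ranking can be approximated with more general commands. The following is only a sketch, not a description of how top is implemented: it assumes the stats, sort, and head commands, and it omits the percent column that top adds:

sourcetype="tm1*" error | stats count by date_month | sort -count | head 10

The head 10 at the end mirrors the number of rows that top shows by default.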

Next, perhaps we would like to determine if there is a particular day of the week when these errors are happening. Adding another field name to the end of the command instructs top to slice the data again. For example, let's add date_wday to the end of our previous query, like so:

sourcetype="tm1*" error | top date_month date_wday

The results might look like the following screenshot:

In these results, we see that wednesday is logging the most errors from the month of october. If we simply wanted to see the distribution of errors by date_wday, we could specify only the date_wday field, like so:

sourcetype=tm1* error | top date_wday
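
If the goal is instead the most common weekdays within each month, rather than the most common month and weekday pairs overall, top also accepts a by clause. A minimal sketch, assuming the same tm1 data:

sourcetype=tm1* error | top date_wday by date_month

With a by clause, top computes its counts separately for each value of date_month, so the row limit applies per month rather than to the whole table.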

Controlling the output of top

The default behavior for top is to show the 10 largest counts. The maximum possible row count is the product of the number of distinct values of each field specified, in this case date_month and date_wday. Using our data in this example, there are 8 possible combinations. If you would like to see fewer than 10 rows (or, in our example, fewer than 8), add the argument limit, like so:

sourcetype=tm1* error | top limit=4 date_month date_wday
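
To check how many rows top could return for these two fields, one option is to count the distinct values of each field and multiply them. This sketch assumes the stats function dc (distinct count) and the eval command; the field names months, weekdays, and max_rows are made up for illustration:

sourcetype=tm1* error | stats dc(date_month) AS months dc(date_wday) AS weekdays | eval max_rows=months*weekdays

The result is an upper bound, since top only returns combinations that actually occur in the events.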

Arguments change the behavior of a command; they take the form of name=value. Many commands require the arguments to immediately follow the command name, so it's a good idea to always follow this structure.

Each command has different arguments, as appropriate. As you type in the search bar, a help drop-down box will appear for the last command in your search, as shown in the following screenshot:

The Help option takes you to the documentation for that command at http://www.splunk.com and More >> provides concise documentation inline.

Let's use a few arguments to make a shorter list but also roll all other results into another line:

sourcetype=tm1* error
| top
limit=4
useother=true
otherstr="everything else"
date_month date_wday

This produces results like those shown in the following screenshot:

The last line represents everything that didn't fit into the top four. The top option useother enables this last row, while the otherstr option controls the value printed in place of the default, OTHER.
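
Other top options follow the same name=value pattern. As a sketch, assuming the same search, countfield renames the count column (errors is an arbitrary name chosen here) and showperc=false hides the percent column:

sourcetype=tm1* error | top limit=4 countfield=errors showperc=false date_month date_wday

As with limit, both arguments sit immediately after the command name and before the field list.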

For additional information on the top command and its options, see the Splunk documentation at http://docs.splunk.com/Documentation/Splunk/6.2.3/SearchReference/Top

For the opposite of top, see the rare command.
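
As a sketch, assuming the same search as before, rare takes the same form as top and returns the least common combinations instead:

sourcetype=tm1* error | rare limit=4 date_month date_wday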