Reputation: 54929
I want to produce a simple histogram of a numeric variable X.
I'm having trouble finding a clear example.
Since it matters more that the histogram be meaningful than beautiful, I would prefer to specify the bin size myself rather than letting the tool decide. See: Data Scientists: STOP Randomly Binning Histograms
Upvotes: 4
Views: 6356
Reputation: 54929
Histograms are a primary tool for understanding the distribution of data. As such, Splunk automatically creates a histogram by default for raw event queries. So it stands to reason that Splunk should provide tools for you to create histograms of your own variables extracted from query results.
This may be hard to find precisely because the basic answer is very simple:
(your query) |rename (your value) as X
|chart count by X span=1.0
Select "Visualization" and set chart type to "Column Chart" for a traditional vertical-bar histogram.
There is an example of this in the docs described as "Chart the number of transactions by duration".
The span value controls the binning of the data. Adjust this value to optimize your visualization.
Warning: It is legal to omit span, but if you do so the X-axis will be compacted non-linearly to eliminate empty bins -- this could result in confusion if you aren't careful about observing the bin labels (assuming they're even drawn).
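To see why a compacted axis can mislead, here is a minimal Python sketch of fixed-width binning with and without the empty bins. It only illustrates the arithmetic; it is not Splunk's implementation, and the function name bin_counts and the sample values are made up for the example.

```python
import math
from collections import Counter

def bin_counts(values, span, keep_empty=True):
    """Count values in fixed-width bins [k*span, (k+1)*span).

    keep_empty=True  -> every bin between the lowest and highest is listed
                        (a linear axis).
    keep_empty=False -> only non-empty bins are listed (a compacted axis).
    """
    index_counts = Counter(math.floor(v / span) for v in values)
    if keep_empty:
        keys = range(min(index_counts), max(index_counts) + 1)
    else:
        keys = sorted(index_counts)
    # Label each bin by its lower edge.
    return [(k * span, index_counts[k]) for k in keys]

sample = [1, 2, 2, 3, 41, 42]  # hypothetical data with a gap
linear = bin_counts(sample, span=10)
compact = bin_counts(sample, span=10, keep_empty=False)

print(linear)   # [(0, 4), (10, 0), (20, 0), (30, 0), (40, 2)]
print(compact)  # [(0, 4), (40, 2)]
```

In the compacted result the bins starting at 0 and 40 sit side by side, so the gap between them is visible only in the labels.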
If you have a long-tail distribution, it may be useful to partition the results to focus on the range of interest. This can be done using where:
(your query) |rename (your value) as X
|where X>=0 and X<=100
|chart count by X span=1.0
Alternatively, use a clamping function to preserve the out-of-range counts:
(your query) |rename (your value) as X
|eval X=max(0,min(X,100))
|chart count by X span=1.0
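The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of what filtering and clamping do to the values; the helper names and the sample data are hypothetical.

```python
def filter_range(values, lo, hi):
    """Drop values outside [lo, hi], as the where clause does."""
    return [v for v in values if lo <= v <= hi]

def clamp_range(values, lo, hi):
    """Pull out-of-range values to the nearest limit, as max(lo, min(X, hi)) does."""
    return [max(lo, min(v, hi)) for v in values]

sample = [-5, 3, 50, 100, 250, 900]  # hypothetical long-tailed data
filtered = filter_range(sample, 0, 100)
clamped = clamp_range(sample, 0, 100)

print(filtered)  # [3, 50, 100]
print(clamped)   # [0, 3, 50, 100, 100, 100]
```

Filtering shrinks the total count, while clamping keeps it and piles the outliers into the first and last bins, so those two bars should be read as "this value or beyond".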
Another way to deal with long tails is to use a logarithmic span mode -- special values for span include log2 and log10 (documented as log-span).
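As a rough picture of what logarithmic binning does, the Python sketch below assigns each value to the bin whose lower edge is the largest power of the base not exceeding it. This is an assumption about the general idea, not a description of Splunk's exact bin boundaries, and it only handles values of 1 or more; the function name and sample data are invented for the example.

```python
from collections import Counter

def log_bin_lower_edge(value, base=10):
    """Return the largest power of base that is <= value (value must be >= 1)."""
    if value < 1:
        raise ValueError("this sketch only handles values >= 1")
    edge = 1
    while edge * base <= value:
        edge *= base
    return edge

sample = [1, 7, 12, 95, 100, 4500]  # hypothetical long-tailed data
log_counts = sorted(Counter(log_bin_lower_edge(v) for v in sample).items())

print(log_counts)  # [(1, 2), (10, 2), (100, 1), (1000, 1)]
```

Each bin is ten times wider than the one before it, which is what lets a long tail fit into a handful of bars.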
If you would like to have both a non-default span and a compressed X-axis, there's probably a parameter for that -- but the documentation is cryptic.
I found that this 2-stage approach made that happen:
(your query) |rename (your value) as X
|bin X span=10.0 as X
|chart count by X
Again, this type of chart can be dangerously misleading if you don't pay careful attention to the labels.
Upvotes: 8