Lqjing

Reputation: 103

How to calculate the average value in a Prometheus query from Grafana

I am trying to create a Prometheus graph in Grafana, but I can't find a function that calculates the average value.

For example, to create a graph for read_latency, the results contain many tags. If there are 3 machines, there are 3 separate tags, for machine1, machine2, machine3. Here is a graph:

[Screenshot: Prometheus graph]

I want to combine these three together, so there will be only one tag, machines, and the value is the average of those three.

It seems that the Prometheus query language doesn't have something like average(), so I am not sure how to do this.

I used to work with InfluxDB, where the graph looks like this:

[Screenshot: InfluxDB graph]

Upvotes: 10

Views: 81544

Answers (3)

Tobias Wiesenthal

Reputation: 286

I think you are looking for the avg() aggregation operator. See the documentation.

Upvotes: 9

valyala

Reputation: 17784

Short answer: use the avg() function to return the average value across multiple time series. For example, avg(metric) returns the average value across all time series with the name metric.

Long answer: Prometheus provides two functions for calculating the average:

  • avg_over_time calculates the average over the raw samples stored in the database within the lookbehind window specified in square brackets. The average is calculated independently for each matching time series. For example, avg_over_time(metric[1h]) calculates the average of the raw samples over the last hour for each time series with the name metric.
  • avg calculates the average across multiple time series. The average is calculated independently for each point on the graph.
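The difference between the two is the direction of averaging. Here is a minimal Python sketch on made-up data (the series names and values are hypothetical, and this only mimics what the two functions compute, it is not how Prometheus implements them):

```python
from statistics import mean

# Hypothetical samples: one list of values per time series,
# aligned so that index i is the same timestamp in every series.
series = {
    "machine1": [10.0, 20.0, 30.0],
    "machine2": [20.0, 30.0, 40.0],
    "machine3": [30.0, 40.0, 50.0],
}

def avg_over_time(samples):
    """One average per series over the window, like avg_over_time(metric[window])."""
    return {name: mean(values) for name, values in samples.items()}

def avg_across_series(samples):
    """One average per timestamp across all series, like avg(metric)."""
    return [mean(point) for point in zip(*samples.values())]

per_series = avg_over_time(series)     # still three results, one per machine
per_point = avg_across_series(series)  # a single combined line on the graph
```

For the question above, the second form is the one that collapses machine1, machine2 and machine3 into a single line.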

If you need to calculate the average over the raw samples across all the time series that match the given selector, for each time bucket, as in this SQL query:

SELECT
  time_bucket('5 minutes', timestamp) AS t,
  avg(value)
FROM table
GROUP BY t

Then the following PromQL query must be used:

sum(sum_over_time(metric[$__interval])) / sum(count_over_time(metric[$__interval]))

Do not use avg(avg_over_time(metric[$__interval])), since it returns the average of averages, which is not equal to the true average when the series contain different numbers of samples. See this explanation for details.
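To see why the two queries disagree, here is a small Python sketch with hypothetical numbers: two series in the same time bucket, where series "a" has four raw samples and series "b" has only one.

```python
# Hypothetical raw samples that fall into one time bucket.
bucket = {
    "a": [1.0, 1.0, 1.0, 1.0],
    "b": [9.0],
}

# Mimics avg(avg_over_time(...)): every series counts equally,
# no matter how many samples it contributed.
per_series_avg = [sum(values) / len(values) for values in bucket.values()]
avg_of_avgs = sum(per_series_avg) / len(per_series_avg)

# Mimics sum(sum_over_time(...)) / sum(count_over_time(...)):
# every raw sample counts equally.
total = sum(sum(values) for values in bucket.values())
count = sum(len(values) for values in bucket.values())
true_avg = total / count

print(avg_of_avgs, true_avg)
```

The average of averages is (1 + 9) / 2 = 5, while the average over all five raw samples is 13 / 5 = 2.6. The two only agree when every series has the same number of samples in the bucket.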

Upvotes: 8

Tombart

Reputation: 32378

Use the built-in $__interval variable, where node and name are custom labels (depending on your metrics):

sum(avg_over_time(some_metric[$__interval])) by (node, name)

or a fixed value such as 1m or 1h:

sum(avg_over_time(some_metric[1m])) by (node, name)

You can filter using Grafana variables:

sum(avg_over_time(some_metric{cluster=~"$cluster"}[1m])) by (node, name)
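As a rough illustration of what the by (node, name) grouping does, here is a Python sketch. The label sets, the extra disk label and the values are all hypothetical, and the input stands in for the per-series results of avg_over_time:

```python
from collections import defaultdict

# Hypothetical per-series window averages, keyed by (node, name, disk).
window_avgs = {
    ("node1", "read", "sda"): 2.0,
    ("node1", "read", "sdb"): 4.0,
    ("node2", "read", "sda"): 6.0,
}

def sum_by_node_name(avgs):
    """Add up all series that share the same (node, name), dropping other labels."""
    grouped = defaultdict(float)
    for (node, name, _disk), value in avgs.items():
        grouped[(node, name)] += value
    return dict(grouped)

result = sum_by_node_name(window_avgs)
```

Note that this adds the grouped series together. If you want one average per group rather than a total, the outer sum would be replaced by avg.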

Upvotes: 6
