Nicky Logan

Reputation: 120

Is it possible to get the average of histogram_quantile of the last t minutes?

What I'm intending to find is the average of the p99 latency in the last t minutes.

I tried this query, but it failed with the error "ranges only allowed for vector selectors":

avg_over_time(histogram_quantile(0.99, sum(rate(latency_buckets{service="foo"}[5m])) by (le))[5m])

From what I understand, what histogram_quantile does is return an instant value (let's say p99) and there is no way to get a series of p99 values over a specified interval. If so, are there any functions that can achieve the same goal?

Upvotes: 3

Views: 2228

Answers (2)

Sudhakar MNSR

Reputation: 764

The subquery solution in the other answer works. I'm adding one more option in case the subquery is a performance concern for you, as mentioned here and here.

Before looking at the alternative, let's understand why the query in the question fails.

The result of histogram_quantile(0.99, sum(rate(latency_buckets{service="foo"}[5m])) by (le)) is a derived value, not a raw series stored in the TSDB. A range such as [5m] can only be applied directly to a vector selector, that is, to a stored series. The outer [5m] therefore cannot be applied to the derived result, which is exactly what the error message says.

To get around this, we can store the result of the inner query in the TSDB by creating a recording rule for it. Create a recording rule for histogram_quantile(0.99, sum(rate(latency_buckets{service="foo"}[5m])) by (le)) and use the recorded series in the outer query: avg_over_time(<recording_rule>[5m]). See another similar question here.
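As a rough sketch of what such a rule file could look like: the group name and the recorded series name service:latency:p99_5m below are made up for illustration, so pick whatever fits your own naming scheme. The file also has to be listed under rule_files in prometheus.yml.

```yaml
groups:
  - name: latency_quantiles            # hypothetical group name
    rules:
      # Hypothetical series name; stores the p99 as its own time series.
      - record: service:latency:p99_5m
        expr: |
          histogram_quantile(
            0.99,
            sum(rate(latency_buckets{service="foo"}[5m])) by (le)
          )

# Once the rule has been evaluated for a while, the outer query becomes:
#   avg_over_time(service:latency:p99_5m[5m])
```

Keep in mind that the recorded series only has data from the moment the rule is loaded, so the average will only cover the full window after the rule has been running at least that long.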

Upvotes: 0

anemyte

Reputation: 20196

It is possible using subquery syntax:

avg_over_time(instant_query[interval:resolution])

An example with your query (avg over 1h):

avg_over_time(
  histogram_quantile( # the instant query
    0.99,
    sum(
      rate(latency_buckets{service="foo"}[5m])
    ) by (le)
  )[1h:] # subquery [ range : resolution ]; resolution defaults to the global evaluation interval
)

Upvotes: 4
