user561979

Reputation: 29

Prometheus use of avg_over_time with absent

We have started to use Prometheus to monitor our infrastructure. One service has the following alert configured:

With that, we receive alerts if "up" is zero or if no metrics are reachable.

Now we want a Grafana "single stat" panel that shows the "uptime" of the service, but "absent" can't be used with "avg_over_time". Is there a way to include something like "absent" in our uptime panel?

Upvotes: 1

Views: 18571

Answers (1)

Alin Sînpălean

Reputation: 10134

You could approximate it by something like this:

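sum_over_time(up{job="service"}[24h]) / on() group_left() sum_over_time(up{job="prometheus"}[24h])

This divides the number of samples that recorded your service as "up" over the past 24 hours by the number of samples that recorded Prometheus itself as "up". The on() group_left() modifier is needed because the two sides carry different job and instance labels and would otherwise not match; it assumes a single Prometheus series on the right-hand side. The result is only an approximation, and it assumes both jobs are scraped at the same interval.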

Alternatively, you could use a recording rule to record something similar to your alert condition: a series that has a value of 1 if your service is up and 0 otherwise. You could then apply avg_over_time() to that metric.
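A minimal sketch of such a recording rule follows. The question's actual alert expression is not shown, so this assumes it is roughly "up is 0 or the series is absent". The group name service_uptime and the metric name service:up:or_zero are hypothetical; pick names that fit your own conventions.

```yaml
groups:
  - name: service_uptime              # hypothetical group name
    rules:
      # Records 1 while at least one target of the job is up.
      # max() drops all labels, so "or vector(0)" can supply a 0
      # when up is 0 for every target or when the series is absent.
      - record: service:up:or_zero    # hypothetical metric name
        expr: max(up{job="service"}) or vector(0)
```

With that in place, the panel query could be avg_over_time(service:up:or_zero[24h]). Note that the recorded series only exists from the moment the rule is loaded, and no samples are written while Prometheus itself is down, so the average covers only the time Prometheus was evaluating rules.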

Upvotes: 1
