Reputation: 1432
I am trying to calculate the availability of Elasticsearch using Prometheus. One of the jobs that runs gets the cluster status as a value of 0, 1 or 2, where anything above 1 is considered unavailable. The answer from here does not work because all the jobs succeed, so the query has to do something along the lines of:
avg_over_time(es_cluster_status{cluster="name", instance="my_es"}>1[24h])
This does not work, however, because of the >1.
Upvotes: 1
Views: 1810
Reputation: 10084
Prometheus does not support filtering samples in range vectors; the >1 would only work for filtering instant vectors based on their current value.
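The simplest workaround is to define a recording rule that behaves just like the up metric does (0 when your target is down, 1 otherwise), something like es_cluster_status{cluster="name", instance="my_es"} <= bool 1. The bool modifier matters here: without it the comparison only filters samples and keeps their original values, whereas with it the result is 1 when the status is at most 1 and 0 otherwise. You can then apply avg_over_time() to the recorded metric and get the availability over any given range.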
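As a rough sketch of what such a recording rule could look like in a Prometheus rules file: the group name and the recorded metric name es:cluster_available are made up for illustration, and the label matchers are copied from the question, so adjust all of them to your setup.

```yaml
groups:
  - name: elasticsearch_availability      # hypothetical group name
    rules:
      # Record 1 while the cluster status is 0 or 1, and 0 while it is 2.
      # "bool" makes the comparison return 0/1 instead of dropping samples.
      - record: es:cluster_available      # hypothetical metric name
        expr: es_cluster_status{cluster="name", instance="my_es"} <= bool 1

# Once the rule has been evaluating for a while, availability over the
# last 24 hours (as a fraction between 0 and 1) could be queried with:
#   avg_over_time(es:cluster_available[24h])
```

Note that the recorded series normally only has data from the point the rule was loaded, so the 24h average is only meaningful once the rule has been running that long.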
Upvotes: 1