Reputation: 847
Prometheus collects metrics for a queuing service (beanstalkd) by scraping a separate metrics provider (beanstalkd-exporter). A few times a day, I notice that data is missing for some of the queues.
There are a lot of queues, so I group them into a few graphs, with queries that look like this:
tube_current_jobs_ready{tube=~".*some_suffix"}
This returns all the time series (queues) whose tube label ends with "some_suffix". Sometimes one or more of them, but not all, have no data: a gap in the graph rather than a zero value. (Why that happens is out of scope for this question.)
I already have alerts for when there is no data for the query, and they trigger when all the metrics returned are null, as expected. What I need is an alert for when there is no data for one or more of the metrics returned by the query.
Upvotes: 1
Views: 3735
Reputation: 18094
Try the following query for the alert:
count_over_time(tube_current_jobs_ready{tube=~".*some_suffix"}[D]) < N
This query returns the matching time series where the number of raw samples over the previous duration D is less than N. The parameters D and N must be chosen based on the expected interval between raw samples for each time series (the scrape_interval in the Prometheus ecosystem). For example, the following query should return time series where the number of samples over the last 5 minutes is less than 4:
count_over_time(tube_current_jobs_ready{tube=~".*some_suffix"}[5m]) < 4
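Since the goal is an alert, the expression above could be wrapped in a Prometheus alerting rule along these lines. This is only a sketch: the group name, alert name, severity label and annotation text are made up for illustration, and the 5m / 4 values should be adjusted to your own scrape_interval.

```yaml
groups:
  - name: beanstalkd-tube-gaps            # hypothetical group name
    rules:
      - alert: TubeMetricMissingSamples   # hypothetical alert name
        # Returns one result (and so one alert) per tube whose sample count
        # over the last 5 minutes is below 4.
        expr: count_over_time(tube_current_jobs_ready{tube=~".*some_suffix"}[5m]) < 4
        labels:
          severity: warning               # assumed label, adapt to your routing
        annotations:
          summary: 'Tube {{ $labels.tube }} has only {{ $value }} samples in the last 5m'
```

Note that count_over_time returns nothing for a series that has no samples at all inside the window, so a rule like this fires while a gap is partially inside the window and stops matching once the gap is longer than D.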
Upvotes: 4