Reputation: 32737
I'm trying to make a dashboard for disk space forecasting. I've got a Prometheus query like this:
predict_linear(
(1-(disk_volume_available_bytes{instance=~"$server"} / disk_volume_total_bytes{instance=~"$server"}))[32d:1d],
864000
) > 0.95
This works well enough at cutting the list of disks down to those that actually need attention. What I'd then like is another query (either in the same panel or a different one, it doesn't matter to me) that takes any disk identified by the previous query and gets me the actual/observed metrics. Said another way, if a disk is forecast to be above 95% full, I want both the forecast line and the actual usage data for that disk. And if it's forecast to be below 95%, don't display anything for either the forecast or the actual.
Is this possible?
Upvotes: 3
Views: 701
Reputation: 20296
Here is an example that shows node_exporter_build_info for those instances where CPU utilization is over 30% (0.3):
node_exporter_build_info # this is the metric you want to see filtered
and on (instance) # and the rest is the filter terms, you won't see this on the panel
((1 - avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.3)
The tricky part here is to join the series on a set of labels that identifies each series uniquely, so that there is no many-to-one match on either side. In the example above the only unique label is instance, but in your case there might also be device or mountpoint, so you may need something like this:
the_metric_you_wanna_see
and on (instance, device, mountpoint) # put here a list of unique labels
(predict_linear(
(1-(disk_volume_available_bytes{instance=~"$server"} / disk_volume_total_bytes{instance=~"$server"}))[32d:1d],
864000
) > 0.95)
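Applied to the question, the placeholder metric can be the observed usage ratio itself. The following is only a sketch: it assumes the disk_volume_* series carry instance, device and mountpoint labels, which the question does not confirm, so adjust the list to whatever labels your exporter actually sets.

```promql
# Observed usage, shown only for disks whose 10-day forecast exceeds 95%
(1 - (disk_volume_available_bytes{instance=~"$server"} / disk_volume_total_bytes{instance=~"$server"}))
and on (instance, device, mountpoint)  # assumed label names
(predict_linear(
  (1 - (disk_volume_available_bytes{instance=~"$server"} / disk_volume_total_bytes{instance=~"$server"}))[32d:1d],
  864000  # 10 days, in seconds
) > 0.95)
```

Since both sides here are derived from the same two metrics, they should end up with identical label sets, in which case a bare `and` without `on (...)` would likely match just as well. The forecast line itself is the original query from the question, added as a second query in the same panel.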
Also, since the query in question is rather expensive to compute and you need to repeat it once or twice, I suggest making Prometheus pre-calculate it: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/
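A recording rule for this might look roughly like the sketch below. The group name and the recorded series name are hypothetical, chosen only for illustration. The `instance=~"$server"` matcher is left out because `$server` is a Grafana dashboard variable and is not available inside Prometheus rule files; the filter moves to the dashboard query instead.

```yaml
groups:
  - name: disk_forecast  # hypothetical group name
    rules:
      # Hypothetical series name: forecast of the used-space ratio 10 days ahead
      - record: disk_volume:used_ratio:predict_linear_10d
        expr: |
          predict_linear(
            (1 - (disk_volume_available_bytes / disk_volume_total_bytes))[32d:1d],
            864000
          )
```

With such a rule in place, the filter term in the panel could shrink to `disk_volume:used_ratio:predict_linear_10d{instance=~"$server"} > 0.95`.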
Upvotes: 3