Reputation: 31
I have a strange setup where I need to add together the read rates of a number of the disks in a server.
Each of these queries works fine by itself, but when I add them with a plus sign I get "Empty Query Result":
irate(node_disk_read_bytes_total{instance="10.0.0.10:9100", device="sdc"}[1m]) + irate(node_disk_read_bytes_total{instance="10.0.0.10:9100", device="sdd"}[1m])
It may be important to note that I can't just sum all of the devices, I need to add only specific devices. I'm pretty new to this, but unfortunately, this seems to be one of those things that you can't google properly because of common words. Or maybe I just don't know the right question to ask.
Upvotes: 3
Views: 3700
Reputation: 18084
Prometheus performs arithmetic operations such as `q1 + q2` in the following way:
1. It selects all the time series for `q1` and `q2`.
2. For each time series on the left side of `+`, it searches for the corresponding time series on the right side of `+` with the same set of `label="value"` pairs. If it cannot find a matching series, the series on the left side is skipped.
3. If a matching series on the right side of `+` is found, Prometheus sums the point values of the series pair individually at each point timestamp.
4. If unprocessed series remain on the left side of `+`, it repeats the matching for the next one.
See these docs for details.
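The matching rule above can be sketched in a few lines of Python. This is only a simplified model, not how Prometheus is implemented: a series is represented as a `(labels, value)` pair at a single timestamp, `add_vectors` is a hypothetical helper name, and the byte rates are made-up numbers.

```python
def add_vectors(left, right):
    """Simplified model of `left + right`: pair series with identical label sets."""
    # Index the right-hand series by their full label set.
    right_by_labels = {frozenset(labels.items()): value for labels, value in right}
    result = []
    for labels, value in left:
        key = frozenset(labels.items())
        if key in right_by_labels:
            result.append((labels, value + right_by_labels[key]))
        # No series with the same labels on the right: the left series is skipped.
    return result


# Illustrative values only (bytes per second).
sdc = [({"instance": "10.0.0.10:9100", "device": "sdc"}, 1200.0)]
sdd = [({"instance": "10.0.0.10:9100", "device": "sdd"}, 800.0)]

print(add_vectors(sdc, sdd))  # → [] because the device labels differ
print(add_vectors(sdc, sdc))  # identical label sets, so the values are added
```

In this model the first call returns nothing for the same reason the original query does: no pair of series shares an identical label set.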
In your case the query returns an empty result because the series on the left side of `+` carry the label `device="sdc"` while the series on the right side of `+` carry the label `device="sdd"`. This means that Prometheus cannot find any pair of series with identical label sets on the left and right sides of `+`, so every left-hand series is skipped. See the matching step in the algorithm above.
The following workarounds exist for this issue:
1. Use `sum()` over a single selector that matches both devices:
sum(
  rate(node_disk_read_bytes_total{instance="10.0.0.10:9100", device=~"sdc|sdd"}[1m])
)
2. Use the `ignoring` modifier with the `+` operator to instruct Prometheus to ignore the `device` label when searching for matching series pairs with identical label sets:
rate(node_disk_read_bytes_total{instance="10.0.0.10:9100", device="sdc"}[1m])
  + ignoring(device)
rate(node_disk_read_bytes_total{instance="10.0.0.10:9100", device="sdd"}[1m])
See these docs for details.
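To see why both workarounds produce a result, here is a rough Python model under the same simplifications as before: one value per series, made-up byte rates, and hypothetical helper names `add_ignoring` and `sum_all`. It is a sketch of the idea, not of Prometheus internals.

```python
def add_ignoring(left, right, ignored):
    """Simplified model of `left + ignoring(<labels>) right`."""
    def key(labels):
        # Compare series on every label except the ignored ones.
        return frozenset((k, v) for k, v in labels.items() if k not in ignored)

    right_by_key = {key(labels): value for labels, value in right}
    result = []
    for labels, value in left:
        k = key(labels)
        if k in right_by_key:
            # The ignored labels are dropped from the resulting series.
            result.append((dict(k), value + right_by_key[k]))
    return result


def sum_all(series):
    """Simplified model of `sum(...)` without a grouping clause: all labels are dropped."""
    return sum(value for _, value in series)


# Illustrative values only (bytes per second).
sdc = [({"instance": "10.0.0.10:9100", "device": "sdc"}, 1200.0)]
sdd = [({"instance": "10.0.0.10:9100", "device": "sdd"}, 800.0)]

print(add_ignoring(sdc, sdd, {"device"}))
print(sum_all(sdc + sdd))
```

With `device` left out of the comparison, both series reduce to the same `instance` label and can be paired. The `sum()` variant avoids pairing altogether, which probably makes it the easier option when more than two devices have to be added.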
P.S. Using the `irate()` function isn't recommended here, since it doesn't capture spikes. It calculates the rate from only the last two raw samples in the lookbehind window, so the result is jumpy and may differ completely on every graph refresh. See this article for details.
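A rough illustration of the difference, assuming a 15-second scrape interval and made-up counter samples. `simple_rate` and `simple_irate` are hypothetical, heavily simplified stand-ins: the real `rate()` also handles counter resets and extrapolates to the window boundaries, which this sketch ignores.

```python
def simple_rate(samples):
    """Average per-second increase between the first and last sample in the window."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples):
    """Per-second increase between the last two samples only."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# (timestamp in seconds, counter value in bytes): a burst in the first 15 s, then a slow trickle.
samples = [(0, 0), (15, 9000), (30, 9150), (45, 9300), (60, 9450)]

print(simple_rate(samples))   # → 157.5, the burst is included in the average
print(simple_irate(samples))  # → 10.0, only the last two samples are used
```

In this made-up window the early burst shows up in the averaged rate, while the two-sample calculation misses it entirely.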
Upvotes: 2
Reputation: 4350
Maybe the metrics don't line up exactly in time? I tried `avg(metric1) - avg(metric2)` instead of `metric1 - metric2`, and it seemed to work.
Upvotes: 3