Reputation: 13046
I'm exporting some metrics about running tasks, the available data includes the current number of tasks by their status and queue:
# TYPE gauge
tasks{queue="high", status="queued"} 2.0
tasks{queue="high", status="started"} 1.0
tasks{queue="high", status="successful"} 5.0
tasks{queue="high", status="failed"} 1.0
tasks{queue="low", status="queued"} 1.0
tasks{queue="low", status="started"} 2.0
tasks{queue="low", status="successful"} 3.0
tasks{queue="low", status="failed"} 2.0
These numbers change regularly when the tasks are added or expired from the database, so for example the failed
tasks number will go up and down depending on the tasks in the database at the time of collecting the data.
I don't have a way to get the total tasks count, so that's all the data that I have, I want to calculate the percentage of the tasks by their status
label and create a graph of this value using Grafana.
How the percentage should be calculated?
What I've tried so far:
Get the percentage of all successful tasks:
( sum(tasks{status="successful"}) / sum(tasks) ) * 100
Get the percentage of successful tasks by queue:
( sum(tasks{status="finished"}) by (queue) / sum(tasks) by (queue) ) * 100
How can I get this percentage by time? For example when setting the time range in Grafana? I can use the variable $__range
but how should I do the calculation?
I have other data where I have count metrics and I'm doing the following:
sum(increase(tasks_total{status="success"}[$__range])) /
sum(increase(tasks_total{status="started"}[$__range]))
But these are counts and these calculation don't apply to gauge metrics.
Upvotes: 4
Views: 10449
Reputation: 17800
The per-status
percentage for the currently running tasks can be calculated with the following PromQL query:
100 * (sum(tasks) by (status) / on() group_left() sum(tasks))
It uses on()
and group_left()
modifiers for /
operation in order to augment the default behavior for matching time series on the left and the right side of /
operation. See these docs for details.
If you need to get per-queue
per-status
percentage for the currently running tasks, then the following query should work:
100 * (
sum(tasks) by (queue, status)
/ on(queue) group_left()
sum(tasks) by (queue)
)
If you need per-status
percentage for tasks, which were run during the selected time $__range
, then the following query should work:
100 * (
sum(sum_over_time(tasks[$__range])) by (status)
/ on() group_left()
sum(sum_over_time(tasks[$__range]))
)
It uses sum_over_time function for aggregating the number of tasks over the given $__range
. The $__range
can be substituted with any supported time duration - see these docs.
Upvotes: 6
Reputation: 127
I think you need to use group_left. See an example: https://www.robustperception.io/using-group_left-to-calculate-label-proportions
In your case:
sum without (queue) tasks{status="successful"}
/ ignoring(status) group_left
sum without (queue, status) tasks
Upvotes: 0