Reputation: 533
I am trying to build a dashboard in Grafana for several APIs (Java applications). We started exporting the metrics to Prometheus using these dependencies:
val prometheus_scdw = "io.prometheus" % "simpleclient_dropwizard" % "0.0.23"
val prometheus_schs = "io.prometheus" % "simpleclient_hotspot" % "0.9.0"
val prometheus_scg = "io.prometheus" % "simpleclient_guava" % "0.9.0"
The metrics we see in the exporter look like this (just an example):
# HELP controllers_autouserprofilecontroller_autologin_post_seconds_max
# TYPE controllers_autouserprofilecontroller_autologin_post_seconds_max gauge
controllers_autouserprofilecontroller_autologin_post_seconds_max 0.075604753
# HELP controllers_autouserprofilecontroller_autologin_post_seconds
# TYPE controllers_autouserprofilecontroller_autologin_post_seconds summary
controllers_autouserprofilecontroller_autologin_post_seconds_count 2529959.0
controllers_autouserprofilecontroller_autologin_post_seconds_sum 80214.121718928
I looked on GitHub to understand what exactly count, sum, and max mean, but I didn't find any explanation. So I am going with the standard meanings of these words: count is the number of requests served, sum is the total time taken to serve those requests, and max is the longest time taken to serve a single request.
Still, I wanted to ask whether there is a better way or resource for understanding these metrics.
I also ran a throughput query on http_request_total to compare against the request counts in ALB monitoring, and the numbers don't match.
Query used: sum(increase(http_request_total[1m]))
Is there anything I am missing here, or is a small percentage of mismatch acceptable?
My goal is to build a dashboard for API performance, given that we currently export the metric types shown above for all the APIs.
Upvotes: 2
Views: 4995
Reputation: 18084
The controllers_autouserprofilecontroller_autologin_post_seconds_count metric is a counter, which counts the number of requests over time. So the average RPS can be calculated by applying rate() to it:
rate(controllers_autouserprofilecontroller_autologin_post_seconds_count[5m])
The [5m] is a lookbehind window (5 minutes in this case) for calculating the average RPS. See the allowed time durations in the Prometheus docs.
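To make the idea of a per-second average over a lookbehind window concrete, here is a minimal Python sketch of the arithmetic. It is a simplification, not how Prometheus implements rate(): real PromQL also handles counter resets and extrapolates to the window boundaries. The function name and the sample values are hypothetical.

```python
# Simplified illustration of the arithmetic behind rate(counter[5m]).
# Assumption: only the first and last samples in the window are used, with
# no counter-reset handling and no extrapolation (real PromQL does both).

def simple_rate(first_sample, last_sample):
    """Average per-second increase between two (timestamp_seconds, value) samples."""
    t0, v0 = first_sample
    t1, v1 = last_sample
    if t1 <= t0:
        raise ValueError("last sample must be newer than first sample")
    return (v1 - v0) / (t1 - t0)

# Hypothetical samples of ..._seconds_count taken 300 seconds (5 minutes) apart.
first = (0.0, 2529359.0)
last = (300.0, 2529959.0)

# 600 requests over 300 seconds.
requests_per_second = simple_rate(first, last)
print(requests_per_second)
```

With these made-up samples, the counter grew by 600 over 300 seconds, so the sketch reports an average of 2 requests per second.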
The average request duration over the last 5 minutes can be calculated with the following query:
increase(controllers_autouserprofilecontroller_autologin_post_seconds_sum[5m])
/
increase(controllers_autouserprofilecontroller_autologin_post_seconds_count[5m])
It uses the increase() function for calculating counter increases over the last 5 minutes.
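The division above can be sketched in Python as follows. This is only an illustration of the arithmetic, assuming we already have the _sum and _count values at the start and end of the window. The function name and the start-of-window values are hypothetical, while the end-of-window values are taken from the exporter output in the question.

```python
# Illustration of increase(..._sum[5m]) / increase(..._count[5m]).
# Assumption: the window's increase is just end value minus start value,
# with no counter-reset handling or extrapolation.

def average_duration(sum_start, sum_end, count_start, count_end):
    """Average seconds per request for the requests seen inside the window."""
    count_delta = count_end - count_start
    if count_delta <= 0:
        return None  # no requests in the window, so the average is undefined
    return (sum_end - sum_start) / count_delta

# End-of-window values from the question; start-of-window values are made up.
window_avg = average_duration(
    sum_start=80195.121718928, sum_end=80214.121718928,
    count_start=2529359.0, count_end=2529959.0,
)
print(window_avg)  # about 19 seconds of total latency spread over 600 requests

# Dividing the raw _sum by the raw _count gives the average since process start.
lifetime_avg = 80214.121718928 / 2529959.0
print(lifetime_avg)
```

The lifetime average comes out at roughly 32 ms per request, but it changes very slowly once the counters are large, which is why the windowed form is usually more useful on a dashboard.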
Upvotes: 2