Zvi Mints

Reputation: 1142

Prometheus: find max RPS

Say I have the following counter series in Prometheus (Ok and Failure, plus their Total):

Ok:

nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status="200"}

Failure:

nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status!="200"}

Total:

nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service"}

My question is: how can I write a PromQL query that finds the RPS at which failures occurred?

I'm expecting the following response:

400

Meaning that once a pod receives more than 400 RPS, the Failure metric starts to increase.


Full query (added after the question was answered):

sum((sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service"}[$__rate_interval])) without (status))
  and
  (sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status !="200"}[$__rate_interval])) without (status) > 0))

Upvotes: 1

Views: 3886

Answers (1)

valyala

Reputation: 18084

You need the following query:

rps_total and (rps_failure > 0)

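The and binary operator keeps only those left-hand time series for which the right-hand side contains a series with the same set of labels. The values and labels in the result come from the left-hand side. See the Prometheus docs on vector matching for details on the matching rules.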
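As a small illustration of that matching, here is a sketch that assumes rps_total and rps_failure are two sets of series which share an ingress label but may differ in their other labels. The on (...) modifier restricts the comparison to the listed labels:

```promql
# Keep an rps_total sample only if some rps_failure series with the same
# `ingress` label currently has a value above zero.
# The result carries the values and labels of rps_total (the left-hand side).
rps_total and on (ingress) (rps_failure > 0)
```

Without on (...) or ignoring (...), both sides must have exactly the same label set, which is why the status label has to be aggregated away on both sides in this case.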

Let's substitute rps_total and rps_failure with the actual time series, given the matching rules mentioned above.

  • The rps_total is substituted with sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service"}[$__rate_interval])) without (status). The rate(...) is needed because the metric is a counter: it converts the ever-growing counter into a per-second request rate. The sum(...) without (status) is needed in order to sum the rates across all values of the status label, grouped by the remaining labels.

  • The rps_failure is substituted with sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status!="200"}[$__rate_interval])) without (status)

Then the final PromQL query will look like this ($__rate_interval is the Grafana variable used in the question; outside Grafana, use a fixed window such as 5m):

(
  sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service"}[$__rate_interval])) without (status)
  and
  (sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status!="200"}[$__rate_interval])) without (status) > 0)
)
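To reduce the result to a single number such as the 400 in the question, one option is to wrap the expression in a subquery and take its minimum over a lookback period. The following is only a sketch: it assumes the counters are converted to per-second rates with rate(), as in the question's full query, and the 5m rate window, the 1d lookback and the 1m subquery step are arbitrary values chosen for illustration.

```promql
# Lowest total RPS observed during the last day at moments when the
# failure rate was above zero (one result per remaining label set).
# Assumed values: 5m rate window, 1d lookback, 1m subquery resolution.
min_over_time(
  (
    sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service"}[5m])) without (status)
    and
    (sum(rate(nginx_ingress_controller_requests{prometheus_from="$cluster", ingress="brand-safety-phoenix-service", status!="200"}[5m])) without (status) > 0)
  )[1d:1m]
)
```

Note that this reports the lowest load at which any non-200 response was seen. It does not show that load caused the failures, so a single unrelated error at low traffic would pull the number down.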

Upvotes: 1
