Michel
Michel

Reputation: 21

How to determine accurate request count in a time range with Spring Boot + Prometheus + Grafana

I just started trying to integrate micrometer, prometheus and Grafana into my microservices. At a first glance, it is very easy to use and there are many existing dashboard you can rely on. But the more I test the more it gets confusing. Maybe I don't understand the main idea behind this technology stack.

I would like to start my custom Grafana dashboard by showing the amount of request per endpoint for the selected time range (as a single stat), but I am not able to find the right query for that (and I am not sure it exists)

I tried different:

http_server_requests_seconds_count{uri="/users"}

Which always shows the current value. For example, if I sent 10 requests 30 minutes ago, this query will also return value 10 when I am changing changing the time range last 5 minutes (even though no request was entering the system during the last 5 minutes)

When I am using

increase(http_server_requests_seconds_count{uri="/users"}[$__range])

the query will not return the accurate value, instead something close to actual request amount. At least it works for a time range that doesn't include new incoming requests. In that case the query return 0.

So my question is, is there a way to use this Technology stack to get the amount of new requests for the selected period of time?

Upvotes: 1

Views: 3177

Answers (1)

anemyte
anemyte

Reputation: 20296

For the sake of performance when operating with millions of time series, many Prometheus functions show approximate and/or interpolated values. For example, the increase() function is basically a per-second rate() multiplied by the number of seconds in the interval. With such formula and possible missing data points, an accurate result is rather an exception than a normal thing.

The reason why it is so is that Prometheus exchanges accuracy for performance and reliability. It doesn't really matter if your server actual CPU usage is 86.3% instead of 86.4%, but it does matter whether you can get this information instantly. Prometheus even have this statement in their docs:

Prometheus values reliability. You can always view what statistics are available about your system, even under failure conditions. If you need 100% accuracy, such as for per-request billing, Prometheus is not a good choice as the collected data will likely not be detailed and complete enough. In such a case you would be best off using some other system to collect and analyze the data for billing, and Prometheus for the rest of your monitoring.

That being said, if you really need accurate values consider using something else. You can for example store logs and count lines (Grafana Loki, The Elastic Stack), or maybe write and retrieve this information from a traditional database with your own solution.

Upvotes: 1

Related Questions