Blured Derulb

Reputation: 241

Get the last value of a metric in a Datadog dashboard

I want to display the last value of a metric in a QueryValue widget on my Datadog dashboard.

For the moment, I'm using:

            "queries": [
              {
                "query": "max:blabla.mycount{$env}",
                "data_source": "metrics",
                "name": "query1",
                "aggregator": "last"
              }
            ]

Is this the right way to do that? For this series of mycount values, [20,1,5,3,2], which number will be shown? Is it really the last one in the series (2), or the biggest one in the series (20)?

Regards, Blured.

Upvotes: 1

Views: 6068

Answers (1)

draav

Reputation: 1943

So there are going to be three levels of aggregation to consider: the Time Aggregation and Space Aggregation of your query, and then the aggregation of the query value widget on the frontend (which is what you're asking about). For now, let's understand the first two by thinking of a time series widget, and then we'll see what happens with the query value widget after.


Space aggregation is the simplest one. The idea is that you have multiple time series being submitted from multiple applications/servers. If 20 computers send a metric all at the same time, which value should we display? You decide that with the aggregation chunk of your query; yours is currently set to max.

The idea is that you have to decide which out of the dozens or hundreds of instances of your metric is the one you want to display.

If you don't want to worry about space aggregation, you have to make your query specific enough that only 1 time series exists for that metric. For example, a cpu metric will need to be scoped to at least the hostname. For a container metric, hostname isn't enough; you would need at least the container_id. For a database there should be a db_identifier or something that gets you just 1 result back.
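To make the space aggregation step concrete, here is a small Python sketch. It is only a model of the idea, not Datadog's implementation: the host names, the sample values, and the `space_aggregate` helper are all made up for illustration.

```python
# Hypothetical data: three hosts reporting the same metric at the same timestamps.
series_by_host = {
    "host-a": [20, 1, 5, 3, 2],
    "host-b": [4, 7, 5, 9, 1],
    "host-c": [6, 2, 8, 3, 0],
}

def space_aggregate(series_by_source, combine):
    """Collapse several aligned series into one, timestamp by timestamp."""
    return [combine(values) for values in zip(*series_by_source.values())]

# "max:" keeps the largest value reported at each timestamp.
max_series = space_aggregate(series_by_host, max)
print(max_series)  # [20, 7, 8, 9, 2]

# "sum:" would add the hosts together instead.
sum_series = space_aggregate(series_by_host, sum)
print(sum_series)  # [30, 10, 18, 15, 3]
```

Note that the result is still a series with one value per timestamp. The max here works across hosts, not across time.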


Now for time aggregation, let's look at the docs a bit:

As Datadog stores data at a 1 second granularity, it cannot display all real data on graphs. See How data is aggregated in graphs for more details.

For a graph on a 1-week time window, it would require sending hundreds of thousands of values to your browser—and besides, not all these points could be graphed on a widget occupying a small portion of your screen.

...

The Datadog backend tries to keep the number of intervals to a number below ~300.

https://docs.datadoghq.com/dashboards/guide/query-to-the-graph/#proceed-to-time-aggregation

So for example, if you are looking at a 5 minute window, the time aggregation will be as granular as possible. There are 300 seconds in 5 minutes, so every interval on the graph will represent 1 second. If we zoomed out to 10 minutes (600 seconds), we can only show data every 2 seconds. So each bucket will represent 2 data points (assuming the metric is submitted every second).

In most scenarios your metrics are being submitted at a 15 second interval. So you won't notice any time aggregation rollups until 15*300=4500 seconds (a bit over an hour).

You control this with the rollup function, as described in the docs. If you don't want to worry about time aggregation, just make sure your time range is zoomed in enough to not have any bucketing.
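The bucket arithmetic above can be sketched in Python. This is a rough model under the assumption that the backend simply caps the graph at 300 intervals; the real logic may differ, and `bucket_seconds` and `rollup` are hypothetical helpers, not Datadog functions.

```python
import math

MAX_BUCKETS = 300  # approximate cap quoted from the docs; assumed to be exact here

def bucket_seconds(window_seconds, submit_interval=1):
    """Smallest bucket width (in seconds) that keeps the graph at or under MAX_BUCKETS."""
    width = math.ceil(window_seconds / MAX_BUCKETS)
    # A bucket can never be finer than the interval the metric is submitted at.
    return max(width, submit_interval)

def rollup(points, per_bucket, combine):
    """Group consecutive points into buckets and reduce each bucket to one value."""
    return [combine(points[i:i + per_bucket]) for i in range(0, len(points), per_bucket)]

print(bucket_seconds(5 * 60))     # 1  -> no bucketing on a 5 minute window
print(bucket_seconds(10 * 60))    # 2  -> each bucket covers 2 one-second points
print(bucket_seconds(4500, 15))   # 15 -> still one submitted point per bucket
print(bucket_seconds(9000, 15))   # 30 -> two submitted points per bucket

# With 2 points per bucket, the series from the question shrinks to 3 values.
print(rollup([20, 1, 5, 3, 2], 2, max))  # [20, 5, 2]
```

The `combine` argument stands in for whatever rollup method applies to your query, so the bucketed values depend on that choice as well as on the window size.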


And now for the last level of aggregation, the query value widget. You now have obtained a set of up to 300 points from the backend; space and time aggregation have already been applied. Out of those datapoints, which one do you want to display? You could choose the last point, or a sum of the points, or whatever. That choice is the "aggregator" field in your widget definition, so with "last" you get the last of those points.
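As a sketch of that final step, assume the points that reach the widget (after space and time aggregation) are exactly the series from the question. The `reducers` table and `query_value` function below are illustrative names, not part of any Datadog API.

```python
# Points as they arrive at the widget, after space and time aggregation.
points = [20, 1, 5, 3, 2]

# One reducer per widget aggregator; this mapping is an assumption for illustration.
reducers = {
    "last": lambda pts: pts[-1],
    "max": max,
    "min": min,
    "sum": sum,
    "avg": lambda pts: sum(pts) / len(pts),
}

def query_value(pts, aggregator):
    """Reduce the whole series to the single number the widget displays."""
    return reducers[aggregator](pts)

print(query_value(points, "last"))  # 2
print(query_value(points, "max"))   # 20
print(query_value(points, "sum"))   # 31
```

Under this model, "aggregator": "last" displays 2 for that series. The max: prefix in the query string belongs to space aggregation, so it does not turn the displayed value into 20.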


Hopefully that helps!

Upvotes: 8
