Erez Rabih

Reputation: 15808

Grafana self-metrics and tracking alert triggers

I'm using Grafana's (V6.5.2) native alerts system and I'm trying to figure out if there's a way to scrape metrics about Grafana itself.

Specifically, I'm looking for a time series that shows when each specific alert was triggered over time. The motivation is to track trends in alert triggers, so we can check whether our actions reduced the number of alerts as expected.

I had a look at the /metrics endpoint Grafana exposes and found grafana_alerting_result_total, but this is an aggregate over all alerts, not a per-alert time series.
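For reference, that counter is only broken down by result state, so it looks roughly like this on /metrics (a sketch; the exact label set and values depend on the Grafana build):

grafana_alerting_result_total{state="ok"} 42
grafana_alerting_result_total{state="alerting"} 5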

Is there a way to track alert state per specific alert?

Upvotes: 0

Views: 388

Answers (1)

Jan Garaj

Reputation: 28714

You can enable export of internal metrics into Graphite:

https://github.com/grafana/grafana/blob/v6.5.2/conf/defaults.ini#L611-L615

# Send internal Grafana metrics to graphite
[metrics.graphite]
# Enable by setting the address setting (ex localhost:2003)
address =
prefix = prod.grafana.%(instance_name)s.

So you will have the overall (aggregated) time series in Graphite.

For more granular per-alert stats you need to use the Grafana logs. For example, switch the Grafana log format to JSON, increase the log level to debug, and ingest the logs into Elasticsearch. Then you can filter by logger=alerting.engine and graph/group/process those log lines with per-alert granularity. Example log line:

{"alertId":453,"attemptID":1,"firing":true,"logger":"alerting.engine","lvl":"dbug","msg":"Job Execution completed","name":"Packet Loss alert","t":"2021-08-10T09:53:01.617388937Z","timeMs":75.277014}

Upvotes: 1
