Reputation: 926
I have already looked for resources and articles on how to create alerts in the Grafana 8 UI for the CPU and/or memory usage of my Kubernetes cluster pods, but I couldn't find anything on YouTube, Google, Discord, Stack Overflow, or Reddit.
Does anyone know of a guide on how to do that?
The goal is to create an alert rule that sends a Slack message when the CPU or memory usage of my Kubernetes cluster pods exceeds X%. The Slack app that receives the Grafana message is working, but I have no idea what the Grafana query should look like.
PS.: I am using Prometheus and node-exporter.
Upvotes: 1
Views: 5177
Reputation: 2323
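You can try this query to create an alert that fires when usage is above a threshold (say 85%). This one returns each pod's CPU usage as a percentage of its CPU limit. $namespace and $pod are dashboard variables, so replace them with fixed values or drop those matchers when you use the query in an alert rule.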
sum(rate(container_cpu_usage_seconds_total{namespace="$namespace", pod="$pod", container!="POD", container!="", pod!=""}[1m])) by (pod) / sum(kube_pod_container_resource_limits{namespace="$namespace", pod="$pod", resource="cpu"}) by (pod) * 100
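As a rough sketch, the query above could be adapted for an alert rule as follows. It assumes the container_* metrics come from the kubelet's cAdvisor endpoint and the kube_* metrics from kube-state-metrics v2 (node-exporter alone does not expose either), and it groups by namespace as well as pod so that pods with the same name in different namespaces stay separate. The 85 is only the example threshold from this answer.

```promql
# CPU usage per pod as a percentage of the pod's CPU limit.
# No $namespace / $pod variables: every pod that has a CPU limit is evaluated.
  sum(rate(container_cpu_usage_seconds_total{container!="POD", container!=""}[5m])) by (namespace, pod)
/
  sum(kube_pod_container_resource_limits{resource="cpu"}) by (namespace, pod)
* 100

# Optional: append a comparison so only pods above the threshold are returned.
#   ... * 100 > 85
```

Pods without a CPU limit have no matching series on the right-hand side, so they drop out of the result. If the comparison is left out of the query, the threshold can be applied in the alert rule's condition instead.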
You can check the total CPU usage of all pods in the cluster (in cores) by running:
sum(rate(container_cpu_usage_seconds_total{container!="POD", container!="", pod!=""}[5m]))
If you want to check the CPU usage of each running pod, you can use:
sum(rate(container_cpu_usage_seconds_total{container!="POD", container!="", pod!=""}[5m])) by (pod)
On Kubernetes versions before 1.16 these labels are named container_name and pod_name instead of container and pod.
To see actual CPU usage, look at metrics such as container_cpu_usage_seconds_total (per-container CPU usage) or process_cpu_seconds_total (per-process CPU usage).
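The question also asks about memory, and the same ratio pattern can be sketched for it. This is an assumption-laden example rather than a tested rule: it uses the cAdvisor metric container_memory_working_set_bytes as the usage figure and, as with the CPU limit query, relies on kube-state-metrics v2 for the limits.

```promql
# Memory usage per pod as a percentage of the pod's memory limit.
# Both sides are in bytes, so the ratio is unitless before the * 100.
  sum(container_memory_working_set_bytes{container!="POD", container!=""}) by (namespace, pod)
/
  sum(kube_pod_container_resource_limits{resource="memory"}) by (namespace, pod)
* 100
```

As with CPU, pods that have no memory limit set will not appear in the result.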
You can create an alert rule in Grafana by following the steps provided in the documentation; refer to the link for more information.
Upvotes: 1