Reputation: 11
I'd like to monitor pods with Prometheus rules so that I get an alert whenever a pod restarts. Does anyone have a sample Prometheus alert rule that looks like the one below, but for restarts?
- alert: KubePodCrashLooping
  annotations:
    message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf "%.2f" $value }} times / 5 minutes.
  expr: |
    rate(kube_pod_container_status_restarts_total{job="kube-state-metrics"}[15m]) * 60 * 5 > 0
  for: 1h
  labels:
    severity: critical
Upvotes: 1
Views: 2711
Reputation: 21
You can try this, which alerts if a container has restarted more than 5 times during the last hour:
- alert: PodRestarts
  annotations:
    message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf "%.2f" $value }} times during the last hour.
  expr: increase(kube_pod_container_status_restarts_total{container!~"kubernetes-vault-renew"}[1h]) > 5
  labels:
    severity: critical
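If you want an alert on any single restart rather than on more than 5 per hour, the same idea can be sketched with a threshold of 0 and a shorter window. This is only a sketch under assumptions: the group name `pod-restarts`, the alert name `PodRestarted`, the 10-minute window, and the `warning` severity are placeholders I picked for illustration, and the `job="kube-state-metrics"` selector is copied from the question and depends on how your scrape config labels that job.

```yaml
groups:
  - name: pod-restarts            # hypothetical group name
    rules:
      - alert: PodRestarted       # hypothetical alert name
        annotations:
          message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) restarted {{ printf "%.2f" $value }} times during the last 10 minutes.
        # increase() over a counter gives the (extrapolated) number of
        # restarts in the window; > 0 matches as soon as one is seen.
        # The 10m window is an arbitrary choice, tune it to your needs.
        expr: increase(kube_pod_container_status_restarts_total{job="kube-state-metrics"}[10m]) > 0
        # No "for:" clause, so the alert fires on the first evaluation
        # where the expression is true.
        labels:
          severity: warning       # assumed severity
```

Because `increase()` extrapolates to the edges of the window, the value can be fractional, which is why the message formats it with `%.2f`. Leaving out `for:` makes this fire immediately. With `for: 1h`, as in the rule from the question, the condition has to stay true for a full hour before the alert fires, so a single restart would not trigger it.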
Upvotes: 2