user2342643

Reputation: 35

Azure Kubernetes Service (AKS) - Pod restart alert

I want to create an alert rule that fires when a pod restarts, e.g. if the pod restarts twice within a 30-minute window.

I have the following log analytics query:

KubePodInventory
| where ServiceName == "xxxx"
| project PodRestartCount, TimeGenerated, ServiceName
| summarize AggregatedValue = count(PodRestartCount) by ServiceName, bin(TimeGenerated, 30m) 

But setting the alert threshold to 2 won't work in this case, because PodRestartCount is not reset between windows. Any help would be greatly appreciated. Maybe there is a better approach that I'm missing.

Upvotes: 2

Views: 5017

Answers (1)

djsly

Reputation: 1628

To reset the count between bin() intervals, you can use the prev() function on serialized output to compute the difference between consecutive bins:

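KubePodInventory
| where ServiceName == "<service name>"
| where Namespace == "<namespace name>"
// Aggregate the restart counter into 30-minute bins
| summarize AggregatedPodRestarts = sum(PodRestartCount) by bin(TimeGenerated, 30m)
// prev() requires a serialized (ordered) row set
| serialize
// Compare each bin with the row before it
| extend prevPodRestarts = prev(AggregatedPodRestarts, 1)
| extend diff = AggregatedPodRestarts - prevPodRestarts
// Keep only bins where the value grew by at least 2
| where diff >= 2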

This outputs the difference between consecutive bins:

TimeGenerated [UTC]         prevPodRestarts  diff       AggregatedPodRestarts
5/12/2020, 12:00:00.000 AM  1,368,477        191,364    1,559,841
5/11/2020, 11:00:00.000 PM  1,552,614        3,594      1,556,208
5/11/2020, 10:00:00.000 PM  182,217          1,370,397  1,552,614
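
As a variation on the query above, here is a minimal, self-contained sketch of the same prev() technique applied per pod. It is an illustration rather than a tested alert rule. The datatable rows are made-up sample data standing in for KubePodInventory, and taking max(PodRestartCount) per pod (the Name column) is an assumption about how the counter is reported: if it only ever increases, the maximum within a bin is its latest value, so the difference between consecutive bins is the number of restarts in that window.

// Hypothetical sample rows standing in for KubePodInventory
datatable(TimeGenerated: datetime, Name: string, PodRestartCount: long)
[
    datetime(2020-05-11 22:05:00), "pod-a", 3,
    datetime(2020-05-11 22:20:00), "pod-a", 3,
    datetime(2020-05-11 22:35:00), "pod-a", 4,
    datetime(2020-05-11 22:50:00), "pod-a", 6,
    datetime(2020-05-11 23:10:00), "pod-a", 6,
]
// Latest counter value seen for each pod in each 30-minute bin
| summarize RestartsSoFar = max(PodRestartCount) by Name, bin(TimeGenerated, 30m)
// sort produces an ordered row set, which prev() requires
| sort by Name asc, TimeGenerated asc
| extend PrevRestarts = prev(RestartsSoFar, 1), PrevName = prev(Name, 1)
// Only diff against the previous row when it belongs to the same pod
| extend diff = iff(PrevName == Name, RestartsSoFar - PrevRestarts, long(null))
| where diff >= 2

With this sample data, only the 22:30 bin should remain, with a diff of 3. To run it against real data, replace the datatable with KubePodInventory and the ServiceName and Namespace filters.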

ref: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/serializeoperator

https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/prevfunction

Upvotes: 4
