Atte

Reputation: 308

Reason for repeated pod eviction

A node on my 5-node cluster had memory usage peak at ~90% last night. Looking around with kubectl, a single pod (in a 1-replica deployment) was the culprit of the high memory usage and was evicted.

However, the logs show that the pod was evicted about 10 times (AGE corresponds to around the time memory usage peaked, and all evictions happened on the same node):

NAMESPACE           NAME                                  READY   STATUS                   RESTARTS   AGE
example-namespace   example-deployment-84f8d7b6d9-2qtwr   0/1     Evicted                  0          14h
example-namespace   example-deployment-84f8d7b6d9-6k2pn   0/1     Evicted                  0          14h
example-namespace   example-deployment-84f8d7b6d9-7sbw5   0/1     Evicted                  0          14h
example-namespace   example-deployment-84f8d7b6d9-8kcbg   0/1     Evicted                  0          14h
example-namespace   example-deployment-84f8d7b6d9-9fw2f   0/1     Evicted                  0          14h
example-namespace   example-deployment-84f8d7b6d9-bgrvv   0/1     Evicted                  0          14h
...

[node memory usage graph]

Status:         Failed
Reason:         Evicted
Message:        Pod The node had condition: [MemoryPressure].

My question is about how or why this situation would happen, and what steps I can take to debug and figure out why the pod was repeatedly evicted. The pod uses an in-memory database, so it makes sense that it eats up a lot of memory over time, but its memory usage on boot shouldn't be abnormal at all.

My intuition would have been that the high-memory pod gets evicted, the deployment replaces it, the new pod isn't using much memory yet, and all is fine. But the eviction happened many times, which doesn't make sense to me.
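For reference, the evictions can also be pulled straight from events (a minimal example using the namespace from the listing above; reason is a standard event field selector):

kubectl get events -n example-namespace --field-selector reason=Evicted --sort-by=.lastTimestamp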

Upvotes: 1

Views: 2328

Answers (1)

Bazhikov

Reputation: 841

The simplest first steps are to run the following commands to debug and read the logs from the affected Pod.

Look at the Pod's state and last restarts:

kubectl describe pods ${POD_NAME}
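If you prefer to grab the node name directly instead of scanning the describe output, a one-liner like this should work (a small sketch; spec.nodeName is a standard pod field that stays set even for evicted pods):

kubectl get pod ${POD_NAME} -o jsonpath='{.spec.nodeName}'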

Then run the same for the node the pod was scheduled on:

kubectl describe node ${NODE_NAME}

You will see useful information in the Conditions section (for example, whether MemoryPressure is True) and in the Events section at the bottom.
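To check just that condition, something like this should work (a sketch; MemoryPressure is one of the standard node condition types):

kubectl get node ${NODE_NAME} -o jsonpath='{.status.conditions[?(@.type=="MemoryPressure")].status}'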

Examine pod logs:

kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
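If the pod has only one container, the container name can be omitted:

kubectl logs --previous ${POD_NAME}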

If you want to watch the logs directly, rerun your pod and run:

kubectl logs ${POD_NAME} -f
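Since the Deployment creates a fresh pod name after every eviction, it can be easier to follow logs by the Deployment name instead (using the deployment from the question):

kubectl logs -f deployment/example-deployment -n example-namespace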

More info on the kubectl logs command and its flags is available in the kubectl reference documentation.

Upvotes: 1
