Matthias M

Reputation: 14830

Reasons for OOMKilled in Kubernetes

I'm trying to get a general understanding of OOMKilled events, and I've found 2 different reasons:

  1. Pod memory limit exceeded: If the Container continues to consume memory beyond its limit, the Container is terminated.

  2. Node out of memory: If the kubelet is unable to reclaim memory prior to a node experiencing system OOM, ... then kills the container ...

Questions

Upvotes: 19

Views: 38541

Answers (5)

A. S. Ranjan

Reputation: 418

In Bitnami Helm charts, I set resourcesPreset: "none" (not "nano") for all components. Then, if I need any specific settings, I put them in the resources: {} block instead, as in the sketch below.
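
A minimal values sketch, assuming a Bitnami-style chart that exposes resourcesPreset (the CPU and memory figures are placeholders, not recommendations):

resourcesPreset: "none"        # disable the chart's built-in size preset
resources:                     # set explicit values where needed instead
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 250m
    memory: 512Mi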

Upvotes: -1

Mihai Albert

Reputation: 1483

OOMKilled will only be reported for containers that have been terminated by the kernel OOM killer. It's important to note that it's the container that exceeds its memory limit that gets terminated (and by default restarted) as opposed to the whole pod (which can very well have other containers). Evictions on the other hand happen at the pod level, and are triggered by Kubernetes (specifically by the Kubelet running on every node) when the node is running low on memory*. Pods that have been evicted will report a status of Failed and a reason of Evicted.
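
As a rough sketch of how the two cases show up in the Pod object (field names are from the core Pod API; the values are illustrative):

# Container terminated by the OOM killer (the pod keeps running and the container is restarted):
status:
  containerStatuses:
    - name: app                # placeholder container name
      restartCount: 3
      lastState:
        terminated:
          reason: OOMKilled
          exitCode: 137
---
# Pod evicted by the Kubelet under node memory pressure:
status:
  phase: Failed
  reason: Evicted
  message: 'The node had condition: [MemoryPressure].'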

Details about the reason can be seen in the Kubelet events in the case of pod evictions (including how pods are sorted by memory usage and QoS class so the "victim" gets selected; a movie showing this is here) and in the kernel logs in the case of containers that get terminated by the OOM killer (a movie showing this is here).

*A node low on memory is a nuanced concept. If the node is truly out of memory, the OOM killer will act at the system level and terminate, from the list of all processes, one that it deems a suitable "target" (such a radical example is here). Kubernetes has a different approach: with the node allocatable feature enabled (which is currently the default), it "carves out" only a part of the node's memory for use by the pods. How much that is depends on the value of 3 parameters, captured in the previous link (kube-reserved, system-reserved, and eviction-threshold).
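
To make the three parameters concrete, here is a hedged sketch of a kubelet configuration (kubelet.config.k8s.io/v1beta1); the kube-reserved figure matches the AKS value quoted below, while the eviction threshold is just an illustrative value:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeReserved:
  memory: "1638Mi"             # set aside for Kubernetes system daemons (Kubelet, container runtime, ...)
# systemReserved could be set the same way; AKS currently leaves it unset
evictionHard:
  memory.available: "750Mi"    # illustrative eviction threshold: evict pods once free memory drops below this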

The catch is that the memory reserved via these 3 parameters will usually be larger than the actual memory usage they cover: e.g. on an AKS DS2_v2 node (7 GiB of memory) the kube-reserved value is set at 1,638 MiB. Here's what this looks like (AKS doesn't use the system-reserved flag currently):

[Image: Node Allocatable and node memory capacity distribution for a 7-GiB DS2_v2 AKS node]

kube-reserved is used to set aside memory for the Kubernetes system daemons (like the Kubelet), so that the pods don't end up consuming too much and potentially starving those daemons. If we assume the memory usage stays constant at 1,000 MiB for the Kubernetes daemons, the remaining 638 MiB in the black area above are still considered off-limits by Kubernetes. If the pods overall consume more than the "Allocatable" value for the node and start going into the red area above, and Kubernetes detects this in time (the Kubelet checks every 10s by default), it will evict pods. So even though the node is not technically out of memory, Kubernetes uses its own "buffer" and takes corrective action well before the system would be in a real (and possibly crippling) low-memory situation.
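
Sticking with the DS2_v2 numbers, and assuming the illustrative 750 MiB hard eviction threshold from the sketch above, the node would report roughly the following (an excerpt in the shape of kubectl get node -o yaml; real nodes report these figures in Ki):

status:
  capacity:
    memory: 7168Mi             # the full 7 GiB of physical memory
  allocatable:
    memory: 4780Mi             # 7168 - 1638 (kube-reserved) - 750 (eviction threshold)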

If the allocations happen so fast that the red area above gets filled before the Kubelet has a chance to spot it (by default it checks every 10s, as stated above), then the OOM killer will start terminating processes inside the pods' containers, and you'll end up with OOMKilled events. You may very well not see a pod eviction in this case. Things can get tricky, and I tried drawing a rough logical diagram of out-of-memory situations here: Flows leading to out-of-memory situations.

Upvotes: 9

blessedwithsins

Reputation: 1

An OOM kill happens when the pod runs out of memory: the container is killed because it exceeded the resource limits you set for it. You'll see exit code 137 for an OOM kill.
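
A minimal sketch of such a limit (the name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo            # placeholder name
spec:
  containers:
    - name: app
      image: nginx             # placeholder image
      resources:
        requests:
          memory: "128Mi"
        limits:
          memory: "256Mi"      # the container is OOM-killed (exit code 137) if it grows past this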

When the node itself is out of memory or another resource, it evicts the pod, and the pod gets rescheduled on another node. The evicted pod remains visible for further troubleshooting.

https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/

Upvotes: 0

Matthias M

Reputation: 14830

Both problems result in different error states:

1: An exceeded pod memory limit causes an OOMKilled termination.

2: A node out of memory causes MemoryPressure and a pod eviction.

kubectl describe pod mypod-xxxx

...
Reason:         Evicted
Message:        Pod The node had condition: [MemoryPressure].
...
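
For case 1, the container status section of kubectl describe pod would instead show something roughly like this:

...
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
...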

Upvotes: 14

paltaa

Reputation: 3244

This is related to Kubernetes QoS.

TL;DR: there are 3 different classes:

BestEffort: A pod with no resources defined; it is the first to get killed when the node runs out of resources.

Burstable: When you set resource requests and limits to different values. The requested amount is assured, but if the pod needs to "burst" beyond it, that extra capacity is shared with other workloads and depends on how much is in use at that point, so it is not guaranteed.

Guaranteed: When you set the resource requests and limits to the same values; in that case the resources are assured to the pod. If the node runs short of resources, these pods will be the last to be killed (see the sketch below).
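
As an illustrative sketch (placeholder names), a pod that lands in the Guaranteed class because its requests and limits match:

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo               # placeholder name
spec:
  containers:
    - name: app
      image: nginx             # placeholder image
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
        limits:
          cpu: "250m"          # equal to the requests, so status.qosClass becomes Guaranteed
          memory: "256Mi"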

Upvotes: 7
