Reputation: 14830
I'm trying to get a general understanding of OOMKilled events, and I've found two different reasons:
Pod memory limit exceeded: If the Container continues to consume memory beyond its limit, the Container is terminated.
Node out of memory: If the kubelet is unable to reclaim memory prior to a node experiencing system OOM, ... then kills the container ...
Questions
Upvotes: 19
Views: 38541
Reputation: 418
In Bitnami Helm charts, I set all my resourcesPreset values to "none" (not "nano"). If I then need any specific settings, I put them in the resources: {} block:
resourcesPreset: "none"
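And when a specific workload needs explicit values, something along these lines (the numbers are arbitrary examples, and the exact placement of these keys varies between Bitnami charts, so treat this as a hedged sketch rather than any chart's exact schema):
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"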
Upvotes: -1
Reputation: 1483
OOMKilled will only be reported for containers that have been terminated by the kernel OOM killer. It's important to note that it's the container that exceeds its memory limit that gets terminated (and by default restarted), as opposed to the whole pod (which can very well have other containers). Evictions, on the other hand, happen at the pod level, and are triggered by Kubernetes (specifically by the Kubelet running on every node) when the node is running low on memory*. Pods that have been evicted will report a status of Failed and a reason of Evicted.
Details about the reason can be seen in the Kubelet events for pod evictions (including how pods are sorted based on memory usage and QoS class so the "victim" is selected - a video showing this is here), and in the kernel logs in the case of containers that get terminated by the OOM killer (a video showing this is here).
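For example, a rough way to find each signal (the pod name is a placeholder, and the exact kernel log wording varies by distribution):
kubectl describe pod <pod-name>                    # eviction reason shows up in the pod's status and events
kubectl get events --sort-by=.lastTimestamp        # kubelet eviction events
journalctl -k | grep -i -E "oom|killed process"    # kernel log entries left by the OOM killer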
*A node low on memory is a nuanced concept. If the node is truly out of memory, the OOM killer will act at the system level and terminate, from the list of all processes, one that it deems a suitable "target" (such a radical example is here). Kubernetes has a different approach: with the node allocatable feature enabled (which is currently the default) it "carves out" only a part of the node's memory for use by the pods. How much that is depends on the values of 3 parameters, captured in the previous link (kube-reserved, system-reserved, and eviction-threshold).
The catch is that the memory reserved by these 3 parameters will usually be larger than the respective actual memory usage: e.g. on an AKS DS2_v2 node (7 GiB of memory) the kube-reserved value is set at 1,638 MiB. Here's what this looks like (AKS doesn't currently use the system-reserved flag):
kube-reserved is used to set aside memory for the Kubernetes system daemons (like the Kubelet), so that the pods don't end up consuming too much and potentially starving those system daemons. If we assume the memory usage stays constant at 1,000 MiB for the Kubernetes daemons, the remaining 638 MiB in the black area above are still considered off-limits by Kubernetes. If the pods overall consume more than the "Allocatable" value for the node and start going into the red area above, and Kubernetes detects this in time (the Kubelet checks every 10s by default), it will evict pods. So even though the node is not technically out of memory, Kubernetes uses its own "buffer" and takes corrective action well before the system would be in a real (and possibly crippling) low-memory situation.
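To make the arithmetic concrete, here is a hedged sketch of the corresponding kubelet configuration. The 1,638 MiB kube-reserved figure is the one mentioned above; the 750 MiB hard eviction threshold is an assumed value for illustration only (check your own cluster's settings):
# Allocatable = node capacity - kube-reserved - system-reserved - eviction threshold
# e.g. 7,168 MiB - 1,638 MiB - 0 - 750 MiB ≈ 4,780 MiB left for pods
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeReserved:
  memory: "1638Mi"
evictionHard:
  memory.available: "750Mi"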
If the allocations happen so fast that the red area above gets filled before the Kubelet has a chance to spot it (by default it checks every 10s, as stated above), then the OOM killer will start terminating processes inside the pods' containers, and you'll end up with OOMKilled events. You may very well not see a pod eviction in this case. Things can get tricky, and I tried drawing a rough logical diagram of out-of-memory situations here: Flows leading to out-of-memory situations.
Upvotes: 9
Reputation: 1
An OOM kill happens when the Pod runs out of memory and gets killed because you've set resource limits on it. You'll see Exit Code 137 for OOM.
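For example, with a (purely illustrative) spec like the one below, a container that tries to use more than 128 MiB is killed by the OOM killer and shows Exit Code 137:
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx               # any image, used here only for illustration
    resources:
      requests:
        memory: "64Mi"
      limits:
        memory: "128Mi"        # exceeding this gets the container OOMKilled (137)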
When the node itself is out of memory or another resource, it evicts the Pod, which gets rescheduled on another node. The evicted Pod remains available for further troubleshooting.
https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/
Upvotes: 0
Reputation: 14830
Both problems result in different error states:
1: An exceeded pod memory limit causes an OOMKilled termination.
2: Node out of memory causes a MemoryPressure node condition and pod eviction.
kubectl describe pod mypod-xxxx
...
Reason: Evicted
Message: Pod The node had condition: [MemoryPressure].
...
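For case 1, the termination shows up on the container itself rather than as a pod eviction; the kubectl describe output looks roughly like this (trimmed):
kubectl describe pod mypod-xxxx
...
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
...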
Upvotes: 14
Reputation: 3244
This is related to Kubernetes QoS (Quality of Service) classes.
TL;DR: there are 3 different classes (example resources blocks follow the list):
BestEffort: a Pod with no resources defined; it is the first to get killed when the node runs out of resources.
Burstable: when you set resource requests and limits to different values. The request is assured, but anything the Pod "bursts" above it (up to the limit) is shared with other workloads and depends on how much resource is in use at that point, so it is not guaranteed.
Guaranteed: when you set resource requests and limits to the same values; those resources are assured to the Pod. When the node runs short of resources, these Pods are the last to be killed.
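As a hedged sketch, these are the kinds of resources blocks that produce each class (the values are arbitrary examples; Guaranteed additionally requires that every container in the Pod has equal requests and limits for both CPU and memory):
# BestEffort: no requests or limits at all
resources: {}

# Burstable: requests lower than limits
resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "256Mi"

# Guaranteed: requests equal to limits
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "250m"
    memory: "256Mi"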
Upvotes: 7