Vowneee

Reputation: 1481

What is the best practice for setting request and limit values on a pod in K8s?

Recently we faced an issue in our AKS cluster where the nodes' reserved memory kept increasing because the pods' memory request was set high (request: 2Gi, limit: 2Gi), which in turn increased the node count. So, in order to reduce the node count, we lowered the memory request to 256Mi and kept the limit at the same value (2Gi). After this we noticed some strange behaviour in our cluster.
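
For reference, a resources block matching the change described above would look roughly like this (a sketch based only on the values mentioned; CPU settings are omitted because they aren't stated):

resources:
  requests:
    memory: "256Mi"   # lowered from 2Gi so more pods fit per node
  limits:
    memory: "2Gi"     # left at the original value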

  1. There is a big difference between the request and limit percentages of our resources. More specifically, the limits are showing 602% and 478% of the node's actual capacity, while the request percentages are much lower. Is it normal, or is there any harm in keeping this difference between request and limit?
 Resource                       Requests      Limits
  --------                       --------      ------
  cpu                            1895m (99%)   11450m (602%)
  memory                         3971Mi (86%)  21830Mi (478%)
  ephemeral-storage              0 (0%)        0 (0%)
  hugepages-1Gi                  0 (0%)        0 (0%)
  hugepages-2Mi                  0 (0%)        0 (0%)
  attachable-volumes-azure-disk  0             0
  2. We noticed that our node's memory consumption is showing more than 100%, which is strange behaviour: how can a node consume more memory than it actually has?
NAME                                CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
aks-nodepoolx-xxxxxxxx-vmss00000x   151m         7%     5318Mi          116%
NAME                                    READY   STATUS    RESTARTS   AGE
mymobile-mobile-xxxxx-ddvd6   2/2     Running   0          151m
myappsvc-xxxxxxxxxx-2t6gz     2/2     Running   0          5h3m
myappsvc-xxxxxxxxxx-4xnsh     0/2     Evicted   0          4h38m
myappsvc-xxxxxxxxxx-5b5mb     0/2     Evicted   0          4h28m
myappsvc-xxxxxxxxxx-5f52g     0/2     Evicted   0          4h19m
myappsvc-xxxxxxxxxx-5f8rz     0/2     Evicted   0          4h31m
myappsvc-xxxxxxxxxx-66lc9     0/2     Evicted   0          4h26m
myappsvc-xxxxxxxxxx-8cnfb     0/2     Evicted   0          4h27m
myappsvc-xxxxxxxxxx-b9f9h     0/2     Evicted   0          4h20m
myappsvc-xxxxxxxxxx-dfx9m     0/2     Evicted   0          4h30m
myappsvc-xxxxxxxxxx-fpwg9     0/2     Evicted   0          4h25m
myappsvc-xxxxxxxxxx-kclt8     0/2     Evicted   0          4h22m
myappsvc-xxxxxxxxxx-kzmxw     0/2     Evicted   0          4h33m
myappsvc-xxxxxxxxxx-lrrnr     2/2     Running   0          4h18m
myappsvc-xxxxxxxxxx-lx4bn     0/2     Evicted   0          4h32m
myappsvc-xxxxxxxxxx-nsc8t     0/2     Evicted   0          4h29m
myappsvc-xxxxxxxxxx-qmlrj     0/2     Evicted   0          4h24m
myappsvc-xxxxxxxxxx-qr75w     0/2     Evicted   0          4h27m
myappsvc-xxxxxxxxxx-tf8bn     0/2     Evicted   0          4h20m
myappsvc-xxxxxxxxxx-vfcdv     0/2     Evicted   0          4h23m
myappsvc-xxxxxxxxxx-vltgw     0/2     Evicted   0          4h31m
myappsvc-xxxxxxxxxx-xhqtb     0/2     Evicted   0          4h22m

Upvotes: 6

Views: 7550

Answers (2)

zangw

Reputation: 48526

There is a lot of discussion about removing CPU limits on K8s.

Best practices for CPU limits and requests on Kubernetes

  • Use CPU requests for everything and make sure they are accurate
  • Do NOT use CPU limits.

Best practices for Memory limits and requests on Kubernetes

  • Use memory limits and memory requests
  • Set memory limit = memory request (see the example below)
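
Putting the two lists together, a minimal sketch of a pod spec following these recommendations could look like the following (the names, image, and values are placeholders, not a recommendation for any particular workload):

apiVersion: v1
kind: Pod
metadata:
  name: example          # placeholder name
spec:
  containers:
  - name: app            # placeholder container name
    image: nginx         # placeholder image
    resources:
      requests:
        cpu: "500m"      # accurate CPU request, but no CPU limit
        memory: "1Gi"    # memory request ...
      limits:
        memory: "1Gi"    # ... set equal to the memory limit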

Checking the throttling rate of your pods

Just log in to the pod and run cat /sys/fs/cgroup/cpu,cpuacct/kubepods/{PODID}/{CONTAINERID}/cpu.stat.

  • nr_periods — total number of enforcement periods that have elapsed
  • nr_throttled — number of those periods in which the container was throttled
  • throttled_time — total time the container was throttled, in ns

Upvotes: 14

David Maze

Reputation: 159790

The large number of Evicted pods suggests you've set the resource requests too low. An 8x difference between requests and limits "feels" very large to me.

Given your setup, the kubectl describe node output looks about right to me. Notice that the resource requests are very close to 100%: Kubernetes will keep scheduling pods on a node until its resource requests reach 100%, regardless of what the corresponding limits are. So if you've managed to schedule 7x 256 MiB request pods, that would request 1,792 MiB of memory (about 88% of a 2 GiB node); and if each pod specifies a limit of 2 GiB, then the total limits would be 7x 2048 MiB, or 14,336 MiB (700% of the physical capacity).

If the actual limits are that much above the physical capacity of the system, and the pods are actually using that much memory, then the system will eventually run out of memory. When this happens, a pod will get Evicted; which pod depends on how much its actual usage exceeds its request, even if it's within its limit. Node-pressure Eviction in the Kubernetes documentation describes the process in more detail.

Setting these limits well is something of an art. If the requests and limits are equal, then the pod will never be evicted (its usage can't exceed its requests); but in this case if the pod isn't using 100% of its requested memory then the node will be underutilized. If they're different then it's easier to schedule pods on fewer nodes, but the node will be overcommitted, and something will get evicted when actual memory usage increases. I might set the requests to the expected (observed) steady-state memory usage, and the limits to the highest you'll ever expect to see in normal operation.
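
For example, if a pod's observed steady-state usage were around 512 MiB with occasional peaks near 1 GiB (numbers assumed purely for illustration), that advice would translate into a resources block along these lines:

resources:
  requests:
    memory: "512Mi"   # expected (observed) steady-state usage
  limits:
    memory: "1Gi"     # highest usage expected in normal operation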

Upvotes: 5
