Reputation: 1149
I have a PostgreSQL cluster with three nodes managed by Patroni. The cluster handles a very high workload and for that reason runs in production on bare metal machines. We need to migrate this infrastructure to Kubernetes (for several reasons), so I am doing some performance tests with PgBench. First I compared bare metal vs a virtual machine and saw very little degradation. Then I compared VSI vs Kubernetes to understand the overhead added by K8s.
Now I am trying to fine-tune CPU and memory. K8s runs on worker nodes with 48 vCPU and 192 GB of RAM. However, once PostgreSQL is deployed I still see:
NAME                                     CPU(cores)   MEMORY(bytes)
postgresql-deployment-5c98f5c949-q758d   2m           243Mi
even though I allocated the following to the PostgreSQL container:
resources:
  requests:
    memory: 64Gi
  limits:
    memory: 64Gi
If I run:
kubectl top pod <pod name> -n <namespace>
I get the following:
NAME                                     CPU(cores)   MEMORY(bytes)
postgresql-deployment-5c98f5c949-q758d   2m           244Mi
The same appears in the K8s dashboard, even though the output of:
kubectl describe pod <pod name> -n <namespace>
shows that the Pod runs with Guaranteed QoS and 64Gi of RAM for both request and limit.
How is this supposed to work?
Another thing I don't understand is the CPU limit and request. I expected to enter something like this:
resources:
  requests:
    cpu: 40
    memory: 64Gi
  limits:
    cpu: 40
    memory: 64Gi
I expected to reserve 40 vCPU for my container, but during deployment I see "insufficient cpu" on the node when I run kubectl describe pod <pod name> -n <namespace>. The max value I can use is 1.
How is this supposed to work?
Obviously, I read the documentation and searched for different examples, but when I put things into practice the test results differ from the theory. I know I am missing something.
Upvotes: 1
Views: 6791
Reputation: 286
This is a great question, and it also took me some time earlier this year to figure out by experience.
It is important to understand that requests have no actual effect on the resource usage of containers. You can check this by connecting to your server and running htop, or kubectl top like you did, and you will see that even though you defined requests: memory: 64Gi, only 244Mi are actually used.
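A quick way to see this side by side is to compare what is configured on the Pod with what the metrics server actually measures. This is only a sketch, reusing the pod name from your question; adjust the namespace to yours:

# QoS class and configured requests/limits of the first container
kubectl get pod postgresql-deployment-5c98f5c949-q758d -n <namespace> \
  -o jsonpath='{.status.qosClass}{"\n"}{.spec.containers[0].resources}{"\n"}'

# Actual usage as reported by the metrics server
kubectl top pod postgresql-deployment-5c98f5c949-q758d -n <namespace>

The first command shows what Kubernetes has reserved and will enforce, the second what PostgreSQL currently consumes; a large gap between the two is completely normal.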
The main purpose of requests is to influence scheduling behavior. When the Kubernetes scheduler looks for a fitting node to place a new Pod on, it checks the CPU and memory currently requested on each node. You can check the current status of a node yourself by running the following command.
$ kubectl describe node worker01
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                200m (10%)   1100m (55%)
  memory             506Mi (13%)  2098Mi (54%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-1Gi      0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)
If the requests (of either CPU or memory) would exceed 100%, the Pod can't be scheduled and goes into a Pending state.
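This is also why your cpu: 40 request can fail on a 48 vCPU node: the scheduler compares the request against the node's allocatable CPU (capacity minus what is reserved for the kubelet and system daemons), minus what is already requested by every other pod on that node, including the ones in kube-system. As a sketch, reusing the node name from the output above, you can compare the two values yourself:

# Allocatable is what the scheduler can actually hand out; it is typically lower than capacity
kubectl get node worker01 \
  -o jsonpath='capacity: {.status.capacity.cpu}{"\n"}allocatable: {.status.allocatable.cpu}{"\n"}'

# For a Pending pod, the Events section at the end tells you exactly which resource did not fit
kubectl describe pod <pod name> -n <namespace>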
Setting the correct requests can be quite tricky: if you set them too high, you won't use the resources of your node efficiently, as you can't schedule that many pods; if you set them too low, you are in danger of applications constantly crashing or being throttled during performance peaks.
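As a rough illustration of that trade-off (the numbers below are invented for the example, not derived from your workload): requests close to the steady-state usage you observe with kubectl top, and limits that leave headroom for peaks.

resources:
  requests:
    cpu: 2         # roughly the steady-state usage observed with kubectl top
    memory: 4Gi
  limits:
    cpu: 4         # headroom for peaks; beyond this the container is throttled
    memory: 8Gi    # beyond this the container is OOM-killed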
The main purpose of limits is to cap the maximum resource usage of Pods.
Because CPU can be compressed, Kubernetes will make sure your containers get the CPU they requested and will throttle the rest. Memory cannot be compressed, so Kubernetes needs to start making decisions on which containers to terminate if the node runs out of memory [1].
So if a container exceeds its CPU limit it gets throttled, and if it exceeds its memory limit it gets terminated. This led to the best practice in my company of not putting limits on databases in our cluster.
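For completeness, a sketch of what that pattern looks like (the request values are placeholders): requests are still set so the scheduler reserves capacity for the database, but the limits section is omitted so the container is never throttled or OOM-killed at an artificial cap. Note that the Pod then falls into the Burstable instead of the Guaranteed QoS class, and it can still be terminated if the node itself runs out of memory, as described in [1].

resources:
  requests:
    cpu: 20        # reserved for scheduling; placeholder value
    memory: 32Gi
  # no limits: the database may use spare node capacity instead of being capped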
The referenced blog posts helped me get some good insights:
[1] https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-resource-requests-and-limits
[2] https://sysdig.com/blog/kubernetes-limits-requests/
Upvotes: 5