Reputation: 31
Let's assume that I run K8s in AWS on a node with 2 vCPUs. I would like to understand the best practices for balancing the number of pods against the CPU requested per pod.
For example, let's use these 2 scenarios:
I can set resources.requests.cpu = 1000m with maxReplicas = 2, and it will use all of the available CPU: 1000m * 2 = 2 vCPUs.
I can set resources.requests.cpu = 100m with maxReplicas = 20, and it will also use all of the available CPU: 100m * 20 = 2 vCPUs.
In which scenario will my system run faster? Is it better to plan for many pods with small CPU requests, or for a few pods with large CPU requests? Are there any recommendations or guidelines, or should performance tests be run to identify the optimal configuration?
Upvotes: 0
Views: 2027
Reputation: 159998
This depends heavily on your application and its characteristics; it's impossible to say which setup will run faster without actually profiling it.
Running many very CPU-constrained pods will only help you if:
In the situation you describe, setting very low resource requests like this will help you squeeze more pods onto the one small system. If all of the pods do decide to start doing work, though, they'll still be limited by the physical CPU (and things won't run faster; they'll probably run marginally slower, since the system has more scheduling overhead). If you also run out of physical memory then the pods will start getting OOMKilled, which will probably not be good for your overall performance at all.
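To make the arithmetic concrete, here is a minimal Python sketch. It assumes an idealized node where the full 2000m is available to pods; real nodes reserve part of their CPU for the kubelet and system daemons, so the schedulable amount is somewhat lower. The function names are made up for illustration and are not part of any Kubernetes API.

```python
def pods_that_fit(allocatable_millicores: int, request_millicores: int) -> int:
    """How many pods the scheduler can place, judged purely by CPU requests."""
    return allocatable_millicores // request_millicores


def delivered_millicores(pod_count: int,
                         demand_per_pod_millicores: int,
                         physical_millicores: int) -> int:
    """Total CPU the node can actually deliver when every pod is busy.

    The sum of what the pods want is capped by the physical CPU.
    """
    total_demand = pod_count * demand_per_pod_millicores
    return min(total_demand, physical_millicores)


NODE_CPU = 2000  # assumption: all 2 vCPUs are schedulable

few_big = pods_that_fit(NODE_CPU, 1000)     # scenario 1: 1000m requests
many_small = pods_that_fit(NODE_CPU, 100)   # scenario 2: 100m requests

# Suppose every pod tries to use a full core (1000m) at the same time.
print(few_big, delivered_millicores(few_big, 1000, NODE_CPU))
print(many_small, delivered_millicores(many_small, 1000, NODE_CPU))
```

Both layouts end up with the same total CPU when everything is busy; the split only changes how that total is divided between pods, which is why profiling the real application is the only way to tell which one is faster.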
If you can figure out some axis where every request requires some consistent amount of resource and adding more of that resource will let you process more concurrent requests, then setting up a HorizontalPodAutoscaler could be a good match. However, HPA's general assumption is that it can always create more pods up to its configured limit, and the cluster will expand to hold them (via a cluster autoscaler); that doesn't necessarily fit your scenario of having only a single small machine to work with.
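For reference, the core HPA calculation is documented as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the configured minimum and maximum. The Python sketch below reproduces only that formula; it leaves out the tolerance band, stabilization windows, and handling of unready pods that the real controller applies, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Simplified HPA scaling rule: scale in proportion to metric/target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # The HPA never goes outside its configured minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, raw))


# 2 pods averaging 90% CPU utilization against a 50% target.
print(desired_replicas(2, 90, 50, min_replicas=1, max_replicas=20))

# 15 pods at 100% against a 50% target would want 30, but is capped at 20.
print(desired_replicas(15, 100, 50, min_replicas=1, max_replicas=20))
```

Note that the formula only decides how many pods to ask for. Whether those pods can actually be scheduled still depends on the node having enough unrequested CPU and memory, which is the constraint on a single 2 vCPU machine.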
Upvotes: 2