Reputation: 31

Kubernetes amount of Pods vs amount of CPU requests

Let's assume that I run K8s in AWS on a node with 2 vCPUs. I would like to understand the best practices for balancing the number of pods against the CPU each pod requests.

For example, consider these two scenarios:

I can set resources.requests.cpu = 1000m with maxReplicas = 2, which claims all of the available CPU: 1000m * 2 = 2 vCPUs.
I can set resources.requests.cpu = 100m with maxReplicas = 20, which also claims all of the available CPU: 100m * 20 = 2 vCPUs.

In which scenario will my system run faster? Is it better to plan for many pods with small CPU requests, or for a few pods with large CPU requests? Are there any recommendations or guidelines, or should performance tests be run to identify the optimal configuration?

Upvotes: 0

Answers (1)

David Maze

Reputation: 159998

This depends heavily on your application and its characteristics; it's impossible to say which setup will run faster without actually profiling it.

Running many very CPU-constrained pods will only help you if:

An individual pod isn't using that much CPU to start with (you'll always be constrained by the physical hardware)
The application isn't especially multi-threaded or otherwise concurrent (or else you could launch more threads/goroutines/... to do more work)
You're not performance-bound by a remote database or something similar (your application can run up to the database's speed but no faster)
The base memory overhead of the pods also fits on the node

In the situation you describe, setting very low resource requests like this will help you squeeze more pods onto the one small system. If all of the pods do decide to start doing work, though, they'll still be limited by the physical CPU (and things won't run faster; they will probably run marginally slower, since the system has more scheduling overhead). If you also run out of physical memory then the pods will start getting OOMKilled, which will probably not be good for your overall performance at all.
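The arithmetic behind the two scenarios can be sketched roughly as follows. This is only an illustrative model, not a measurement: the node size (2000m of CPU, 4096 MiB of memory) and the 150 MiB per-pod base memory are made-up assumptions, the `Scenario` and `summarize` names are hypothetical, and a real node reserves part of its capacity for the kubelet and system daemons.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    replicas: int          # maxReplicas
    cpu_request_m: int     # resources.requests.cpu, in millicores
    base_memory_mib: int   # assumed per-pod baseline memory (hypothetical)


def summarize(s: Scenario, node_cpu_m: int = 2000, node_memory_mib: int = 4096) -> dict:
    """Back-of-the-envelope totals for one scenario on one assumed node."""
    total_memory_mib = s.replicas * s.base_memory_mib
    return {
        "total_cpu_request_m": s.replicas * s.cpu_request_m,
        # If every pod is busy at once, the physical CPU is shared among them.
        "cpu_per_busy_pod_m": node_cpu_m / s.replicas,
        "total_base_memory_mib": total_memory_mib,
        "memory_fits": total_memory_mib <= node_memory_mib,
    }


few_large = summarize(Scenario(replicas=2, cpu_request_m=1000, base_memory_mib=150))
many_small = summarize(Scenario(replicas=20, cpu_request_m=100, base_memory_mib=150))

for name, result in (("2 x 1000m", few_large), ("20 x 100m", many_small)):
    print(name, result)
```

Under these assumptions both scenarios request the same 2000m in total, so neither has more CPU to work with; the 20-pod layout simply pays the base memory cost ten times over. Whether that trade-off helps or hurts still has to be measured against the real application.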

If you can find some axis along which every request requires a consistent amount of a resource, and adding more of that resource lets you process more concurrent requests, then setting up a HorizontalPodAutoscaler could be a good match. However, the HPA's general assumption is that it can always create more pods up to its configured limit, and that the cluster will expand to hold them (via a cluster autoscaler); that doesn't necessarily fit your scenario of having only a single small machine to work with.
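For reference, the core of the HPA calculation described in the Kubernetes documentation is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the configured minimum and maximum. The snippet below is a minimal sketch of that ratio only; the function name and parameters are hypothetical, and it deliberately ignores the tolerance band, stabilization windows, and the handling of pods that are not ready.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Core HPA ratio: ceil(current * currentMetric / targetMetric), then clamp."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Two pods at 90% of their CPU request, target 50%: the HPA wants 4 pods...
wants = desired_replicas(2, 90, 50, min_replicas=1, max_replicas=20)
# ...but with maxReplicas = 2 it stays at 2.
capped = desired_replicas(2, 90, 50, min_replicas=1, max_replicas=2)
# Twenty pods at 25% of their request, target 50%: scale down to 10.
scaled_down = desired_replicas(20, 25, 50, min_replicas=1, max_replicas=20)

print(wants, capped, scaled_down)
```

Note that for CPU utilization targets the percentage is measured relative to each pod's CPU request, so the same absolute CPU usage reads as a much higher utilization on a 100m pod than on a 1000m pod. On a single fixed-size node, scaling out adds pods but no CPU.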

Upvotes: 2
