Can we scale pods in/out in fixed chunks using HPA in Kubernetes?

Question

I have a web application hosted in EKS, and there is a CPU utilization metric in place for scaling the pods horizontally.

If the current number of pods is 10 and I increase the load (more requests per minute), the desired number of pods depends on how aggressively I increase the load, so it could be 13, 16, etc.

But I want the number of pods to always increase in multiples of 5 and decrease in multiples of 3. Is this possible?

moonkotte · Accepted Answer

I went through the documentation and some of the code, and it looks impossible to force the Horizontal Pod Autoscaler (HPA) to scale up or down by an exact number of pods, since there are no flags or options for it.

The closest you can get is to set up scaleDown and scaleUp policies.

Below is an example (note: the behavior field requires the autoscaling/v2beta2 API version or the later stable autoscaling/v2). This part should be located under spec:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Pods
      value: 3
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Pods
      value: 5
      periodSeconds: 15

What this means:

scaleDown will remove at most 3 pods every 15 seconds.
scaleUp will add at most 5 pods every 15 seconds.
stabilizationWindowSeconds - The stabilization window restricts flapping of the replica count when the metrics used for scaling keep fluctuating. The autoscaling algorithm uses it to take the desired states computed in the past into account before scaling. With the values above, scale-down follows the highest recommendation from the last 300 seconds, while scale-up (window of 0) acts immediately.

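This doesn't guarantee that HPA will scale up or down by exactly the specified number of pods; the values are upper limits per period, not fixed step sizes. For example, if the metric calls for going from 10 to 13 pods, HPA adds 3 pods, not 5. However, if the workload increases or decreases quickly, the result should be close to the behaviour you'd like to see.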
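
To show where the behavior block sits, here is a minimal sketch of a complete manifest. The resource names (web-app), the replica bounds, and the 60% CPU target are assumptions made up for illustration and do not come from the question; the behavior section is the same as above.

```yaml
# Sketch only: names, replica bounds and the CPU target are hypothetical.
apiVersion: autoscaling/v2          # use autoscaling/v2beta2 on older clusters
kind: HorizontalPodAutoscaler
metadata:
  name: web-app                     # hypothetical HPA name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                   # hypothetical Deployment to scale
  minReplicas: 10                   # assumed lower bound
  maxReplicas: 40                   # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60      # assumed CPU target, in percent of requests
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 3                    # remove at most 3 pods per period
        periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 5                    # add at most 5 pods per period
        periodSeconds: 15
```

After applying it with kubectl apply -f, kubectl describe hpa web-app should list the configured behavior and the recent scaling events, which is a quick way to check how large the actual steps were.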

Useful link:

Support for configurable scaling behavior
