Jesse Shieh

Reputation: 4850

How to configure Kubernetes cluster autoscaler to scale down only?

I'd like to run the kubernetes cluster autoscaler so that unneeded nodes will be removed automatically, but I don't want the autoscaler to add nodes automatically. I prefer to handle scaling up myself. Is this possible?

I found maxNodesTotal, but I worry the semantics of setting this to 0 might mean all my nodes will go away. I also found scaleDownEnabled, but no corresponding option for scaling up.

Upvotes: 1

Views: 1204

Answers (2)

dlaidlaw

Reputation: 1803

Since you tagged this question with EKS, I will assume you are on AWS. On AWS, the ASG (Auto Scaling Group) for each NodeGroup has a Max setting that is honoured by the cluster autoscaler. You can set this to prevent scaling above a given number of nodes. If the Min and Max on the ASG are the same value, the autoscaler will never scale up or down. If the Min and Max are different, the autoscaler can scale both up and down between those two values. This is not exactly "never scale up", but it limits the upper end.

If you have multiple NodeGroups (ASGs), then each one can have different Min and Max nodes values.
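
As a rough illustration of how the Min and Max bounds constrain the autoscaler, here is a minimal Python sketch. It is not the autoscaler's actual code; the function name and parameters are hypothetical and only model the clamping behaviour described above.

```python
def clamp_desired_nodes(desired: int, asg_min: int, asg_max: int) -> int:
    """Return the node count allowed by the group's Min/Max bounds.

    Hypothetical helper: models the bounds only, not the real autoscaler.
    """
    if asg_min > asg_max:
        raise ValueError("asg_min must not exceed asg_max")
    return max(asg_min, min(desired, asg_max))


# Min == Max pins the group: no scaling in either direction.
print(clamp_desired_nodes(desired=7, asg_min=3, asg_max=3))  # 3
# Min < Max: scaling down to 2 is allowed, scaling up stops at 5.
print(clamp_desired_nodes(desired=7, asg_min=2, asg_max=5))  # 5
```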

You can also configure the cluster autoscaler itself in different ways. For example, you can set the utilization threshold. If a node's utilization falls under this threshold, the cluster autoscaler considers the node for scale down. See the FAQ.
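
To make the threshold check concrete, here is a small Python sketch. It assumes utilization is measured as requested resources divided by the node's allocatable resources, taking the larger of CPU and memory, and it assumes a threshold of 0.5. The function name and parameters are made up for illustration; check the FAQ for the exact definition used by your autoscaler version.

```python
def is_scale_down_candidate(
    cpu_requested: float,
    cpu_allocatable: float,
    mem_requested: float,
    mem_allocatable: float,
    threshold: float = 0.5,  # assumed value, adjust to your configuration
) -> bool:
    """Hypothetical check: is the node's utilization below the threshold?"""
    cpu_util = cpu_requested / cpu_allocatable
    mem_util = mem_requested / mem_allocatable
    # Assumption: the busier of the two resources decides.
    return max(cpu_util, mem_util) < threshold


# 1 of 4 CPUs and 2 of 16 GiB requested: utilization 0.25, below 0.5.
print(is_scale_down_candidate(1.0, 4.0, 2.0, 16.0))  # True
# 3 of 4 CPUs requested: utilization 0.75, not a candidate.
print(is_scale_down_candidate(3.0, 4.0, 2.0, 16.0))  # False
```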

The FAQ entry above that one may also apply. You can add an annotation to any node that you do not want the cluster autoscaler to consider for scale down. Run: kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled=true or annotate the nodes as they are created. You can do this with entries in your AWS node group setup.

Upvotes: 1

Niron Koren

Reputation: 46

Kubernetes Cluster Autoscaler (CA) attempts to scale up whenever it identifies pending pods that cannot be scheduled because they request more resources (CPU/RAM) than any available node can provide.
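
The scale-up condition can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it only compares a pending pod's CPU and memory requests with the free capacity of each node, and it ignores taints, affinity and the other scheduling rules the real scheduler applies. All names here are hypothetical.

```python
from typing import List, NamedTuple


class Resources(NamedTuple):
    cpu: float      # CPU cores
    mem_gib: float  # memory in GiB


def fits(request: Resources, free: Resources) -> bool:
    """True if the request fits into the node's free capacity."""
    return request.cpu <= free.cpu and request.mem_gib <= free.mem_gib


def would_trigger_scale_up(pod_request: Resources, free_per_node: List[Resources]) -> bool:
    """True if no existing node can host the pending pod's requests."""
    return not any(fits(pod_request, free) for free in free_per_node)


nodes = [Resources(cpu=1.0, mem_gib=8.0), Resources(cpu=4.0, mem_gib=2.0)]
# Needs 2 CPUs and 4 GiB: neither node has both, so the pod stays pending.
print(would_trigger_scale_up(Resources(2.0, 4.0), nodes))  # True
# A small pod fits on the first node, so no scale-up is needed.
print(would_trigger_scale_up(Resources(0.5, 1.0), nodes))  # False
```

Note that the check uses the pod's requests, not what the pod actually consumes at runtime.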

You can use the parameter maxNodesTotal to limit the maximum number of nodes CA is allowed to spin up.

For example, if you don't want your cluster to consist of more than 3 nodes during peak utilization, then you would set maxNodesTotal to 3.

There are different considerations that you should be aware of in terms of cost savings, performance and availability.

I'll list a few related to cost savings and efficient utilization, since I suspect that is the aspect you are most interested in. Size your pods' resource requests in line with their actual utilization, because scale-up is triggered by pod resource requests and not by actual pod resource usage. Also, bigger pods are less likely to fit together on the same node, and CA won't be able to scale down partially utilized nodes, which results in wasted resources.

Upvotes: 1
