Reputation: 765
I have an EKS cluster where different nodes get tainted with DeletionCandidateOfClusterAutoscaler and are later deleted after being tainted with ToBeDeletedByClusterAutoscaler.
When checking the cluster-autoscaler logs, it seems the cluster has triggered a scale-down:
[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.570164)
[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.823009)
[1 scale_down.go:443] Node XXXX is not suitable for removal - memory utilization too big (0.951696)
[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.595449)
[1 scale_down.go:443] Node XXXX - cpu utilization 0.154814
How can I find out what triggers such a scale-down event?
There is nothing in the Auto Scaling group's activity section in AWS, and I haven't changed the desired number of nodes. No scaling policy is set, and the desired and actual node counts match.
Upvotes: 0
Views: 673
Reputation: 1926
From the EKS documentation:
The Kubernetes Cluster Autoscaler automatically adjusts the number of nodes in your cluster when pods fail or are rescheduled onto other nodes.
Basically, when the Cluster Autoscaler detects nodes whose utilization falls below the scale-down threshold and whose pods can be rescheduled onto other nodes, it drains those underused nodes and removes them.
You can customize the threshold with the argument scale-down-utilization-threshold (the default is 0.5),
or even disable scale-down entirely with scale-down-enabled=false. A sketch of where these flags go is shown below.
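For illustration, a minimal sketch of the relevant flags in the cluster-autoscaler Deployment spec. The node group name, image tag, and threshold value here are placeholders, not values from your cluster:

spec:
  containers:
    - name: cluster-autoscaler
      # image tag is an example; pick the release matching your Kubernetes version
      image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
      command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        # example min:max:ASG-name triple, replace with your node group's ASG
        - --nodes=2:10:my-eks-nodegroup-asg
        # nodes below this CPU/memory utilization become scale-down candidates (default 0.5)
        - --scale-down-utilization-threshold=0.7
        # how long a node must stay unneeded before it is removed (default 10m)
        - --scale-down-unneeded-time=10m
        # uncomment to disable scale-down entirely
        # - --scale-down-enabled=false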
Upvotes: 1