Cr4zyTun4

Reputation: 765

EKS: Why autoscaling-down is triggered?

I have an EKS cluster in which nodes are being marked with DeletionCandidateOfClusterAutoscaler and later deleted after being tainted ToBeDeletedByClusterAutoscaler.

Checking the cluster-autoscaler logs, it seems the cluster has received a scale-down input:

[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.570164)

[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.823009)

[1 scale_down.go:443] Node XXXX is not suitable for removal - memory utilization too big (0.951696)

[1 scale_down.go:443] Node XXXX is not suitable for removal - cpu utilization too big (0.595449)

[1 scale_down.go:443] Node XXXX - cpu utilization 0.154814

How can I find out what triggers such a scale-down event?

The events section of the Auto Scaling group in AWS shows nothing, I haven't changed the desired number of nodes, and no scaling policy is set. The desired and the actual state match.

Upvotes: 0

Views: 673

Answers (1)

Marco Caberletti

Reputation: 1926

From the EKS documentation:

The Kubernetes Cluster Autoscaler automatically adjusts the number of nodes in your cluster when pods fail or are rescheduled onto other nodes.

Basically, the Cluster Autoscaler scales down when:

  • it evaluates that one or more nodes are underutilized,
  • there are enough free resources on the other nodes, and
  • the pods on the underutilized nodes can be moved to other nodes (compatibly with the pods' resource requirements).

When all of these hold, it drains the underutilized nodes and removes them.
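To make the "underutilized" check concrete: the autoscaler scores a node from the pods' resource *requests* (not live usage) divided by the node's allocatable capacity, per resource, and compares the result to the scale-down threshold (0.5 by default). A minimal illustrative sketch, with made-up node numbers (the real logic lives in the autoscaler's simulator code):

```python
# Illustrative sketch of the Cluster Autoscaler's utilization score.
# Utilization is computed from pod resource requests, not actual usage:
#   per-resource ratio = sum(pod requests) / node allocatable
# and the node's score is the larger of the CPU and memory ratios.

def node_utilization(cpu_requests_m, cpu_allocatable_m,
                     mem_requests_mi, mem_allocatable_mi):
    """Return the score compared against --scale-down-utilization-threshold."""
    cpu = cpu_requests_m / cpu_allocatable_m
    mem = mem_requests_mi / mem_allocatable_mi
    return max(cpu, mem)

SCALE_DOWN_THRESHOLD = 0.5  # default --scale-down-utilization-threshold

# Hypothetical node with low requests relative to capacity,
# similar to the last log line above:
score = node_utilization(cpu_requests_m=300, cpu_allocatable_m=1930,
                         mem_requests_mi=800, mem_allocatable_mi=7000)
print(f"utilization {score:.6f}")  # prints: utilization 0.155440
print("scale-down candidate:", score < SCALE_DOWN_THRESHOLD)  # True
```

This is why the node logged with `cpu utilization 0.154814` is considered for removal while the ones above 0.5 are reported as "not suitable for removal".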
You can customize the threshold with the --scale-down-utilization-threshold argument (0.5 by default), or disable scale-down entirely with --scale-down-enabled=false.
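For reference, these flags go on the cluster-autoscaler container's command line. A hypothetical excerpt of the Deployment in kube-system (the flag names are real cluster-autoscaler flags; the surrounding values are placeholders for your cluster):

```yaml
spec:
  containers:
    - name: cluster-autoscaler
      command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        # Either raise the threshold so busier nodes survive scale-down...
        - --scale-down-utilization-threshold=0.5
        # ...or disable scale-down entirely (alternative to the line above):
        # - --scale-down-enabled=false
```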

Upvotes: 1

Related Questions