vpram86

Reputation: 6010

GKE autoscaling

I have three node pools in my cluster, each with autoscaling enabled to scale from 1 to 100 nodes (minimum of 1 for all of them). Something weird is happening with autoscaling.

Scale-down works fine for all pools. Scale-up, however, seems to create a new node pool instead of scaling the corresponding existing ones; since that new node pool is missing the labels we need, nothing gets scheduled on it and it eventually gets destroyed.

I suspect I am missing some setting to make it scale the right node pool. Any suggestions on what to look at and what to change? I do not use/have GCE autoscaling.

Upvotes: 0

Views: 872

Answers (2)

kulehandluke

Reputation: 1

This can also happen because you've enabled 'Node auto-provisioning' on the cluster. The name seems to imply that it provisions nodes, but if you read the blurb alongside it, it actually auto-provisions node pools:

Node auto-provisioning automatically manages a set of node pools on the user's behalf. Without node auto-provisioning, GKE considers starting new nodes only from the set of user created node pools. With node auto-provisioning, new node pools can be created and deleted automatically.
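You can confirm whether node auto-provisioning is enabled and, if you only want your own node pools to scale, turn it off. A rough sketch with placeholder names (CLUSTER_NAME, ZONE); the --format field path is my assumption about where the setting lives in the cluster resource:

# Check whether node auto-provisioning is enabled on the cluster
gcloud container clusters describe CLUSTER_NAME --zone=ZONE \
        --format="value(autoscaling.enableNodeAutoprovisioning)"

# Turn it off so only your own node pools are considered for scale-up
gcloud container clusters update CLUSTER_NAME --zone=ZONE \
        --no-enable-autoprovisioning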

What you probably want to do instead is edit each of your node pools and enable the cluster autoscaler, setting the node pool limits and zones you require; a command-line sketch follows.
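For example, the cluster autoscaler can be enabled per node pool; the names and limits below are placeholders for your own values:

# Enable the cluster autoscaler on an existing node pool
gcloud container clusters update CLUSTER_NAME --zone=ZONE \
        --enable-autoscaling --node-pool=NODEPOOL_NAME \
        --min-nodes=1 --max-nodes=100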

Upvotes: 0

Srividya

Reputation: 2323

Without node auto-provisioning, GKE starts new nodes only from user-created node pools. With node auto-provisioning enabled, the cluster autoscaler can also create and delete node pools automatically, managing them on the user's behalf. Since your existing node pools don't carry the labels the pending Pods require, node auto-provisioning is creating new node pools with the required labels.

Node auto-provisioning might create node pools with labels and taints if all of the following conditions are met (a minimal example follows the list):

  • A pending Pod requires a node with a specific label key and value.
  • The Pod has a toleration for a taint with the same key.
  • The toleration is for the NoSchedule effect, NoExecute effect, or all effects.
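As an illustration of these conditions, here is a minimal sketch of a pending Pod that would qualify; the "team=backend" label/taint key-value, the Pod name, and the image are hypothetical values used only for the example:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nap-example
spec:
  nodeSelector:
    team: backend             # requires a node with this specific label
  tolerations:
  - key: team                 # toleration with the same key as the taint
    operator: Equal
    value: backend
    effect: NoSchedule        # NoSchedule effect satisfies the third condition
  containers:
  - name: app
    image: nginx
EOF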

You can update node labels and node taints for the existing node pools by disabling autoscaling on the node pool, applying the changes, and then re-enabling autoscaling (a sketch of the full sequence follows the update command below).

To update node labels for an existing node pool, use the following command:

gcloud beta container node-pools update NODEPOOL_NAME \
        --node-labels=[NODE_LABEL,...] \
        [--cluster=CLUSTER_NAME] [--region=REGION | --zone=ZONE] \
        [GCLOUD_WIDE_FLAG …]
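Putting it together, a rough sequence with placeholder names (adjust the cluster, zone, pool, labels, and limits to your setup):

# 1. Temporarily disable autoscaling on the pool
gcloud container clusters update CLUSTER_NAME --zone=ZONE \
        --no-enable-autoscaling --node-pool=NODEPOOL_NAME

# 2. Apply the labels with the node-pools update command shown above

# 3. Re-enable autoscaling with your limits
gcloud container clusters update CLUSTER_NAME --zone=ZONE \
        --enable-autoscaling --node-pool=NODEPOOL_NAME \
        --min-nodes=1 --max-nodes=100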

Note: The cluster autoscaler is automatically enabled when using node auto-provisioning.

Refer to Node auto-provisioning for more information.

Upvotes: 2
