rodrigo-silveira

Reputation: 13088

Scale Google Cloud Composer cluster down to zero nodes?

I have a Cloud Composer cluster running about a dozen DAGs a day. They all run during a 5-hour period in the middle of the night. The biggest DAG takes ~3 hours to complete on 5 nodes, and the bulk of the work is highly parallelizable (that is, if we scaled up to, say, 15 nodes, it would finish much sooner). To keep costs low (or possibly reduce them) and improve our throughput, I'd like to scale the cluster up while the big DAG is running, then scale it back down for the remaining almost 20 hours of the day when nothing is happening in the cluster. The UI only lets me scale the cluster down to 3 nodes.

My question: Is there a way to completely "shut down" the Cloud Composer cluster for part of the day? Failing that, can I at least bring it down to a single node? Ideally, this would be an automated task.

Upvotes: 2

Views: 1925

Answers (3)

Anders Elton

Reputation: 871

Cloud Composer also has fixed costs that you cannot do anything about:

  • frontend (App Engine flex)
  • database

These costs make up a significant part of the bill for a small Composer cluster.

If you want to scale down to 0, I suggest running Airflow on a VM instead of in a managed Composer environment. After Airflow has completed its run, you can shut down the VM to reduce costs.
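As a rough sketch of the VM approach, assuming Airflow runs on a single Compute Engine instance: the instance name `airflow-vm`, the zone, and the `DRY_RUN` switch below are all hypothetical and only for illustration. By default the helpers just print the `gcloud` command they would run.

```shell
# Sketch only: VM_NAME, VM_ZONE and DRY_RUN are illustrative assumptions.
VM_NAME="${VM_NAME:-airflow-vm}"
VM_ZONE="${VM_ZONE:-us-central1-a}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command instead of running it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop the VM once the nightly DAG runs have finished.
stop_airflow_vm() {
  run gcloud compute instances stop "${VM_NAME}" --zone "${VM_ZONE}"
}

# Start the VM again shortly before the nightly window.
start_airflow_vm() {
  run gcloud compute instances start "${VM_NAME}" --zone "${VM_ZONE}"
}
```

Calling `start_airflow_vm` before the nightly window and `stop_airflow_vm` after the last DAG finishes (for example from a scheduler of your choice, with `DRY_RUN=0`) would keep the VM off for the rest of the day.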

GKE (which runs Composer) cannot scale down to 0 nodes, because it also runs some Kubernetes services that need CPU and RAM.

Other than that, you should check out the link posted by SANN3; that post gives detailed insight into how to achieve autoscaling.

Upvotes: 2

SANN3

Reputation: 10099

The Traveloka team solved the same problem and wrote a detailed article about the process. Note that in the idle case they run 1 node, not zero.

https://medium.com/traveloka-engineering/enabling-autoscaling-in-google-cloud-composer-ac84d3ddd60

Upvotes: 3

irvifa

Reputation: 2083

You can enable autoscaling at the node level:

Workloads > your composer cluster name > enable Autoscaling
# Fill in these four values for your environment.
PROJECT="[provide your gcp project id]"
COMPOSER_NAME="[provide your composer environment name]"
COMPOSER_LOCATION="[provide the selected composer’s location e.g. us-central1]"
CLUSTER_ZONE="[provide the selected composer’s zone e.g. us-central1-a]"

# Look up the GKE cluster behind the Composer environment; grep keeps only
# the last path segment (the cluster name).
GKE_CLUSTER=$(gcloud composer environments describe \
    "${COMPOSER_NAME}" \
    --location "${COMPOSER_LOCATION}" \
    --format="value(config.gkeCluster)" \
    --project "${PROJECT}" | \
    grep -o '[^/]*$')

# Enable node autoscaling on the default pool, between 1 and 10 nodes.
gcloud container clusters update "${GKE_CLUSTER}" --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 10 \
    --zone "${CLUSTER_ZONE}" \
    --node-pool=default-pool \
    --project "${PROJECT}"
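To illustrate the `grep` step in the snippet above: `config.gkeCluster` comes back as a full resource path, and the pipeline keeps only the part after the last slash. The helper name `cluster_short_name` and the sample path below are made up for illustration.

```shell
# Return the last path segment of a GKE cluster resource path.
cluster_short_name() {
  printf '%s\n' "$1" | grep -o '[^/]*$'
}

# Hypothetical value in the shape projects/PROJECT/zones/ZONE/clusters/NAME.
sample_path="projects/my-project/zones/us-central1-a/clusters/us-central1-example-1234abcd-gke"

cluster_short_name "${sample_path}"
# prints: us-central1-example-1234abcd-gke
```

The same result can be had without `grep` by using shell parameter expansion, `"${sample_path##*/}"`.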

At the worker level, you can apply the Kubernetes Horizontal Pod Autoscaler (HPA) to the airflow-worker Deployment in Composer.
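A minimal sketch of what applying an HPA could look like with `kubectl autoscale`. Only the Deployment name `airflow-worker` comes from the answer; the namespace `my-airflow-namespace`, the replica bounds, the 70% CPU target, and the `DRY_RUN` switch are assumptions for illustration. It also assumes `kubectl` already has credentials for the cluster (for example via `gcloud container clusters get-credentials`), and that the worker pods have CPU requests set so a CPU-based HPA has something to measure.

```shell
# Sketch only: the namespace, replica bounds and CPU target are assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command instead of running it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: autoscale_workers NAMESPACE MIN_REPLICAS MAX_REPLICAS CPU_PERCENT
autoscale_workers() {
  run kubectl autoscale deployment airflow-worker \
    --namespace "$1" --min "$2" --max "$3" --cpu-percent "$4"
}

# The Composer namespace differs per environment; one way to find it:
#   kubectl get deployments --all-namespaces | grep airflow-worker
autoscale_workers "my-airflow-namespace" 1 10 70
```

With `DRY_RUN` left at its default, the last line only prints the `kubectl` command so it can be reviewed before running it for real.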

Upvotes: 1
