Ary Jazz

Reputation: 1656

Cloud Composer (Airflow) jobs stuck

My Cloud Composer managed Airflow has been stuck for hours, ever since I canceled a Task Instance that was taking too long (let's call it Task A).

I've cleared all the DAG Runs and task instances, but a few jobs are still running and one job is in the Shutdown state (I suppose it is the job of Task A).

Besides, the scheduler does not seem to be running, since recently deleted DAGs keep appearing in the dashboard.

Is there a way to kill the jobs or reset the scheduler? Any ideas for getting Composer unstuck are welcome.

Upvotes: 7

Views: 8773

Answers (2)

ch_mike

Reputation: 1576

You can restart the scheduler as follows:

From Cloud Shell:

1. Determine your environment's Kubernetes cluster:

gcloud composer environments describe ENVIRONMENT_NAME \
    --location LOCATION 

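2. Get credentials and connect to the Kubernetes cluster (GKE_CLUSTER and GKE_LOCATION are the cluster name and zone shown in the config.gkeCluster field of the step 1 output):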

gcloud container clusters get-credentials ${GKE_CLUSTER} --zone ${GKE_LOCATION}

3. Run the following command to restart the scheduler:

kubectl get deployment airflow-scheduler -o yaml | kubectl replace --force -f -

Steps 1 and 2 are detailed here. Step 3 replaces the "airflow-scheduler" deployment with itself, which restarts the service.
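The three steps above can be strung together in one script. What follows is only a sketch under a few assumptions: the helper names (gke_cluster_name, gke_cluster_zone, restart_composer_scheduler) are made up for illustration, the cluster is zonal (its config.gkeCluster path looks like projects/PROJECT/zones/ZONE/clusters/NAME), and the airflow-scheduler deployment lives in the namespace kubectl uses by default. If your environment keeps it in another namespace, the kubectl commands will need an extra -n NAMESPACE.

```shell
#!/bin/sh
# Hypothetical helpers, not part of gcloud or Composer.

# Print the cluster name from a GKE resource path, e.g.
#   projects/PROJECT/zones/ZONE/clusters/NAME  ->  NAME
gke_cluster_name() {
  printf '%s\n' "${1##*/}"      # keep the text after the last slash
}

# Print the zone from the same path (assumes a zonal cluster).
gke_cluster_zone() (
  rest="${1%/clusters/*}"       # drop the trailing "/clusters/NAME"
  printf '%s\n' "${rest##*/}"   # keep the text after the last remaining slash
)

# Usage: restart_composer_scheduler ENVIRONMENT_NAME LOCATION
restart_composer_scheduler() (
  # Step 1: look up the environment's GKE cluster path.
  cluster_path="$(gcloud composer environments describe "$1" \
      --location "$2" --format='value(config.gkeCluster)')" || exit 1

  # Step 2: fetch credentials for that cluster.
  gcloud container clusters get-credentials \
      "$(gke_cluster_name "$cluster_path")" \
      --zone "$(gke_cluster_zone "$cluster_path")" || exit 1

  # Step 3: replace the scheduler deployment with itself to restart it.
  kubectl get deployment airflow-scheduler -o yaml | kubectl replace --force -f -
)
```

After sourcing the file, calling restart_composer_scheduler ENVIRONMENT_NAME LOCATION runs all three steps. Parsing the path in small separate functions keeps that part checkable without touching a real cluster.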

If restarting the scheduler doesn't help, you may need to recreate your Composer environment, and troubleshoot your DAGs if this happens every time.

Upvotes: 8

Feng Lu

Reputation: 691

Which version of Composer are you running? Jobs getting stuck is a known issue in the beta versions. Composer 1.0.0 and 1.1.0 should not see any stuck jobs (except for tasks in a SubDAG, which is a known Airflow bug). Consider migrating to the latest Composer version.

Upvotes: 0
