alt-f4

Reputation: 2326

What are the steps that I need to take after installing Airflow's Helm chart?

I am still learning about Airflow. My objective is to deploy Airflow on Kubernetes (I am using AWS EKS) with the CeleryExecutor, using Helm charts. To start, I am primarily focused on deploying the example DAGs.

I have installed the chart like:

helm install airflow bitnami/airflow \
    --set airflow.loadExamples=true \
    --set web.baseUrl=http://127.0.0.1:8080 \
    --set auth.username=$AIRFLOW_USER \
    --set auth.password=$AIRFLOW_PASSWORD \
    --set auth.fernetKey=$AIRFLOW_FERNETKEY \
    --set postgresql.postgresqlPassword=$POSTGRESQL_PASSWORD \
    --set redis.password=$REDIS_PASSWORD

The deployment itself succeeded (I used the Bitnami chart, but had the same issue with other charts). Running kubectl get pods gives:

NAME                                                 READY   STATUS    RESTARTS   AGE
bitnami-release-airflow-scheduler-774d647447-j6vpd   1/1     Running   0          4m6s
bitnami-release-airflow-web-5897c99754-hq6nr         1/1     Running   0          4m6s
bitnami-release-airflow-worker-0                     0/1     Running   0          4m6s
bitnami-release-postgresql-0                         1/1     Running   0          4m6s
bitnami-release-redis-master-0                       1/1     Running   0          4m6s
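Note that in the listing above the worker pod reports 0/1 in the READY column while every other pod reports 1/1. A minimal sketch for spotting such pods automatically is below; the helper name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods (name first, READY second).

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the names of pods whose READY column has fewer ready containers than total.
not_ready_pods() {
  awk 'NR > 1 {
    # READY looks like "0/1": ready containers / total containers
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1
  }'
}

# Usage (needs access to the cluster):
#   kubectl get pods | not_ready_pods
```

Any pod it prints can then be inspected with kubectl describe pod and kubectl logs to see why its readiness check is failing.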

I port-forwarded from the web server's pod to my machine. However, when I try to run the example DAGs, the tasks get stuck in the running state. I later found an error in the UI: "The scheduler does not appear to be running. The last heartbeat was received 1 minute ago." This suggests to me that the scheduler is down.

Things I have tried:

  1. Manually running initdb on the webserver via: kubectl exec -it bitnami-release-airflow-web-5897c99754-hq6nr -- bash -c "airflow initdb" (this succeeded but did not solve the problem)
  2. Manually running the scheduler via: kubectl exec -it bitnami-release-airflow-scheduler-774d647447-j6vpd -- bash -c "airflow scheduler" (tasks started getting added to the queue, but then I got: ERROR - Process timed out, PID: 5957)
  3. Trying a different Helm chart altogether (tried the airflow-helm chart but had the same issue with tasks stuck in running state)

What are the steps that I need to take after installing the Helm chart? Am I required to run commands to start services manually (e.g. the scheduler)?

Happy to provide further information if needed.

Update: Adding a snippet from the scheduler's logs:

[2020-12-19 11:23:10,487] {scheduler_job.py:1195} INFO - Sending ('example_bash_operator', 'runme_2', datetime.datetime(2020, 12, 19, 11, 15, 41, 826557, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 3 and queue default
[2020-12-19 11:23:10,487] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_2', '2020-12-19T11:15:41.826557+00:00', '--local', '--pool', 'default_pool', '-sd', '/opt/bitnami/airflow/venv/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-12-19 11:23:12,500] {timeout.py:42} ERROR - Process timed out, PID: 5957

Upvotes: 1

Views: 1136

Answers (1)

alt-f4

Reputation: 2326

I have managed to solve the issue. One of the nodes was running out of memory. I had assumed the problem was in the Airflow configuration, when it was actually a Kubernetes resource issue.
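For anyone hitting the same symptoms, a rough sketch of how node memory could be checked is below. It is not the exact procedure I followed. The helper name high_memory_nodes and the default threshold of 80 are hypothetical, and it assumes the metrics-server is installed so that kubectl top nodes works, with MEMORY% as the fifth column of its output.

```shell
# Hypothetical helper: reads `kubectl top nodes` output on stdin and prints
# nodes whose MEMORY% (assumed to be column 5) is at or above a threshold.
high_memory_nodes() {
  awk -v limit="${1:-80}" 'NR > 1 {
    pct = $5
    sub(/%$/, "", pct)          # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1, $5
  }'
}

# Usage (needs access to the cluster and a running metrics-server):
#   kubectl top nodes | high_memory_nodes 80
```

Running kubectl describe node on a flagged node should also show its MemoryPressure condition and the memory requests of the pods scheduled there.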

Upvotes: 1
