KVSEA

Reputation: 109

How to run airflow scheduler and webserver simultaneously

I'm trying to run Airflow 2 locally with a Postgres db (localhost). I can get the webserver running, but I can't get the scheduler to run at the same time as the webserver. Running airflow scheduler gives:

____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2022-08-27 16:10:50,543] {scheduler_job.py:709} INFO - Starting the scheduler
[2022-08-27 16:10:50,544] {scheduler_job.py:714} INFO - Processing each file at most -1 times
[2022-08-27 16:10:50 -0500] [48113] [INFO] Starting gunicorn 20.1.0
[2022-08-27 16:10:50,546] {executor_loader.py:105} INFO - Loaded executor: SequentialExecutor
[2022-08-27 16:10:50 -0500] [48113] [INFO] Listening at: http://[::]:8793 (48113)
[2022-08-27 16:10:50 -0500] [48113] [INFO] Using worker: sync
[2022-08-27 16:10:50,550] {manager.py:160} INFO - Launched DagFileProcessorManager with pid: 48114
[2022-08-27 16:10:50,552] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:10:50 -0500] [48115] [INFO] Booting worker with pid: 48115
[2022-08-27 16:10:50,556] {settings.py:55} INFO - Configured default timezone Timezone('UTC')
[2022-08-27T16:10:50.567-0500] {manager.py:406} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1.
[2022-08-27 16:10:50 -0500] [48116] [INFO] Booting worker with pid: 48116
[2022-08-27 16:15:50,663] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:20:50,749] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:25:50,834] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:30:50,911] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:35:50,991] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs
[2022-08-27 16:40:51,064] {scheduler_job.py:1231} INFO - Resetting orphaned tasks for active dag runs

I can run the db, scheduler, and webserver using airflow standalone; however, my understanding is that this is really only meant for development, not production, so I want to avoid it. Initializing the database works fine, but when I open the webserver UI it warns that no scheduler is running. I then have to kill the webserver in order to run airflow scheduler from the CLI. As the output above shows, the scheduler never returns control to my terminal unless I kill it, which means I can't get back to the webserver UI. How can I run the scheduler and the webserver simultaneously without killing one process for the other?

Upvotes: 0

Views: 2372

Answers (1)

Hussein Awala

Reputation: 5096

Airflow has multiple core components, such as the webserver and the scheduler, and each one runs in a separate process. When you run airflow standalone, Airflow starts the webserver, the scheduler, and the triggerer (a process that supports deferrable operators) in 3 separate processes (check the source code).

If you want to run them manually, you should run each service in a separate terminal, or run them in the background:

airflow scheduler &
airflow webserver &
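
For example, here is a minimal sketch of running both services in the background while keeping their output and PIDs around so they can be stopped cleanly later. The file names (scheduler.out, scheduler.pid, etc.) and the port are just illustrative choices, not anything Airflow requires:

# start the scheduler in the background, capturing its output
nohup airflow scheduler > scheduler.out 2>&1 &
echo $! > scheduler.pid

# start the webserver in the background on port 8080
nohup airflow webserver --port 8080 > webserver.out 2>&1 &
echo $! > webserver.pid

# later, stop both processes using the recorded PIDs
# kill $(cat scheduler.pid) $(cat webserver.pid)

Both commands also accept a -D/--daemon flag (e.g. airflow scheduler -D) to detach from the terminal themselves, which achieves the same thing without nohup.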

Upvotes: 1
