Reputation: 315
I am trying to wrap my head around how DAGs should actually be executed in docker compose environment, when they are dependent on other service (separate python venv) defined in the compose file.
I have setup airflow via docker compose as mentioned in official documentation. Also, I have added a Django service, which has its own dependencies.
Now, i would like to have a DAG, that executes python script using that Django's service python environment (It also uses Django's models. Not sure if that's relevant).
The only way I see it working is with DockerOperator as described here. I managed to setup and execute the test DAG mentioned there, however when I try to run the real task, it fails due to networking issues. Iam quite confident i can solve that issue, but setting everything this way just seems like way too much hassle.
So, in the end I guess Iam wondering what the ideal architecture should when using Airflow via compose? Should the base airflow image be extended with my Django service (creating one hell of a big image) or is there a better way?
Upvotes: 1
Views: 572
Reputation: 20097
You can use PythonVirtualEnvOperator (https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html#pythonvirtualenvoperator) - but it will recreate django virtualenv every time task is run so not ideal
Another option will be to run DockerOperator or KubernetesPodOperator (if you use Kubernetes) and have separate image with Django installed (or even base Django image).
Adding Django to Airflow is probably not the best idea - Airflow has ~500 dependencies when installed with all providers, so chance is that you will have some difficults-to-resolve conflicts.
Also one of the things we consider for Airlfow 2.2 and beyond is to make better way of handling caching, which could help with building cacheable virtualenv created once and shared between workers/pods (but this is just in discussion phase)
You can check out tomorrow's session on Airflow Summit where we discuss what's coming (and generally Airflow Summit is cool):
https://airflowsummit.org/sessions/2021/looking-ahead-what-comes-after-airflow-2/
Upvotes: 2