Reputation: 702
I created a very simple DAG to execute a Python file using PythonOperator. I'm running Airflow from a Docker image, but it doesn't recognize the module that contains my .py file.
The structure is like this:
main_dag.py
plugins/__init__.py
plugins/njtransit_scrapper.py
plugins/sql_queries.py
plugins/config/config.cfg
Command I use to run the Airflow Docker image:
docker run -p 8080:8080 -v /My/Path/To/Dags:/usr/local/airflow/dags puckel/docker-airflow webserver
I already tried airflow initdb and restarting the web server, but it keeps showing the error ModuleNotFoundError: No module named 'plugins'
For the import statement I'm using:
from plugins import njtransit_scrapper
This is my PythonOperator:
tweets_load = PythonOperator(
    task_id='Tweets_load',
    python_callable=njtransit_scrapper.main,
    dag=dag
)
My njtransit_scrapper.py file just collects all the tweets for a Twitter account and saves the result in a Postgres database.
If I remove the PythonOperator code and the import, the DAG works fine. I have already tested almost everything, but I'm not sure whether this is a bug or something else.
Is it possible that, when I created the volume for the Docker image, Airflow only imports the main DAG file and stops there, so the rest of the package is never imported?
Upvotes: 7
Views: 7259
Reputation: 5254
To help others who might land on this page with this error because of the same mistake I made, I will record it here.
I had an unnecessary __init__.py file in the dags/ folder.
Removing it solved the problem, and allowed all the dags to find their dependency modules.
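In case it helps anyone debugging the same thing, here is a rough sketch of a check you could run inside the container. It only looks for the two things mentioned in this thread: a stray __init__.py directly in the DAGs folder, and whether a package (here plugins, as in the question) can be located when the DAGs folder is used as the import search path. The function name diagnose_dags_folder and the keys of the returned dict are made up for illustration, and it assumes the DAGs folder is the directory your DAG file imports from. It does not reproduce Airflow's own DAG loading.

```python
from importlib import invalidate_caches
from importlib.machinery import PathFinder
from pathlib import Path


def diagnose_dags_folder(dags_dir, package="plugins"):
    """Hypothetical helper: inspect a DAGs folder for import problems.

    Returns a dict with:
      stray_init    - True if dags_dir itself contains an __init__.py
      package_found - True if `package` can be located when searching
                      only dags_dir (nothing is actually imported)
    """
    dags_path = Path(dags_dir)

    # An __init__.py at the top level of the DAGs folder is the file
    # that had to be removed in the answer above.
    stray_init = (dags_path / "__init__.py").is_file()

    # Search only dags_dir, so a package with the same name somewhere
    # else on sys.path cannot give a false positive.
    invalidate_caches()
    spec = PathFinder.find_spec(package, [str(dags_path)])

    return {"stray_init": stray_init, "package_found": spec is not None}
```

For the setup in the question you would call diagnose_dags_folder("/usr/local/airflow/dags") from a Python shell inside the container. If package_found is False, the plugins directory probably did not make it into the mounted volume; if stray_init is True, try removing that file first.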
Upvotes: 4