Reputation: 2605
I am running into a Python path problem when starting an Airflow DAG. I am very new to Airflow.
My directories are laid out as follows, and I have placed an `__init__.py` inside every directory:
ProjectA
|-- code
|   |-- __init__.py
|   |-- module1.py
|
|-- dags
|   |-- __init__.py
|   |-- crawler.py        # contains the BashOperator that runs a Python module
|
|-- jobs
|   |-- __init__.py
|   |-- python_module.py  # calls a function in module1.py (the crawling code) inside the code package
|
|-- logs
|
|-- __init__.py
and other Airflow files
My DAG implementation using the BashOperator is given below.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
default_args = {
    'owner': 'Sam',
    'depends_on_past': False,
    'start_date': datetime(2018, 4, 26),
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2)
}

dag = DAG('dump_aerial_image', default_args=default_args)

t1 = BashOperator(
    task_id='AerialDUMP',
    bash_command='python /Users/sam/App/ProjectA/jobs/python_module.py',
    dag=dag
)
When I run the DAG from the Airflow UI, I get the following error:
ImportError: No module named 'code.module1'; 'code' is not a package
[2018-04-25 18:45:07,699] {base_task_runner.py:98} INFO - Subtask: [2018-04-25 18:45:07,698] {bash_operator.py:105} INFO - Command exited with return code 1
[2018-04-25 18:45:07,707] {models.py:1595} ERROR - Bash command failed
Traceback (most recent call last):
File "/Users/sam/App-Setup/anaconda/envs/anaconda35/lib/python3.5/site-packages/airflow/models.py", line 1493, in _run_raw_task
result = task_copy.execute(context=context)
File "/Users/sam/App-Setup/anaconda/envs/anaconda35/lib/python3.5/site-packages/airflow/operators/bash_operator.py", line 109, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
I am not sure how to debug this error. I even tried adding /Users/sam/App/ProjectA to my Python path, but even that didn't work. My Python path looks like:
['/Users/sam/App/ProjectA/dags', '/Users/sam/App/ProjectA', '/Users/sam/App-Setup/anaconda/envs/anaconda35/lib/python35.zip', '/Users/sam/App-Setup/anaconda/envs/anaconda35/lib...........]
I am not sure how to overcome this situation; any help will be appreciated.
Upvotes: 1
Views: 1640
Reputation: 51
Try running `python /Users/sam/App/ProjectA/jobs/python_module.py` directly from a shell. If you get the same error, this is a Python issue, not an Airflow one.
Are you using the Celery executor? In that case, is the environment variable defined on all the executors?
In any case, you can try adding a `sys.path.append` at the top of python_module.py.
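A minimal sketch of that `sys.path.append` suggestion, assuming the layout from the question (python_module.py sits in jobs/, two levels below nothing and one level below ProjectA); the helper name is hypothetical:

```python
import os
import sys


def add_project_root(script_path):
    """Prepend the project root (the directory two levels above the given
    script, e.g. ProjectA for jobs/python_module.py) to sys.path, so that
    sibling packages like `code` become importable."""
    root = os.path.dirname(os.path.dirname(os.path.abspath(script_path)))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# At the top of python_module.py, before importing from the code package:
# add_project_root(__file__)
# from code import module1
```

The import lines are left commented because they only resolve once the script actually lives inside the ProjectA layout.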
Upvotes: 1
Reputation: 8249
There is a lot going on here. An import error for `code.module1` occurs in the log, yet a `BashOperator` is being used to execute a Python task rather than a `PythonOperator`.
So I'd suggest: switch to a `PythonOperator` and call the imported function directly, unless I'm not seeing the reason why this would need to be run with a `BashOperator`.
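A sketch of what that `PythonOperator` version might look like, assuming the Airflow scheduler's PYTHONPATH includes /Users/sam/App/ProjectA so `jobs.python_module` is importable; the function name `crawl` is hypothetical:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

# Hypothetical function inside jobs/python_module.py that does the crawl;
# importing it here means the import error surfaces at DAG-parse time,
# which is much easier to debug than a failing bash subprocess.
from jobs.python_module import crawl

default_args = {
    'owner': 'Sam',
    'depends_on_past': False,
    'start_date': datetime(2018, 4, 26),
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
}

dag = DAG('dump_aerial_image', default_args=default_args)

t1 = PythonOperator(
    task_id='AerialDUMP',
    python_callable=crawl,  # runs inside the worker's own Python process
    dag=dag,
)
```

Because the callable runs in the worker's interpreter, it inherits whatever sys.path Airflow itself uses, instead of the fresh interpreter a `bash_command='python ...'` spawns.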
Upvotes: 1