Reputation: 983
I have a set of tasks in my DAG builder module that use the PythonOperator, as is common in Airflow. I am deploying Airflow using Docker on Kubernetes.
A task is failing with the error message "No module named pandas". The other tasks using pandas are successful.
Yes, I did enter the worker containers and confirmed that pip3 freeze does list pandas.
[2018-12-13 12:30:23,332] {bash_operator.py:87} INFO - Temporary script location: /tmp/airflowtmppkovwfth/pscript_pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK5ucowyw2
[2018-12-13 12:30:23,333] {bash_operator.py:97} INFO - Running command: python /usr/local/airflow/rootfs/mopng_baseline_v2/scripts/pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK.py /usr/local/airflow/rootfs/mopng_baseline_v2/scheduled__2018-12-12T14:00:00+00:00/appended/DsDnV0TjSHnL0DF53JLjmUtO.csv /usr/local/airflow/rootfs/mopng_baseline_v2/scheduled__2018-12-12T14:00:00+00:00/pcleaned/ztYVV9nkh5t425gYjFqKuAD9.csv
[2018-12-13 12:30:23,344] {bash_operator.py:106} INFO - Output:
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO - Traceback (most recent call last):
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO - File "/usr/local/airflow/rootfs/mopng_baseline_v2/scripts/pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK.py", line 3, in <module>
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO - import pandas as pd
[2018-12-13 12:30:23,360] {bash_operator.py:110} INFO - ImportError: No module named pandas
[2018-12-13 12:30:23,362] {bash_operator.py:114} INFO - Command exited with return code 1
[2018-12-13 12:30:23,383] {models.py:1736} ERROR - Bash command failed
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/airflow/models.py", line 1633, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.5/dist-packages/airflow/operators/bash_operator.py", line 118, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
Upvotes: 2
Views: 923
Reputation: 6861
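The failing operator is not a PythonOperator, it's a BashOperator. The most likely reason is that python in Bash points to a different Python environment from the one running Airflow: the log shows the script being run with plain python, while Airflow itself is installed under /usr/local/lib/python3.5.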
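One way to confirm this from inside a worker container is to ask each interpreter name what it resolves to and whether it can import pandas. The following is only a rough diagnostic sketch using the standard library; the helper name describe_interpreter is made up for illustration and is not part of Airflow.

```python
import shutil
import subprocess


def describe_interpreter(name, module="pandas"):
    """Report where `name` resolves on PATH, its version, and whether it can import `module`.

    Hypothetical helper for illustration; it only shells out to the interpreter.
    """
    path = shutil.which(name)
    if path is None:
        # The name does not resolve to any executable on PATH.
        return {"name": name, "path": None, "version": None, "can_import": False}

    # Ask the interpreter for its own major.minor version.
    version = subprocess.run(
        [path, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    ).stdout.strip()

    # A zero exit code means the import succeeded in that environment.
    probe = subprocess.run(
        [path, "-c", "import " + module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return {
        "name": name,
        "path": path,
        "version": version,
        "can_import": probe.returncode == 0,
    }


if __name__ == "__main__":
    for name in ("python", "python3"):
        print(describe_interpreter(name))
```

If python and python3 report different paths or versions, and only one of them can import pandas, that would be consistent with the explanation above.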
Be sure to specify python3 in your BashOperator command, or use whatever extra configuration you need to invoke Python from the command line in the same environment that your PythonOperator tasks use.
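As a rough sketch of that advice, the command string handed to the BashOperator can name the interpreter explicitly instead of relying on whatever python means in Bash. The helper python_command below is hypothetical, not an Airflow API. Its default of sys.executable assumes the workers run the same image as the process that parses the DAG, so the interpreter path is identical on both.

```python
import shlex
import sys


def python_command(script, *args, interpreter=None):
    """Build a shell command that runs `script` with an explicit interpreter.

    Hypothetical helper: defaults to the interpreter running this process
    (sys.executable), which is an assumption about the deployment.
    """
    interpreter = interpreter or sys.executable
    parts = [interpreter, script] + list(args)
    # Quote every part so paths with spaces survive the shell.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Prints: python3 clean.py in.csv 'out file.csv'
    print(python_command("clean.py", "in.csv", "out file.csv", interpreter="python3"))
```

The resulting string would then be passed as the bash_command argument of the BashOperator in place of a command that starts with plain python.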
Upvotes: 1