aviral sanjay
Reputation: 983

BashOperator raising ImportError for a lib used in other PythonOperators

I have a set of tasks in my DAG builder module that use the PythonOperator, the standard way to run Python callables in Airflow. I am deploying Airflow with Docker on Kubernetes.

One task is failing with the error message: No module named pandas. The other tasks that use pandas succeed.

I have checked inside the worker containers, and pip3 freeze does list pandas.

[2018-12-13 12:30:23,332] {bash_operator.py:87} INFO - Temporary script location: /tmp/airflowtmppkovwfth/pscript_pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK5ucowyw2
[2018-12-13 12:30:23,333] {bash_operator.py:97} INFO - Running command: python /usr/local/airflow/rootfs/mopng_baseline_v2/scripts/pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK.py /usr/local/airflow/rootfs/mopng_baseline_v2/scheduled__2018-12-12T14:00:00+00:00/appended/DsDnV0TjSHnL0DF53JLjmUtO.csv /usr/local/airflow/rootfs/mopng_baseline_v2/scheduled__2018-12-12T14:00:00+00:00/pcleaned/ztYVV9nkh5t425gYjFqKuAD9.csv
[2018-12-13 12:30:23,344] {bash_operator.py:106} INFO - Output:
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO - Traceback (most recent call last):
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO -   File "/usr/local/airflow/rootfs/mopng_baseline_v2/scripts/pclean_zjg4qfamp9pda9jsxysyrqfj_AWFtK.py", line 3, in <module>
[2018-12-13 12:30:23,359] {bash_operator.py:110} INFO -     import pandas as pd
[2018-12-13 12:30:23,360] {bash_operator.py:110} INFO - ImportError: No module named pandas
[2018-12-13 12:30:23,362] {bash_operator.py:114} INFO - Command exited with return code 1
[2018-12-13 12:30:23,383] {models.py:1736} ERROR - Bash command failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/models.py", line 1633, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/bash_operator.py", line 118, in execute
    raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed

Upvotes: 2

Views: 923

Answers (1)

villasv
Reputation: 6861

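The failing operator is not a PythonOperator; it is a BashOperator. The most likely cause is that python in the Bash shell resolves to a different Python environment from the one running Airflow. The log is consistent with this: the task runs the script with plain python, while Airflow itself is installed under /usr/local/lib/python3.5, and pip3 freeze only reports packages for the Python 3 environment.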
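One way to confirm this is to compare the interpreter running Airflow with what the shell finds on PATH. The following is a minimal diagnostic sketch using only the standard library; the function name probe_interpreter and the PROBE snippet are hypothetical helpers, not part of Airflow. It should be run inside the worker container, ideally with the same Python that runs Airflow.

```python
import shutil
import subprocess
import sys

# Small script executed inside each candidate interpreter.
# It is written to be valid on both Python 2 and Python 3.
PROBE = (
    "import sys\n"
    "try:\n"
    "    import pandas\n"
    "    found = 'yes'\n"
    "except ImportError:\n"
    "    found = 'no'\n"
    "print(sys.version.split()[0] + ' pandas=' + found)\n"
)


def probe_interpreter(command):
    """Return (resolved_path, report) for a command name.

    Returns (None, None) when the command is not found on PATH.
    """
    path = shutil.which(command)
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "-c", PROBE],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, result.stdout.strip()


if __name__ == "__main__":
    # Interpreter running this script (use the one that runs Airflow).
    print("current interpreter:", sys.executable)
    # Interpreters a shell command could resolve to.
    for name in ("python", "python3"):
        path, report = probe_interpreter(name)
        print(name, "->", path, "|", report)
```

If python and python3 resolve to different paths, or only one of them reports pandas=yes, the BashOperator command is using the interpreter that lacks pandas.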

Specify python3 explicitly in your BashOperator command, or add whatever extra configuration is needed so that the command line invokes Python from the same environment your PythonOperator tasks run in.
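As an alternative to hard-coding python3, the command string can be built from the interpreter that parses the DAG file. This is a sketch under the assumption that the DAG is parsed and the task is executed in the same image, so sys.executable is a valid path on the worker; python_command is a hypothetical helper, not an Airflow API.

```python
import shlex
import sys


def python_command(script, *args):
    """Build a shell command that runs `script` with the current interpreter.

    Every part is shell-quoted so paths with spaces or special
    characters survive being passed through Bash.
    """
    parts = [sys.executable, script] + list(args)
    return " ".join(shlex.quote(part) for part in parts)
```

The returned string can then be passed as the bash_command argument of the BashOperator in place of a command starting with plain python. If the DAG is parsed in a different environment from the worker, writing python3 (or an absolute interpreter path) directly in the command is the safer choice.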

Upvotes: 1
