Reputation: 720
I am trying to connect to a Hive table using JdbcOperator. My code is below:
import datetime as dt
from datetime import timedelta
import airflow
from airflow.models import DAG
from airflow.operators.jdbc_operator.JdbcOperator import JdbcOperator
args = {
    'owner': 'Airflow',
    'start_date': dt.datetime(2020, 3, 24),
    'retries': 1,
    'retry_delay': dt.timedelta(minutes=5),
}
dag_hive = DAG(dag_id="import_hive", default_args=args, schedule_interval=" 0 * * * *", dagrun_timeout=timedelta(minutes=60))
hql_query = """USE testdb;
CREATE TABLE airflow-test-table LIKE testtable;"""
hive_task = JdbcOperator(sql=hql_query, task_id="hive_script_task", jdbc_conn_id="hive_conn_default", dag=dag_hive)
hive_task
I am getting the error:
ModuleNotFoundError: No module named 'airflow.operators.jdbc_operator.JdbcOperator'; 'airflow.operators.jdbc_operator' is not a package
I have cross-checked the package in the site-packages folder; it is available. I am not able to figure out why I am getting this error.
Upvotes: 0
Views: 904
Reputation: 18884
Install the dependencies for the JDBC operator by running the following command:
pip install 'apache-airflow[jdbc]'
and then import JdbcOperator in your DAG file, as @mk_sta mentioned:
from airflow.operators.jdbc_operator import JdbcOperator
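For reference, here is a minimal sketch of the question's DAG with the import corrected (untested; assumes Airflow 1.10.x, where jdbc_operator is a module rather than a package). Two incidental fixes are also included: the hyphenated table name (hyphens are not valid in unquoted Hive identifiers) and the leading space in the cron expression.

import datetime as dt
from datetime import timedelta

from airflow.models import DAG
# import from the module; the module path itself takes no .JdbcOperator suffix
from airflow.operators.jdbc_operator import JdbcOperator

args = {
    'owner': 'Airflow',
    'start_date': dt.datetime(2020, 3, 24),
    'retries': 1,
    'retry_delay': dt.timedelta(minutes=5),
}

dag_hive = DAG(
    dag_id="import_hive",
    default_args=args,
    schedule_interval="0 * * * *",  # no leading space in the cron expression
    dagrun_timeout=timedelta(minutes=60),
)

# underscores instead of hyphens, which Hive rejects in unquoted table names
hql_query = """USE testdb;
CREATE TABLE airflow_test_table LIKE testtable;"""

hive_task = JdbcOperator(
    sql=hql_query,
    task_id="hive_script_task",
    jdbc_conn_id="hive_conn_default",
    dag=dag_hive,
)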
Upvotes: 2
Reputation: 5253
The correct way to import the JdbcOperator module is the following:
from airflow.operators.jdbc_operator import JdbcOperator
Keep in mind that JdbcOperator also requires the jaydebeapi Python package, which needs to be supplied to the current Airflow environment.
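If you installed the jdbc extra as shown in the other answer, jaydebeapi is pulled in automatically; otherwise it can be installed directly:

pip install jaydebeapi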
Upvotes: 1