user9040429

Reputation: 720

Not able to use JdbcOperator in Airflow

I am trying to connect to a Hive table using JdbcOperator. My code is below:

import datetime as dt
from datetime import timedelta

import airflow
from airflow.models import DAG
from airflow.operators.jdbc_operator.JdbcOperator import JdbcOperator

args = {
    'owner': 'Airflow',
    'start_date': dt.datetime(2020, 3, 24),
    'retries': 1,
    'retry_delay': dt.timedelta(minutes=5),
}

dag_hive = DAG(dag_id="import_hive",default_args=args, schedule_interval= " 0 * * * *",dagrun_timeout=timedelta(minutes=60))
hql_query = """USE testdb;
CREATE TABLE airflow-test-table LIKE testtable;"""
hive_task = JdbcOperator(sql = hql_query, task_id="hive_script_task", jdbc_conn_id="hive_conn_default",dag=dag_hive)

hive_task

I am getting this error:

ModuleNotFoundError: No module named 'airflow.operators.jdbc_operator.JdbcOperator'; 'airflow.operators.jdbc_operator' is not a package

I have cross-checked the package in the site-packages folder; it's available. I can't figure out why I am getting this error.

Upvotes: 0

Views: 904

Answers (2)

kaxil

Reputation: 18884

Install the dependencies for the JDBC operator by running the following command:

pip install 'apache-airflow[jdbc]'

and then import JdbcOperator in your DAG file, as @mk_sta mentioned, like this:

from airflow.operators.jdbc_operator import JdbcOperator
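The original import fails because everything after the last dot in a dotted import path must be a module or package, and `JdbcOperator` is a class inside the `jdbc_operator` module. A minimal sketch of the same failure using only the standard library (`datetime.datetime` is a class, so it reproduces the identical error shape; no Airflow needed):

```python
import importlib

# datetime.datetime is a class, not a module, so importing it as a dotted
# module path raises the same kind of error as the Airflow import in the
# question: "... 'datetime' is not a package".
try:
    importlib.import_module("datetime.datetime")
except ModuleNotFoundError as exc:
    print(exc)  # No module named 'datetime.datetime'; 'datetime' is not a package
```

That is why `from airflow.operators.jdbc_operator import JdbcOperator` (module on the left of `import`, class on the right) works while the dotted form does not.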

Upvotes: 2

Nick_Kh

Reputation: 5253

The correct way to import the JdbcOperator module is the following:

from airflow.operators.jdbc_operator import JdbcOperator

Keep in mind that JdbcOperator also depends on the jaydebeapi Python package, which needs to be installed in the current Airflow environment.
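A quick way to check whether jaydebeapi is present in the environment is to probe for it with `importlib.util.find_spec`, which tests importability without actually importing the package (a small sketch, not Airflow-specific):

```python
import importlib.util

# find_spec returns None if the package cannot be found on sys.path;
# the JDBC operator will fail at runtime in that case.
spec = importlib.util.find_spec("jaydebeapi")
if spec is None:
    print("jaydebeapi is missing: install it with `pip install jaydebeapi`")
else:
    print("jaydebeapi is available")
```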

Upvotes: 1
