Rahul Samant
Rahul Samant

Reputation: 73

How to execute sql Script with multiple statements from Airflow OracleOperator

i am trying to call a sql file with multiple statements separated by ; through the OracleOperator in airflow , but its giving below error with multiple statements

E.g File Containing

CALL DROP_OBJECTS('TABLE_XYZ');

CREATE TABLE TABLE_XYZ AS SELECT 1 Dummy from DUAL;

[2019-06-18 18:19:12,582] {init.py:1580} ERROR - ORA-00933: SQL command not properly ended Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/airflow/models/init.py", line 1441, in _run_raw_task result = task_copy.execute(context=context) File "/usr/local/lib/python3.6/site-packages/airflow/operators/oracle_operator.py", line 63, in execute parameters=self.parameters) File "/usr/local/lib/python3.6/site-packages/airflow/hooks/dbapi_hook.py", line 172, in run cur.execute(s) cx_Oracle.DatabaseError: ORA-00933: SQL command not properly ended

Even with single statement ending with ; giving below error :

e.g file

CREATE TABLE TABLE_XYZ AS SELECT 1 Dummy from DUAL;

[2019-06-18 17:47:53,137] {init.py:1580} ERROR - ORA-00922: missing or invalid option Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/airflow/models/init.py", line 1441, in _run_raw_task result = task_copy.execute(context=context) File "/usr/local/lib/python3.6/site-packages/airflow/operators/oracle_operator.py", line 63, in execute parameters=self.parameters) File "/usr/local/lib/python3.6/site-packages/airflow/hooks/dbapi_hook.py", line 172, in run cur.execute(s)

with DAG('my_simple_dag',
     default_args=default_args,
     template_searchpath=['/root/rahul/'],
     schedule_interval='*/10 * * * *',
     ) as dag:

opr_oracle = OracleOperator(task_id='oracleTest',oracle_conn_id='STG',
                    sql='test.sql')

do i need to pass any additional parameter to make the dbhook understand that the file need to be split in separate statement ?

as per the documentation it expects param sql: the sql code to be executed. Can receive a str representing a sql statement, a list of str (sql statements), or reference to a template file. Template reference are recognized by str ending in '.sql' (templated)

but the .sql template is not working with multiple statement. any help will be greatly appreciated . Thanks !!

Upvotes: 1

Views: 6644

Answers (2)

DaveH
DaveH

Reputation: 31

The Oracle Operator will take a list of SQL Strings that are templated.

What I have done is read the SQL file in as a text file and then split it on the ';' to create a list of strings.

with open('/home/airflow/airflow/dags/sql/test_multi.sql') as sql_file:
    sql_list = list(filter(None, sql_file.read().split(';')))

t_run_sql = OracleOperator(task_id='run_sql', 
                                    sql=sql_list, 
                                    oracle_conn_id='user_id',
                                    autocommit=True,
                                    depends_on_past=True,
                                    dag=dag)

I tested this with templating (and yes this will fail in Oracle without creating the table first):

drop table test_multi;

create table test_multi as
select
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=1),'%Y%m%d') }} as day1, 
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=2),'%Y%m%d') }} as day2,
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=3),'%Y%m%d') }} as day3
from dual;

insert into test_multi
select
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=4),'%Y%m%d') }} as day1, 
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=5),'%Y%m%d') }} as day2,
  {{ macros.datetime.strftime(execution_date.in_tz('Australia/Sydney') + macros.timedelta(days=6),'%Y%m%d') }} as day3
from dual;

This solution has an issue with the need to ensure that your SQL doesn't contain a semi-colon anywhere else. I also think splitting on ';/n' might be better, but it requires that the user always starts a new-line after the ';', so still not ideal.

I also found that I needed to deal with the last semi-colon with the filter(None,...) or the operator would submit an empty command to the database and then error.

Upvotes: 1

Yong Wang
Yong Wang

Reputation: 1313

Instead of sql file, you can assign string format sql statement" Below is original API doc for Oracle operation in airflow. the sql could be str or list of str. If you prefer use file template, you need rendering the file template with parameter. Note: Airflow use jinjia2 as template rending.

oracle_operator API

sql (str or list[str]) – the sql code to be executed. Can receive a str representing a sql statement, a list of str (sql statements), or reference to a template file. Template reference are recognized by str ending in ‘.sql’ (templated)

oracle_conn_id (str) – reference to a specific Oracle database

parameters (mapping or iterable) – (optional) the parameters to render the SQL query with.

autocommit (bool) – if True, each command is automatically committed. (default value: False)

Upvotes: 0

Related Questions