adarsh
adarsh

Reputation: 1503

Is executing query in loop vs in one go at Oracle recommended for substantial performance gain?

I've a Python batch job that executes a SQL query based on supplied ids.
The number of ids is usually above 20k.

The query is:

SELECT MY_ID, COL_A, COL_B FROM SOME_TABLE WHERE ID IN {dynamically_set_ids}

Because Oracle limits the number of values in IN clause to 1000, I updated the query to:

WITH ids_list AS 
(select id1 AS my_id FROM dual
UNION
select id2 AS my_id FROM dual ...) -- all 20k+ ids added like this
SELECT col_a,
       col_b,
       some_table.my_id

FROM   some_table
join   ids_list
ON     ids_list.my_id = some_table.my_id

The above query executes in around 33 mins.
But when I loop the list of ids and execute the query in a loop:

con = get_con()
for i in range(calculated_iterations):
    con.execute("select my_id, col_a, col_b where my_id in {0}".format(1k_or_less_ids))
    # collect data

For current amount of data, the loop above returns expected data in around 40 secs only.

So, the question is: Is it suggested/ok to execute the select in a loop ?

The performance gain doesn't matter much as the job runs overnight. But, for such a gain, would a loop be preferred ?

Note that amount of data cannot be guessed. It usually runs in millions of rows.

Upvotes: 0

Views: 45

Answers (1)

EdStevens
EdStevens

Reputation: 3872

20K 'union select all ..' statements? Regardless of run-time performance, that is coding madness!

The 'textbook' solution is to put those 20k values in a text file, define an external table over it, then

SELECT MY_ID, 
       COL_A, 
       COL_B 
FROM SOME_TABLE WHERE ID IN (select ext_id 
                             from my_external_table)
;

Upvotes: 2

Related Questions