user11638578

Reputation: 35

Bind a list in a SQL query where the list is large

Using https://blogs.oracle.com/oraclemagazine/on-cursors-sql-and-analytics, I created a way to bind an array to a SQL statement, though it might not be the best one.

Initial:

cursor.execute("Select * from table where column in (:id)",{"id":('A','B')})

Following the instructions above, I changed it to:

cursor.execute("""with data as (select trim(substr(txt, instr(txt, ',', 1, level) + 1,
                                    instr(txt, ',', 1, level+1) - instr(txt, ',', 1, level) - 1)) as token
                               from (select ','||:txt||',' txt from dual)
                            connect by level <= length(:txt)-length(replace(:txt,',',''))+1)
               select *
                 from table
                where column in (select * from data)""", {"txt": "A,B"})

Now I have an issue: if the list grows beyond about 1000 elements, the string parameter crosses the 4000-byte mark and the bind no longer works.

How can I make it work?

[I only have "select" privilege and cannot create temporary tables]

Upvotes: 0

Views: 1795

Answers (1)

robertus

Reputation: 356

The whole idea of passing a large list of ids as a single parameter is the wrong approach. If you'd like to do it for "educational" purposes, then OK - do it. For a production environment, it's a blind alley: I can tell from my experience that it can cause serious problems, because you need an enormous amount of memory just to parse the query. I've seen a Django application where the developer queried the Oracle DB for some ids and then sent another query for other data based on those ids, passing the ids as a list - it was a query like yours:

select a,b,c
  from table
  where table.d in (<here was the list of all ids>)
;

The problem was that in some cases that list was really large (about 400 KB of data; I've seen a trace with about 57,000 ids placed in the query). The size of the whole query was 433 KB! It was insane! The most ridiculous part was that parsing took much longer (about 15 minutes!) than executing the query (a few seconds).

In my opinion, passing a huge query like that is a terrible idea, and you should redesign your code.

What you can do:

  1. create a temporary table or just a regular table (for the latter you have to take care of multiple sessions) and place the ids needed for the query there (ask for grants, or for the table to be created, if you cannot do it yourself);
  2. split your task into parts: query for one chunk of ids, then for another, and join the results at the application level (see the sketch after this list);
  3. maybe your "ids" are the result of another query - in that case, place that query in the where clause.
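
Here is a minimal sketch of option 2, assuming cx_Oracle and hypothetical names my_table / column d; it splits the ids into chunks small enough to stay under Oracle's 1000-expression limit for an in list and merges the results in Python:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import cx_Oracle

con = cx_Oracle.connect('login/pass@tns_alias_or_host_port_service_name')

ids = [str(i) for i in range(11000, 13000)]
CHUNK = 1000  # Oracle allows at most 1000 expressions in an in list

cur = con.cursor()
results = []
for start in range(0, len(ids), CHUNK):
    chunk = ids[start:start + CHUNK]
    # one named bind variable per id in this chunk (:id0, :id1, ...)
    placeholders = ','.join(':id{}'.format(n) for n in range(len(chunk)))
    binds = {'id{}'.format(n): v for n, v in enumerate(chunk)}
    cur.execute('select a, b, c from my_table where d in ({})'.format(placeholders), binds)
    results.extend(cur.fetchall())

print(len(results))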

It's hard to tell without knowing more about your end goal.

If you are doing it just for educational purposes, then (for your current query) try using the CLOB type (or, in cx_Oracle, an object representing a CLOB) as storage for the string containing the ids.

EDIT (2019-07-09)

Example (just a simple draft based on samples from cx_Oracle):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import cx_Oracle

db_login = 'login'
db_target = 'tns_alias_or_host_port_service_name'
db_pass = 'pass'

con = cx_Oracle.connect('{0}/{1}@{2}'.format(db_login, db_pass, db_target))

# tokenize the comma-separated CLOB into rows, one token per row
sql = '''select to_char(regexp_substr(:p_clob, '[^,]+', 1, level)) as token
 from dual
 connect by regexp_substr(:p_clob, '[^,]+', 1, level) is not null
'''


# build a comma-separated string of 1000 ids; its length exceeds 4000 bytes
large_string = ",".join(str(i) for i in range(11000, 12000))
print('Size of large_string: {}'.format(len(large_string)))
print('Number of elements in large_string: {}'.format(large_string.count(',') + 1))

cur = con.cursor()
cur.arraysize = 256
# bind the parameter as a CLOB so the 4000-byte limit on VARCHAR2 binds does not apply
cur.setinputsizes(p_clob=cx_Oracle.CLOB)
cur.prepare(sql)
cur.execute(None, {'p_clob': large_string})

# fetch first part of rows (see: cur.arraysize)
result = cur.fetchmany()

print(result)
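
For the original question, the same tokenizing query can be placed in the in clause; a sketch, again assuming hypothetical names my_table / my_column:

sql_in = '''select *
  from my_table
 where my_column in (select to_char(regexp_substr(:p_clob, '[^,]+', 1, level))
                       from dual
                    connect by regexp_substr(:p_clob, '[^,]+', 1, level) is not null)
'''

cur2 = con.cursor()
# the CLOB bind again avoids the 4000-byte limit, and an in-subquery
# is not subject to the 1000-expression limit of a literal in list
cur2.setinputsizes(p_clob=cx_Oracle.CLOB)
cur2.execute(sql_in, {'p_clob': large_string})
print(cur2.fetchall())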

Upvotes: 2
