Reputation: 53
I am running python with the sqlite3 module and would like to search for rows that match a list. The sql should look like
SELECT column FROM table WHERE column IN (1, 2, 3, 4, 5)
In python, we are recommended to use the ?
parameter for value substitution, so this should look like
val_num = [ 1, 2, 3, 4, 5 ]
val_str = ', '.join( [ str(v) for v in val_num ] )
db = sqlite3.connect( filename )
sql = '''SELECT column FROM table WHERE column IN (?)'''
cur = db.cursor()
cur.execute( sql, (val_str,) )
print( cur.fetch_all() )
cur.close()
This returns an empty list, []
However, if I manually substitute values into the sql statement, which is not advised, it works as expected
val_num = [ 1, 2, 3, 4, 5 ]
val_str = ', '.join( [ str(int(v)) for v in val_num ] )
db = sqlite3.connect( filename )
sql = '''SELECT column FROM table WHERE column IN ({})'''.format( val_str )
cur = db.cursor()
cur.execute( sql, (val_str,) )
print( cur.fetch_all() )
cur.close()
Which returns [(1,), (2,), (3,), (4,), (5,)]
How do I execute this statement using the API syntax instead of manually substituting values?
Upvotes: 4
Views: 4136
Reputation: 161
Summary:
sqlite3
does not fetch data from a list directly, but you can use python to prepare the query string. There are several ways to do this. (1) The easiest way would be a direct replacement using f-strings, but then the application would become vulnerable to SQL injection attacks. (2) Using question marks as placeholders, or (3) using named placeholders, are the safe ways to pass queries into sqlite3, but it requires a bit more work. (4) Also, you could create a new SQL table with the list (or lists) you have, and run regular SQL queries on it (the latter is not shown here).
Example using two lists
Let me explain with a toy example. Consider the following SQL table called preferences
, which contains the names of some famous mathematicians.
id | color | number | name |
---|---|---|---|
1 | BLUE | 3 | NEWTON |
2 | RED | 2 | EULER |
3 | YELLOW | 2 | GAUSS |
4 | YELLOW | 2 | RAMANUJAN |
5 | YELLOW | 3 | EUCLID |
Suppose you want to retrieve names of the mathematicians whose favorite color is either blue or yellow, and whose favorite number is either 2 or 1.
Manually, the SQLITE query would be something like this:
SELECT id, name FROM preferences
WHERE number IN (1, 2)
AND color IN ('YELLOW', 'BLUE')
Let us now see different methods to do this.
0. Using placeholders directly won't work
The naïve approach of passing the lists directly as placeholders won't work. In the code below, it raises an exception.
1. Solution using f-strings only (insecure)
Using string formatting, you can pass values as tuples and replace the query directly. This is not recommended if an external user can interact with the database, because the database becomes vulnerable to SQL injection attacks.
2. Solution using f-strings to crate questionmark (?
) placeholders (secure)
This method is quite simple to implement. The drawback is that the query string ends up with many question marks, which makes it difficult to be read by a person.
3. Solution using named placeholders and f-strings (secure)
The idea here is to introduce a function that creates automatic placeholders from a list. Suppose the list is L=[x0, x1, x2]
. Then we would like to name the placeholders as :x_0
, :x_1
and :x_2
,
The function should also return the corresponding dictionary of values
, such as {"x_0": x0, "x_1": x1, "x_2": x2}
.
Using this function, you can now prepare your query string to use named placeholders
python 3.12 code
import sqlite3
from sqlite3 import OperationalError
def run_sqlite_query(query: str, values: dict | None =None, verbose: bool = True) -> tuple | None:
try:
cnn = sqlite3.connect("my_toy_database.db")
cur = cnn.cursor()
if verbose:
print("\nQuery:", query)
print("Values:\n", values)
if values:
cur.execute(query, values)
else:
cur.execute(query)
records = cur.fetchall()
if verbose:
print("\nRecords:\n", records)
return records
except OperationalError as e:
print("\nOperationalError raised.\n")
finally:
cnn.close()
def get_placeholders(L: list[any], basename: str = "x") -> tuple[str, dict]:
"""Helper function to prepare named placedholers and value dict for method 3"""
placeholders = ", ".join([f":{basename}_{i}" for i, x in enumerate(L)])
values = {f"{basename}_{i}": x for i, x in enumerate(L)}
return f"({placeholders})", values
def invalid_method_direct_placeholders():
# This query won't work with lists as expected:
query = """
SELECT id, name FROM preferences
WHERE number IN :desired_numbers
AND color IN :desired_colors
"""
values = {"desired_numbers": tuple(desired_numbers), "desired_colors": tuple(desired_colors)}
records = run_sqlite_query(query, values)
return records
def method1_fstrings_only__insecure():
# This query works but is vulnerable to SQL injection attacks
query = f"""
SELECT id FROM preferences
WHERE number IN {tuple(desired_numbers)}
AND color IN {tuple(desired_colors)}
"""
records = run_sqlite_query(query)
return records
def method2_fstrings_and_question_mark_placeholders():
query = f"""
SELECT id, name FROM preferences
WHERE number IN ({','.join(['?']*len(desired_numbers))})
AND color IN ({','.join(['?']*len(desired_colors))})
"""
values = desired_numbers + desired_colors
records = run_sqlite_query(query, values)
return records
def method3_fstrings_plus_named_placeholders():
# Prepare placeholder for each list
num_placeholders, num_values = get_placeholders(desired_numbers, "number")
color_placeholders, color_values = get_placeholders(desired_colors, "color")
print(f"{num_placeholders = }")
print(f"{num_values = }")
print(f"{color_placeholders = }")
print(f"{color_values = }")
# Merge into a single ditionary with values
values = {**color_values, **num_values}
# Using f-strings to generate a query string:
query = f"""
SELECT id, name FROM preferences
WHERE number IN {num_placeholders}
AND color IN {color_placeholders}
"""
records = run_sqlite_query(query, values)
return records
if __name__ == "__main__":
# Initial lists
desired_numbers = [2, 1]
desired_colors = ['YELLOW', 'BLUE']
sep = "\n" + "*"*30 + "\n"
print(sep, "0. Records from invalid method:", sep)
records = invalid_method_direct_placeholders()
print(sep, "1. Records from method 1:", sep)
records = method1_fstrings_only__insecure()
print(sep, "2. Records from method 2:", sep)
records = method2_fstrings_and_question_mark_placeholders()
print(sep, "3. Records from method 3:", sep)
records = method3_fstrings_plus_named_placeholders()
Here are the results of running the given script:
******************************
0. Records from invalid method:
******************************
Query:
SELECT id, name FROM preferences
WHERE number IN :desired_numbers
AND color IN :desired_colors
Values:
{'desired_numbers': (2, 1), 'desired_colors': ('YELLOW', 'BLUE')}
OperationalError raised.
******************************
1. Records from method 1:
******************************
Query:
SELECT id, name FROM preferences
WHERE number IN (2, 1)
AND color IN ('YELLOW', 'BLUE')
Values:
None
Records:
[(3, 'GAUSS'), (4, 'RAMANUJAN')]
******************************
2. Records from method 2:
******************************
Query:
SELECT id, name FROM preferences
WHERE number IN (?,?)
AND color IN (?,?)
Values:
[2, 1, 'YELLOW', 'BLUE']
Records:
[(3, 'GAUSS'), (4, 'RAMANUJAN')]
******************************
3. Records from method 3:
******************************
num_placeholders = '(:number_0, :number_1)'
num_values = {'number_0': 2, 'number_1': 1}
color_placeholders = '(:color_0, :color_1)'
color_values = {'color_0': 'YELLOW', 'color_1': 'BLUE'}
Query:
SELECT id, name FROM preferences
WHERE number IN (:number_0, :number_1)
AND color IN (:color_0, :color_1)
Values:
{'color_0': 'YELLOW', 'color_1': 'BLUE', 'number_0': 2, 'number_1': 1}
Records:
[(3, 'GAUSS'), (4, 'RAMANUJAN')]
Upvotes: 2
Reputation: 14216
If the goal is to use server-side parameter substitution you will need to build the query to have the exact amount of ?
that you have variables you want to check. The following should achieve that.
val_num = [1, 2, 3, 4, 5]
qs = ", ".join("?" * len(val_num))
query = f"SELECT column FROM table WHERE column IN {qs}"
cur.execute(sql, val_str)
Upvotes: 2
Reputation: 222432
The placeholder mechansim is here to pass single literal values, not lists. So this:
SELECT column FROM table WHERE column IN (?)
... generates a query where all values are stuffed together in the same literal string, like:
SELECT column FROM table WHERE column IN ('1, 2, 3, 4, 5')
The where
predicate is equivalent to: column = '1, 2, 3, 4, 5'
, which obviously is not what you want.
You need have one placeholder (?
) per value in the list. Here is one way to do it:
sql = 'SELECT column FROM table WHERE column IN ({0})'.format(', '.join('?' for _ in val_num));
cur.execute(sql, val_num);
Upvotes: 2
Reputation: 1168
Try to use this approach to make the IN
operation :
val_num = [ 1, 2, 3, 4, 5 ]
query_str = '''
SELECT column FROM table
where
column in (
''' + ','.join('%s' for i in range(len(val_num))) + ' ) '
cursor.execute(query_str, params=tuple(val_num))
rows = cursor.fetchall()
This is the pseudo-code , not given the statements to make cursor
objects n all. just try to change your query-string
& the execute
line seeing this example - you will get it working.
Upvotes: 0