ScottP
ScottP

Reputation: 53

Searching IN list in python + sqlite3

I am running python with the sqlite3 module and would like to search for rows that match a list. The sql should look like

SELECT column FROM table WHERE column IN (1, 2, 3, 4, 5)

In python, we are recommended to use the ? parameter for value substitution, so this should look like

val_num = [ 1, 2, 3, 4, 5 ]
val_str = ', '.join( [ str(v) for v in val_num ] )

db = sqlite3.connect( filename )   
sql = '''SELECT column FROM table WHERE column IN (?)'''
cur = db.cursor()
cur.execute( sql, (val_str,) )
print( cur.fetch_all() )
cur.close()

This returns an empty list, []

However, if I manually substitute values into the sql statement, which is not advised, it works as expected

val_num = [ 1, 2, 3, 4, 5 ]
val_str = ', '.join( [ str(int(v)) for v in val_num ] )

db = sqlite3.connect( filename )   
sql = '''SELECT column FROM table WHERE column IN ({})'''.format( val_str )
cur = db.cursor()
cur.execute( sql, (val_str,) )
print( cur.fetch_all() )
cur.close()

Which returns [(1,), (2,), (3,), (4,), (5,)]

How do I execute this statement using the API syntax instead of manually substituting values?

Upvotes: 4

Views: 4136

Answers (4)

mikeliux
mikeliux

Reputation: 161

Summary: sqlite3 does not fetch data from a list directly, but you can use python to prepare the query string. There are several ways to do this. (1) The easiest way would be a direct replacement using f-strings, but then the application would become vulnerable to SQL injection attacks. (2) Using question marks as placeholders, or (3) using named placeholders, are the safe ways to pass queries into sqlite3, but it requires a bit more work. (4) Also, you could create a new SQL table with the list (or lists) you have, and run regular SQL queries on it (the latter is not shown here).

Example using two lists

Let me explain with a toy example. Consider the following SQL table called preferences, which contains the names of some famous mathematicians.

id color number name
1 BLUE 3 NEWTON
2 RED 2 EULER
3 YELLOW 2 GAUSS
4 YELLOW 2 RAMANUJAN
5 YELLOW 3 EUCLID

Suppose you want to retrieve names of the mathematicians whose favorite color is either blue or yellow, and whose favorite number is either 2 or 1.

Manually, the SQLITE query would be something like this:

SELECT id, name FROM preferences
WHERE number IN (1, 2)
AND color IN ('YELLOW', 'BLUE')

Let us now see different methods to do this.

0. Using placeholders directly won't work

The naïve approach of passing the lists directly as placeholders won't work. In the code below, it raises an exception.

1. Solution using f-strings only (insecure)

Using string formatting, you can pass values as tuples and replace the query directly. This is not recommended if an external user can interact with the database, because the database becomes vulnerable to SQL injection attacks.

2. Solution using f-strings to crate questionmark (?) placeholders (secure) This method is quite simple to implement. The drawback is that the query string ends up with many question marks, which makes it difficult to be read by a person.

3. Solution using named placeholders and f-strings (secure) The idea here is to introduce a function that creates automatic placeholders from a list. Suppose the list is L=[x0, x1, x2]. Then we would like to name the placeholders as :x_0, :x_1 and :x_2,

The function should also return the corresponding dictionary of values, such as {"x_0": x0, "x_1": x1, "x_2": x2}.

Using this function, you can now prepare your query string to use named placeholders

python 3.12 code

import sqlite3
from sqlite3 import OperationalError
def run_sqlite_query(query: str, values: dict | None =None, verbose: bool = True) -> tuple | None:
    try:
        cnn = sqlite3.connect("my_toy_database.db")
        cur = cnn.cursor()
        if verbose:
            print("\nQuery:", query)
            print("Values:\n", values)
        if values:
            cur.execute(query, values)
        else:
            cur.execute(query)
        records = cur.fetchall()

        if verbose:
            print("\nRecords:\n", records)

        return records
    except OperationalError as e:
        print("\nOperationalError raised.\n")
    finally:
        cnn.close()

def get_placeholders(L: list[any], basename: str = "x") -> tuple[str, dict]:
    """Helper function to prepare named placedholers and value dict for method 3"""
    placeholders = ", ".join([f":{basename}_{i}" for i, x in enumerate(L)])
    values = {f"{basename}_{i}": x for i, x in enumerate(L)}
    return f"({placeholders})", values

def invalid_method_direct_placeholders():
    # This query won't work with lists as expected:
    query = """
        SELECT id, name FROM preferences
        WHERE number IN :desired_numbers
        AND color IN :desired_colors
        """
    values = {"desired_numbers": tuple(desired_numbers), "desired_colors": tuple(desired_colors)}
    records = run_sqlite_query(query, values)
    return records

def method1_fstrings_only__insecure():
    # This query works but is vulnerable to SQL injection attacks
    query = f"""
        SELECT id FROM preferences
        WHERE number IN {tuple(desired_numbers)}
        AND color IN {tuple(desired_colors)}
        """

    records = run_sqlite_query(query)
    return records


def method2_fstrings_and_question_mark_placeholders():
    query = f"""
        SELECT id, name FROM preferences
        WHERE number IN ({','.join(['?']*len(desired_numbers))})
        AND color IN ({','.join(['?']*len(desired_colors))})
        """
    values = desired_numbers + desired_colors
    records = run_sqlite_query(query, values)
    return records

def method3_fstrings_plus_named_placeholders():

    # Prepare placeholder for each list
    num_placeholders, num_values = get_placeholders(desired_numbers, "number")
    color_placeholders, color_values = get_placeholders(desired_colors, "color")

    print(f"{num_placeholders = }")
    print(f"{num_values = }")
    print(f"{color_placeholders = }")
    print(f"{color_values = }")

    # Merge into a single ditionary with values
    values = {**color_values, **num_values}

    # Using f-strings to generate a query string:
    query = f"""
        SELECT id, name FROM preferences
        WHERE number IN {num_placeholders}
        AND color IN {color_placeholders}
    """

    records = run_sqlite_query(query, values)
    return records


if __name__ == "__main__":
    # Initial lists
    desired_numbers = [2, 1]
    desired_colors = ['YELLOW', 'BLUE']

    sep = "\n" + "*"*30 + "\n"

    print(sep, "0. Records from invalid method:", sep)
    records = invalid_method_direct_placeholders()

    print(sep, "1. Records from method 1:", sep)
    records = method1_fstrings_only__insecure()

    print(sep, "2. Records from method 2:", sep)
    records = method2_fstrings_and_question_mark_placeholders()

    print(sep, "3. Records from method 3:", sep)
    records = method3_fstrings_plus_named_placeholders()

Here are the results of running the given script:

******************************
 0. Records from invalid method: 
******************************


Query: 
        SELECT id, name FROM preferences
        WHERE number IN :desired_numbers
        AND color IN :desired_colors
        
Values:
 {'desired_numbers': (2, 1), 'desired_colors': ('YELLOW', 'BLUE')}

OperationalError raised.


******************************
 1. Records from method 1: 
******************************


Query: 
        SELECT id, name FROM preferences
        WHERE number IN (2, 1)
        AND color IN ('YELLOW', 'BLUE')
        
Values:
 None

Records:
 [(3, 'GAUSS'), (4, 'RAMANUJAN')]

******************************
 2. Records from method 2: 
******************************


Query: 
        SELECT id, name FROM preferences
        WHERE number IN (?,?)
        AND color IN (?,?)
        
Values:
 [2, 1, 'YELLOW', 'BLUE']

Records:
 [(3, 'GAUSS'), (4, 'RAMANUJAN')]

******************************
 3. Records from method 3: 
******************************

num_placeholders = '(:number_0, :number_1)'
num_values = {'number_0': 2, 'number_1': 1}
color_placeholders = '(:color_0, :color_1)'
color_values = {'color_0': 'YELLOW', 'color_1': 'BLUE'}

Query: 
        SELECT id, name FROM preferences
        WHERE number IN (:number_0, :number_1)
        AND color IN (:color_0, :color_1)
    
Values:
 {'color_0': 'YELLOW', 'color_1': 'BLUE', 'number_0': 2, 'number_1': 1}

Records:
 [(3, 'GAUSS'), (4, 'RAMANUJAN')]

Upvotes: 2

gold_cy
gold_cy

Reputation: 14216

If the goal is to use server-side parameter substitution you will need to build the query to have the exact amount of ? that you have variables you want to check. The following should achieve that.

val_num = [1, 2, 3, 4, 5]
qs = ", ".join("?" * len(val_num))

query = f"SELECT column FROM table WHERE column IN {qs}"
cur.execute(sql, val_str)

Upvotes: 2

GMB
GMB

Reputation: 222432

The placeholder mechansim is here to pass single literal values, not lists. So this:

SELECT column FROM table WHERE column IN (?)

... generates a query where all values are stuffed together in the same literal string, like:

SELECT column FROM table WHERE column IN ('1, 2, 3, 4, 5')

The where predicate is equivalent to: column = '1, 2, 3, 4, 5', which obviously is not what you want.

You need have one placeholder (?) per value in the list. Here is one way to do it:

sql = 'SELECT column FROM table WHERE column IN ({0})'.format(', '.join('?' for _ in val_num));
cur.execute(sql, val_num);

Upvotes: 2

Sowjanya R Bhat
Sowjanya R Bhat

Reputation: 1168

Try to use this approach to make the IN operation :

val_num = [ 1, 2, 3, 4, 5 ]

query_str = ''' 
SELECT column FROM table
where
column in  (
''' + ','.join('%s' for i in range(len(val_num))) + ' ) '

cursor.execute(query_str, params=tuple(val_num))
rows = cursor.fetchall()

This is the pseudo-code , not given the statements to make cursor objects n all. just try to change your query-string & the execute line seeing this example - you will get it working.

Upvotes: 0

Related Questions