Ojingo
Ojingo

Reputation: 232

Pandas read_sql with where clause using "in"

I need to query a table with an "in" clause, where the SQL looks like this:

select * from some_table where some_field in (?)

I originally took a naive approach and tried this:

in_items = [1,2,3,4]
df = pd.read_sql(MY_SQL_STATEMENT, con=con, params=[in_items]

Which did not work, it throws the following error:

The SQL contains 1 parameter markers, but 4 parameters were supplied

Where I'm stuck at, is figuring out how to pass a list of items as a single parameter.

I can do a string concatenation approach, something like:

MY_SQL = 'select * from tableA where fieldA in ({})'.format(
  ','.join([str(x) from x in list_items]))
df = pd.read_sql(MY_SQL, con=con)

I would rather avoid this approach if possible. Does anybody know of a way to pass a list of values as a single parameter?

I'm also open to a possibly more cleverer way to do this. :)

Upvotes: 6

Views: 11839

Answers (3)

Gord Thompson
Gord Thompson

Reputation: 123629

Now that pandas is more insistent on using a SQLAlchemy Connectable you may want to consider using a SQLAlchemy Core query:

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")

# set up example
with engine.begin() as conn:
    conn.exec_driver_sql("DROP TABLE IF EXISTS some_table")
    conn.exec_driver_sql("CREATE TABLE some_table (some_field int)")
    conn.exec_driver_sql(
        "INSERT INTO some_table (some_field) VALUES (1), (2), (3)"
    )

some_table = sa.Table("some_table", sa.MetaData(), autoload_with=engine)

in_items = [1, 3]
qry = sa.select(some_table).where(some_table.c.some_field.in_(in_items))

engine.echo = True
df = pd.read_sql_query(qry, engine)
"""SQL emitted:
SELECT some_table.some_field 
FROM some_table 
WHERE some_table.some_field IN (?, ?)
[generated in 0.00065s] (1, 3)
"""

Upvotes: 0

For me, using sqllite3, worked this way:

list_of_entries_to_retrive = pd.read_excel('.table_with_entries.xlsx')
list_of_entries_to_retrive = (cell_list['entries']).tolist()

conn = sqlite3.connect('DataBase.db')

queryString = 'SELECT * FROM table WHERE attribute IN (\'{}\');'.format('\',\''.join([_ for _ in list_of_entries_to_retrive]))
df = pd.read_sql(queryString, con=conn)

Do not worked this way:

df = pd.read_sql(queryString, con=conn,  params=[list_of_entries_to_retrive]))

Thanks

Upvotes: -1

Parfait
Parfait

Reputation: 107652

Simply string format the placeholders then pass in your params into pandas.read_sql. Do note, placeholder markers depend on DB-API: pyodbc/sqlite3 uses qmarks ? and most others use %s. Below assumes the former marker:

in_items = [1,2,3,4]
MY_SQL = 'select * from tableA where fieldA in ({})'\
           .format(', '.join(['?' for _ in in_items]))
# select * from tableA where fieldA in (?, ?, ?, ?)

df = pd.read_sql(MY_SQL, con=con, params=[in_items])

Upvotes: 8

Related Questions