Reputation: 61
Im having trouble with a postgresql query using SQLAlchemy.
I created some large tables using this line of code:
frame.to_sql('Table1', con=engine, method='multi', if_exists='append')
It worked fine. Now, when I want to query data out of it, my first problem is that I have to use quotation marks for each table and column name and I dont really know why, maybe somebody can help me out there.
That is not my main problem though. My main problem is, that when querying the data, all numerical WHERE conditions work fine, but not the ones with Strings in the column data. I get an error that the column does not exist. Im using:
df = pd.read_sql_query('SELECT "variable1", "variable2" FROM "Table1" WHERE "variable1" = 123 AND "variable2" = "abc" ', engine)
I think it might be a problem that I use "abc" instead of 'abc', but I cant change it because of the ' signs in the argument of the query. If I change those ' to " then the Column names and Table names are not detected correctly (because of the problem before that they have to be in quotation marks).
This is the error message:
ProgrammingError: (psycopg2.errors.UndefinedColumn) ERROR: COLUMN »abc« does not exist
LINE 1: ...er" FROM "Table1" WHERE "variable2" = "abc"
And there is an arrow pointing to the first quotation mark of the "abc".
Im new to SQL and I would really appreciate if someone could point me in the right direction.
Upvotes: 2
Views: 3478
Reputation: 123739
"Most" SQL dialects (notable exceptions being MS SQL Server and MS Access) strictly differentiate between
WHERE thing = 'foo'
WHERE "some col" = 123
PostgreSQL throws in the added wrinkle that table/column names are forced to lower case if they are not (double-)quoted and then uses case-sensitive matching, so if your table is named Table1
then
SELECT * FROM Table1
will fail because PostgreSQL will look for table1
, butSELECT * FROM "Table1"
will succeed.The way to avoid confusion in your query is to use query parameters instead of string literals:
# set up test environment
with engine.begin() as conn:
conn.exec_driver_sql('DROP TABLE IF EXISTS "Table1"')
conn.exec_driver_sql('CREATE TABLE "Table1" (variable1 int, variable2 varchar(50))')
df1 = pd.DataFrame([(123, "abc"), (456, "def")], columns=["variable1", "variable2"])
df1.to_sql("Table1", engine, index=False, if_exists="append")
# test .read_sql_query() with parameters
import sqlalchemy as sa
sql = sa.text('SELECT * FROM "Table1" WHERE variable1 = :v1 AND variable2 = :v2')
param_dict = {"v1": 123, "v2": "abc"}
df2 = pd.read_sql_query(sql, engine, params=param_dict)
print(df2)
"""
variable1 variable2
0 123 abc
"""
Upvotes: 3
Reputation: 19643
It should be: AND "variable2" = 'abc'
.
You cannot quote strings/literals with "
, as PostgreSQL will interpret it as a database object. Btw. you do not need to wrap table names and and columns with double quotes unless it is extremely necessary, e.g. case sensitive object names, names containing spaces, etc. Imho it is a bad practice and on the long run only leads to confusion. So your query could be perfectly written as follows:
SELECT variable1, variable2
FROM table1
WHERE variable1 = 123 AND variable2 = 'abc';
Keep in mind that it also applies for other objects, like tables or indexes.
CREATE TABLE Table1 (id int)
- nice.CREATE TABLE "Table1" (id int)
- not nice.CREATE TABLE "Table1" ("id" int)
- definitely not nice ;)In case you want to remove the unnecessary double quotes from your table name:
ALTER TABLE "Table1" RENAME TO table1;
Demo: db<>fiddle
Upvotes: 1