Reputation: 53
The SQLAlchemy .filter() function doesn't seem to be able to work with numpy datatypes. If I use a np.int32 in the filter argument the desired result is not achieved. Instead, I need to cast my np.int32 to int to make a query work as expected.
In the following example I query values from the database, do some selection via numpy and query again with the reduced selection (I'm aware that this is not a meaningful operation, it's just to demonstrate the issue)
import numpy as np
from sqlalchemy import Column, Integer, Float, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
# ORM
Base = declarative_base()
class Example(Base):
__tablename__ = 'Example'
UID = Column(Integer, primary_key=True, autoincrement=True)
Value = Column(Float)
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
engine = create_engine('sqlite:///example.db', echo=False)
Session = sessionmaker(bind=engine)
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
value = [1.1, 1.2, 2.1, 2.2]
session = Session()
for vi in value:
example = Example(Value=vi)
session.add(example)
session.commit()
res = session.query(Example.UID, Example.Value).all()
uids = []
vals = []
for resi in res:
uids.append(resi[0])
vals.append(resi[1])
uids = np.array(uids) # creates a numpy array with elements of type np.int32
vals = np.array(vals)
idxsel = vals > 2
uids_sel = uids[idxsel]
for uidi in uids_sel:
res = session.query(Example.Value).filter(Example.UID==uidi).all() # returns empty list
print('without cast: {}'.format(res))
res = session.query(Example.Value).filter(Example.UID==int(uidi)).all()
print('with cast: {}'.format(res))
The result being
without cast: []
with cast: [(2.1,)]
without cast: []
with cast: [(2.2,)]
I'm not sure, why the numpy datatype doesn't work. I never had any compatibility issues with numpy before. For me this is a rather serious issue. Of course you can cast, but since there is no warning or error this is really prone to serious trouble (accidentally missing cast) as an empty list is of course a valid response if your query doesn't match...
Since both numpy and sqlalchemy are rather widespread it would be great if they were compatible...
I really wonder that I found no mention of this problem anywhere via google. So maybe I'm missing something or doing something wrong. Any help is appreciated.
EDIT: The issue is with the sqlite3 API not with sqlalchemy. With the help of sqlalchemy we can however use TypeDecorators to work around the issue: https://github.com/sqlalchemy/sqlalchemy/issues/3586
EDIT: And here's a stackoverflow on how to deal with it without using sqlalchemy inserting numpy integer types into sqlite with python3
Upvotes: 1
Views: 1832
Reputation: 1676
I spent a few hours running into the same underlying issue from a different angle (the where()
clause). Although the above question was eventually edited with a mention of the TypeDecorator
class as a solution, I wanted to implement that solution here (using sqlalchemy's Core API instead of ORM). Hopefully the expanded verbiage on the page improves the search result for this issue.
from pathlib import Path
import pandas as pd
from sqlalchemy import (
create_engine,
MetaData,
Table,
Column,
Integer,
String,
select,
insert,
)
eng = create_engine("sqlite:///"+str(Path().resolve() / 'temp.db'))
md = MetaData()
tbl = Table("users", md, Column("id", Integer), Column("name", String))
md.create_all(eng)
eng.execute(insert(tbl, (1,"Jack")))
stmt1 = select(tbl).where(tbl.c.id == 1)
print(eng.execute(stmt1).fetchall())
Returns as expected:
[(1, 'Jack')]
However:
target_id = pd.read_sql(stmt1, eng).loc[:,'id'].max()
stmt2 = select(tbl).where(tbl.c.id == target_id)
print(eng.execute(stmt2).fetchall())
returns:
[]
The selection fails to behave as expected because pandas.read_sql
stores data as numpy types.
TypeDecorator
solutionWe can fix this with TypeDecorator
import sqlalchemy.types as types
class IntegerLike(types.TypeDecorator):
impl = types.Integer
cache_ok = True
def process_bind_param(self, value, dialect):
return int(value)
If we delete the temp.db
we just created above and run the code with :
tbl = Table("users", md, Column("id", IntegerLike), Column("name", String))
We can tell a little bit of what is going wrong by
print(stmt1)
SELECT users.id, users.name
FROM users
WHERE users.id = :id_1
The :id_1
is a bind parameter, and these are handled by the underlying DBAPI. (I don't know how to actually force the DBAPI to render the statement fully, so I'd welcome any comments or edits that provide a way)
Upvotes: 1