Reputation: 29934
I have 2 tables; we'll call them table1
and table2
. table2
has a foreign key to table1
. I need to delete the rows in table1
that have zero child records in table2
. The SQL to do this is pretty straightforward:
DELETE FROM table1
WHERE 0 = (SELECT COUNT(*) FROM table2 WHERE table2.table1_id = table1.table1_id);
However, I haven't been able to find a way to translate this query to SQLAlchemy. Trying the straightforward approach:
subquery = session.query(sqlfunc.count(Table2).label('t2_count')).select_from(Table2).filter(Table2.table1_id == Table1.table1_id).subquery()
session.query(Table1).filter(0 == subquery.columns.t2_count).delete()
Just yielded an error:
sqlalchemy.exc.ArgumentError: Only deletion via a single table query is currently supported
How can I perform this DELETE
with SQLAlchemy?
Upvotes: 3
Views: 5549
Reputation: 9079
I'm pretty sure this is what you want. You should try it out though. It uses EXISTS.
from sqlalchemy.sql import not_
# This fetches rows in python to determine which ones were removed.
Session.query(Table1).filter(not_(Table1.table2s.any())).delete(
synchronize_session='fetch')
# If you will not be referencing more Table1 objects in this session then you
# can just ignore syncing the session.
Session.query(Table1).filter(not_(Table1.table2s.any())).delete(
synchronize_session=False)
Explanation of the argument for delete():
http://docs.sqlalchemy.org/en/rel_0_8/orm/query.html#sqlalchemy.orm.query.Query.delete
Example with exists(using any() above uses EXISTS):
http://docs.sqlalchemy.org/en/rel_0_8/orm/tutorial.html#using-exists
Here is the SQL that should be generated:
DELETE FROM table1 WHERE NOT (EXISTS (SELECT 1
FROM table2
WHERE table1.id = table2.table1_id))
If you are using declarative I think there is a way to access Table2.table and then you could just use the sql layer of sqlalchemy to do exactly what you want. Although you run into the same issue of making your Session out of sync.
Upvotes: 3
Reputation: 29934
Well, I found one very ugly way to do it. You can do a select with a join to get the rows loaded into memory, then you can delete them individually:
subquery = session.query(Table2.table1_id
,sqlalchemy.func.count(Table2.table2_id).label('t1count')
) \
.select_from(Table2) \
.group_by(Table2.table1_id) \
.subquery()
rows = session.query(Table1) \
.select_from(Table1) \
.outerjoin(subquery, Table1.table1_id == subquery.c.table1_id) \
.filter(subquery.c.t1count == None) \
.all()
for r in rows:
session.delete(r)
This is not only nasty to write, it's also pretty nasty performance-wise. For starters, you have to bring the table1
rows into memory. Second, if you were like me and had a line like this on Table2
's class definition:
table1 = orm.relationship(Table1, backref=orm.backref('table2s'))
then SQLAlchemy will actually perform a query to pull the related table2
rows into memory, too (even though there aren't any). Even worse, because you have to loop over the list (I tried just passing in the list; didn't work), it does so one table1
row at a time. So if you're deleting 10 rows, it's 21 individual queries (1 for the initial select, 1 for each relationship pull, and 1 for each delete). Maybe there are ways to mitigate that; I would have to go through the documentation to see. All this for things I don't even want in my database, much less in memory.
I won't mark this as the answer. I want a cleaner, more efficient way of doing this, but this is all I have for now.
Upvotes: 1