Larytet
Larytet

Reputation: 646

Filter by results of select

I am trying to translate the following query to peewee:

select count(*) from A where
id not in (select distinct package_id FROM B) 

What is the correct Python code? So far I have this:

A.select(A.id).where(A.id.not_in(B.select(B.package_id).distinct()).count()

This code is not returning the same result. A and B are large 10-20M rows each. I can't create a dictionary of existing package_id items in the memory.

For example, this takes lot of time:

A.select(A.id).where(A.id.not_in({x.package_id for x in B.select(B.package_id).distinct()}).count()

May be LEFT JOIN?

Update: I ended up calling database.execute_sql()

Upvotes: 0

Views: 149

Answers (1)

coleifer
coleifer

Reputation: 26245

Your SQL:

select count(*) from A where
id not in (select distinct package_id FROM B) 

Equivalent peewee:

q = (A
     .select(fn.COUNT(A.id))
     .where(A.id.not_in(B.select(B.package_id.distinct()))))
count = q.scalar()

Upvotes: 1

Related Questions