Reputation: 608
Having a table
with columns listing_ids
, keys
how to get listing_ids
(and corresponding missing keys) where values from list_of_values
do not exist in keys
?
list_of_values = [key2,key3]
table
listing_id keys
424 key1
424 key2
424 key3
523 key1
2433 key2
2433 key1
53 key2
3 key3
I need to get the following a result:
listing_id keys_that_does_not_exist
523 key2
523 key3
2433 key3
53 key3
3 key2
I tried:
ids_without_keys_q = session.query(Table)\
.filter(~ exists().where(Table.key.in_(list_of_values))
I am using postgresql
Upvotes: 3
Views: 922
Reputation: 13129
The question is how many return values you expect. PostgreSQL is not very good with generated data, like this solution uses, so if it's too slow, it might be faster to just fetch a list of all combinations and use Python to find the non-existing ones.
Also, I assume here based on your query that there is one row per listing_id/key pair, and that you're not storing keys as a string array. If that is the case, let me know and I'll revise the answer.
func.count()
to filter out all listings that do have matches to all keys:unique_incomplete_listings = session.query(Table.listing_id.label('listing_id'))
.group_by(Table.listing_id)
.having(func.count() < 3)
from sqlalchemy.dialects.postgresql import array
unique_incomplete_listings = unique_incomplete_listings.cte()
all_potential_matches = session.query(
unique_incomplete_listings.c.listing_id,
# this creates a cartesian product of listing_ids to elements in the array
func.unnest(array(['key1', 'key2', 'key3']))
)
query = all_potential_matches.except_all(
session.query(Table.listing_id, Table.key)
# We join here to prevent doing extra work if the listing does have all keys.
# If this makes the query slower, consider making the CTE a subquery and
# removing this join instead
.join(unique_incomplete_listings,
unique_incomplete_listings.c.listing_id == Table.listing_id)
)
Upvotes: 3