Reputation: 31
I'm writing a function to handle multiple queries in a boolean AND search.
I have a dict, query_dict, that maps each query to the docs in which it occurs.
I want the intersection of all the values in query_dict.values():
query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
'bar': ['doc_one.txt', 'doc_two.txt'],
'foobar': ['doc_two.txt']}
intersect(query_dict)
>> doc_two.txt
I've been reading about intersection but I'm finding it hard to apply it to a dict.
Thanks for your help!
Upvotes: 3
Views: 6159
Reputation: 113955
In [36]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
'bar': ['doc_one.txt', 'doc_two.txt'],
'foobar': ['doc_two.txt']}
In [37]: reduce(set.intersection, (set(val) for val in query_dict.values()))
Out[37]: set(['doc_two.txt'])
In [41]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'], 'bar': ['doc_one.txt', 'doc_two.txt'], 'foobar': ['doc_two.txt']}
set.intersection(*(set(val) for val in query_dict.values()))
is also a valid solution, though it timed slightly slower on this small example:
In [42]: %timeit reduce(set.intersection, (set(val) for val in query_dict.values()))
100000 loops, best of 3: 2.78 us per loop
In [43]: %timeit set.intersection(*(set(val) for val in query_dict.values()))
100000 loops, best of 3: 3.28 us per loop
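For reference, here is a minimal sketch of how the one-liner could be wrapped into the `intersect` function the question asks for. It targets Python 3 (where `reduce` has moved to `functools`, so the `set.intersection(*...)` form is used instead), and the empty-dict behaviour of returning an empty set is my own assumption, not something the question specifies.

```python
def intersect(query_dict):
    """Return the set of docs that appear under every query in query_dict."""
    doc_sets = [set(docs) for docs in query_dict.values()]
    if not doc_sets:
        # Assumption: no queries means no matching docs.
        # set.intersection() called with no arguments raises TypeError.
        return set()
    return set.intersection(*doc_sets)


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

result = intersect(query_dict)
print(result)
```

The function returns a set, so wrap the result in `sorted(...)` if you need a stable order for display.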
Upvotes: 14
Reputation: 9858
Another way:
values = list(query_dict.values())  # list() so indexing and slicing also work on Python 3
first, rest = values[0], values[1:]
print([t for t in set(first) if all(t in q for q in rest)])
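A possible variation on this approach, sketched under my own assumptions: the remaining lists are converted to sets so each membership test is a hash lookup rather than a list scan, and the result keeps the order of the first list. The name `intersect_ordered` is hypothetical, and "first" relies on dicts preserving insertion order (Python 3.7+).

```python
def intersect_ordered(query_dict):
    """Return docs common to all queries, in the order of the first doc list."""
    values = list(query_dict.values())
    if not values:
        # Assumption: an empty dict yields an empty result.
        return []
    first = values[0]
    rest = [set(docs) for docs in values[1:]]  # sets give fast membership tests
    seen = set()
    common = []
    for doc in first:
        # Skip duplicates within the first list, keep docs present in every other list.
        if doc not in seen and all(doc in docs for docs in rest):
            seen.add(doc)
            common.append(doc)
    return common


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

ordered_result = intersect_ordered(query_dict)
print(ordered_result)
```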
Upvotes: 0