Reputation: 1885
Trying to make this work is making my head spin:
I have an ordered dict:
OrderedDict([('key', {'keyword': {'blue', 'yellow'}), ('key1', {'keyword': {'lock', 'door'})])
I have a list of potential_matches
: [red, blue, one]
I want to order these potential matches into one of two lists:
correct = []
or incorrect = []
If the potential match is a keyword of one of the keys in the dict, then it goes in correct
, else it goes in incorrect
.
Result of this example should be:
correct = [blue]
, incorrect = [red, one]
Here is what I tried:
correct = []
incorrect = []
for word in potential_matches:
for key, value in ordered_dict.items():
if word in value["keyword"] and word not in correct:
correct.append(word)
elif word not in value["keyword"] and word not in correct and word not in incorrect:
incorrect.append(word)
the lists cannot have overlap and must have unique items, that's why there are so many checks in the elif
.
It's close, but what ends up happening is that the incorrect list will still have items from the correct list.
How can I solve this as efficiently as possible?
I made it sound a bit complicated, but essentially, all remaining words that are not a match should simply go to the other list. This would require one full run through the potential_match
list & dict, though, I think..
Upvotes: 3
Views: 92
Reputation: 164703
Your logic works correctly when I run it, so there may be some logic which you have not supplied causing the error.
But, since you are working with collections of unique items, you can more efficiently implement your logic using set
rather than list
.
In addition, instead of cycling through potential_matches
, loop through your dictionary and add items to a correct
set. This reduces your complexity from O(m * n) to O(n), i.e. number of elements in lowest level dictionary values.
Then, at the very end, use set.difference
, or syntactic sugar -
, to calculate the incorrect
set. Here's a demo:
from collections import OrderedDict
d = OrderedDict([('key', {'keyword': {'blue', 'yellow'}}),
('key1', {'keyword': {'lock', 'door'}})])
potential_matches = {'red', 'blue', 'one'}
correct = set()
for v in d.values():
for w in v['keyword']:
if w in potential_matches:
correct.add(w)
incorrect = potential_matches - correct
Result:
print(correct, incorrect, sep='\n')
{'blue'}
{'one', 'red'}
A more efficient version of this is possible via a set
comprehension:
potential_matches = {'red', 'blue', 'one'}
correct = {w for v in d.values() for w in v['keyword'] if w in potential_matches}
incorrect = potential_matches - correct
Note that the structure of the nested set comprehension is aligned with how the verbose nested for
loop is written.
Upvotes: 6