Jay Jung
Jay Jung

Reputation: 1885

How do I check membership of items in a dict and list, within nested for loops?

Trying to make this work is making my head spin:
I have an ordered dict:

OrderedDict([('key', {'keyword': {'blue', 'yellow'}), ('key1', {'keyword': {'lock', 'door'})])

I have a list of potential_matches: [red, blue, one]

I want to order these potential matches into one of two lists:
correct = [] or incorrect = []

If the potential match is a keyword of one of the keys in the dict, then it goes in correct, else it goes in incorrect.

Result of this example should be:
correct = [blue], incorrect = [red, one]

Here is what I tried:

correct = []  
incorrect = []  
for word in potential_matches:
    for key, value in ordered_dict.items():
        if word in value["keyword"] and word not in correct:
            correct.append(word)
        elif word not in value["keyword"] and word not in correct and word not in incorrect:
            incorrect.append(word)  

the lists cannot have overlap and must have unique items, that's why there are so many checks in the elif.
It's close, but what ends up happening is that the incorrect list will still have items from the correct list.

How can I solve this as efficiently as possible?

I made it sound a bit complicated, but essentially, all remaining words that are not a match should simply go to the other list. This would require one full run through the potential_match list & dict, though, I think..

Upvotes: 3

Views: 92

Answers (1)

jpp
jpp

Reputation: 164703

Your logic works correctly when I run it, so there may be some logic which you have not supplied causing the error.

But, since you are working with collections of unique items, you can more efficiently implement your logic using set rather than list.

In addition, instead of cycling through potential_matches, loop through your dictionary and add items to a correct set. This reduces your complexity from O(m * n) to O(n), i.e. number of elements in lowest level dictionary values.

Then, at the very end, use set.difference, or syntactic sugar -, to calculate the incorrect set. Here's a demo:

from collections import OrderedDict

d = OrderedDict([('key', {'keyword': {'blue', 'yellow'}}),
                 ('key1', {'keyword': {'lock', 'door'}})])

potential_matches = {'red', 'blue', 'one'}

correct = set()
for v in d.values():
    for w in v['keyword']:
        if w in potential_matches:
            correct.add(w)

incorrect = potential_matches - correct

Result:

print(correct, incorrect, sep='\n')

{'blue'}
{'one', 'red'}

A more efficient version of this is possible via a set comprehension:

potential_matches = {'red', 'blue', 'one'}
correct = {w for v in d.values() for w in v['keyword'] if w in potential_matches}
incorrect = potential_matches - correct

Note that the structure of the nested set comprehension is aligned with how the verbose nested for loop is written.

Upvotes: 6

Related Questions