Maxwell's Daemon
Maxwell's Daemon

Reputation: 631

find all keys for a given value

I'm trying to port the code from coding for crosswords from C++ to Python.

I have a dictionary where the keys are 12000 english words and the values for each key is a list of possible combinations of the key with missing letter. For example, the key-value pairs for "DOG" and "DAD" are:

my_dic = {
    "DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
    "DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"],
    ...
}

I'm trying to create a sort of "hash table" for which, given a value, it will return all possible keys that will match. For example, if I ask for "D--", the function will return "DOG", and "DAD".

I can flatten all values and add the key to a list using a list comprehension, but is it the fastest method?

def find_word(s):
    my_keys = []
    tmp = [k,v for sublist in list(my_dic.values() for k,v in sublist]
    if s in tmp:
        my_keys.append(k)
    return my_keys

Upvotes: 2

Views: 335

Answers (2)

Sash Sinha
Sash Sinha

Reputation: 22360

You could just return a list comprehension, note this has the same time complexity as your current solution:

my_dict = {
    "DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
    "DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"]
}

def find_keys(search_value):
    return [
        key
        for key, values in my_dict.items()
        if search_value in values
    ]

If you will be repeatedly looking for keys in my_dict you might want to make a data structure that is the reverse mapping for O(1) lookup time using collections.defaultdict, Note the creation of this reverse mapping dict is O(M*N) where M is the number of keys and N is the length of the longest list, so you should only do it once, and not inside the find_keys function itself:

from collections import defaultdict

my_dict = {
    "DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
    "DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"]
}

my_dict_reversed = defaultdict(list)
for key, values in my_dict.items():
    for value in values:
        my_dict_reversed[value].append(key)

def find_keys(search_value):
    return my_dict_reversed[search_value]

Example Usage:

>>> find_keys("D--")
['DOG', 'DAD']

Lastly, as a side note, you may want to consider passing in the dict as an argument to find_keys rather than call a globally declared variable dict for reusability, But I believe that is going outside of the scope of the original question.

Upvotes: 1

juanpa.arrivillaga
juanpa.arrivillaga

Reputation: 95908

Well, a dict is a hash table, but it maps keys to values. You want the reverse mapping:

my_dic = {
    "DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
    "DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"],
}

rev_dic = {}
for k, vs in my_dic.items():
    for v in vs:
        rev_dic.setdefault(v, []).append(k)

Then use this dict:

print(rev_dic["D--"])

Of course, don't do the reverse mapping every time you want to make this sort of query, create this dict separately and ahead of time and keep re-using it.

So:

def find_words(s, default=None):
    return rev_dic.get(s)

Or, you could raise an error, maybe a ValueError:

def find_words(s):
    try:
        return re_dic[s]
    except KeyError:
        raise ValueError("Some helpful message")

Now your look-up is constant time (of course, you've used auxiliary space, a class trade-off).

Upvotes: 2

Related Questions