Reputation: 631
I'm trying to port the code from coding for crosswords from C++ to Python.
I have a dictionary where the keys are 12000 english words and the values for each key is a list of possible combinations of the key with missing letter. For example, the key-value pairs for "DOG" and "DAD" are:
my_dic = {
"DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
"DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"],
...
}
I'm trying to create a sort of "hash table" for which, given a value, it will return all possible keys that will match. For example, if I ask for "D--", the function will return "DOG", and "DAD".
I can flatten all values and add the key to a list using a list comprehension, but is it the fastest method?
def find_word(s):
my_keys = []
tmp = [k,v for sublist in list(my_dic.values() for k,v in sublist]
if s in tmp:
my_keys.append(k)
return my_keys
Upvotes: 2
Views: 335
Reputation: 22360
You could just return a list comprehension, note this has the same time complexity as your current solution:
my_dict = {
"DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
"DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"]
}
def find_keys(search_value):
return [
key
for key, values in my_dict.items()
if search_value in values
]
If you will be repeatedly looking for keys in my_dict
you might want to make a data structure that is the reverse mapping for O(1)
lookup time using collections.defaultdict
, Note the creation of this reverse mapping dict
is O(M*N)
where M
is the number of keys and N
is the length of the longest list, so you should only do it once, and not inside the find_keys
function itself:
from collections import defaultdict
my_dict = {
"DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
"DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"]
}
my_dict_reversed = defaultdict(list)
for key, values in my_dict.items():
for value in values:
my_dict_reversed[value].append(key)
def find_keys(search_value):
return my_dict_reversed[search_value]
Example Usage:
>>> find_keys("D--")
['DOG', 'DAD']
Lastly, as a side note, you may want to consider passing in the dict
as an argument to find_keys
rather than call a globally declared variable dict
for reusability, But I believe that is going outside of the scope of the original question.
Upvotes: 1
Reputation: 95908
Well, a dict
is a hash table, but it maps keys to values. You want the reverse mapping:
my_dic = {
"DOG": ["DOG", "DO-", "D-G", "D--", "-OG", "--G", "-O-", "---"],
"DAD": ["DAD", "DA-", "D-D", "D--", "-AD", "--D", "-A-", "---"],
}
rev_dic = {}
for k, vs in my_dic.items():
for v in vs:
rev_dic.setdefault(v, []).append(k)
Then use this dict:
print(rev_dic["D--"])
Of course, don't do the reverse mapping every time you want to make this sort of query, create this dict separately and ahead of time and keep re-using it.
So:
def find_words(s, default=None):
return rev_dic.get(s)
Or, you could raise an error, maybe a ValueError:
def find_words(s):
try:
return re_dic[s]
except KeyError:
raise ValueError("Some helpful message")
Now your look-up is constant time (of course, you've used auxiliary space, a class trade-off).
Upvotes: 2