Yuming Cao
Yuming Cao

Reputation: 955

fastest way to search python dict with partial keyword

What is the fastest way to determine if a dict contains a key starting with a particular string? Can we do better than linear? How can we achieve an O(1) operation when we only know the start of a key?

Here is the current solution:

for key in dict.keys():
    if key.start_with(str):
        return True
return False

Upvotes: 26

Views: 39206

Answers (2)

Jakub M.
Jakub M.

Reputation: 33857

You could put all the prefixes of the inserted keys to the dict, so for key foo you would insert f, fo and foo. You would have O(1) lookup, but you would spend time on preprocessing (O(k), where k is a key length), and waste lots of memory:

def insert_with_prefixes(key, value, dict_):
  prefixes = (key[:i+1] for i in xrange(len(key)))
  dict_.update((prefix, value) for prefix in prefixes)

For everyday use I would go (and I go) with the method in arshajii's answer. And of course, have in mind possible many collisions for short prefixes (here: "h"):

>>> a = {}
>>> insert_with_prefixes('hello', 'world', a)
>>> insert_with_prefixes('homo', 'sapiens', a)
>>> a
{'h': 'sapiens', 'hom': 'sapiens', 'homo': 'sapiens', 'ho': 'sapiens', 
 'hel': 'world', 'hell': 'world', 'hello': 'world', 'he': 'world'}

Upvotes: 0

arshajii
arshajii

Reputation: 129537

Without preprocessing the dict, O(n) is the best you can do. It doesn't have to be complicated, though:

any(key.startswith(mystr) for key in mydict)

(Don't use dict and str as variable names, those are already the names of two built-in functions.)

If you can preprocess the dict, consider putting the keys in a prefix tree (aka trie). There is even a Python implementation in the Wikipedia article.

Upvotes: 47

Related Questions