Xianghui Huang
Xianghui Huang

Reputation: 49

Python dictionary keywords for key

For example, A dictionary is provided as nba_dict = {'lakers':100,'Warriors':99}. The keys of the dict are manually input by users. And we can see there exist a typo diff. Upper case for 'Warriors' and lower case for 'lakers'. The goal is to get the value by the keys. But I want to have a function to handle these typo errors, For example, if I do nba_dict['Aker'] and it will provide me 100.

Upvotes: 1

Views: 79

Answers (1)

Valentin Goldité
Valentin Goldité

Reputation: 1219

You can use Levenshtein distance to calculate the minimum number of "single edits" to change the first string into the second one. There is a convenient implementation in C with Python interface:

pip install python-Levenshtein

Then you can compare the distance between your query and all the keys, find the minimum one and return the corresponding value. You can also return an error if the minimal distance is above an empirical threshold to filter "too bad" requests.

from Levenshtein import distance as lev_dist


class FlexibleDict(dict):

    def __init__(self, _threshold=3, **kwargs):  # change the threshold
        super().__init__(**kwargs)
        self.threshold = _threshold

    def __getitem__(self, query):
        if query in self:
            return super().__getitem__(query)
        keys = list(self.keys())
        min_dist, best_key = lev_dist(query, keys[0]), keys[0]
        for key in keys[1:]:
            dist = lev_dist(query, key)
            if dist < min_dist:
                min_dist, best_key = dist, key
        if min_dist <= self.threshold:
            return super().__getitem__(best_key)
        raise ValueError(f"key '{query}' not found")

nba_dict = FlexibleDict(lakers=100, warriors=90)
# to change the threshold
# nba_dict = FlexibleDict(_threshold=2, lakers=100, warriors=90)


print(nba_dict['Aker'])  # 100
print(nba_dict['gege'])  # error

Except for initialization, FlexibleDict objects can be manipulated exactly as vanilla dict objects.

Upvotes: 1

Related Questions