Gustavo Costa

Reputation: 111

Fuzzy checking if each item of a list is contained in a given string

list = ['Apple','Banana','Cucumber']
string = 'The other day as I ate a bnana in the park'

for x in range(len(list)):
    if list[x] in string:
        print('do a little dance')  # placeholder for the real action

This is the gist of my code as it stands now, though my actual string and list are much longer. The string is user-submitted, so I have to expect misspellings, shorthand, and ALL CAPS. Short of filling my list with every misspelling I can think of, or parsing each word of the string, I'm not sure how to solve this problem.

I'm looking for a fuzzy "contains" check that I can use in an if statement. I've looked through the fuzzywuzzy documentation and I'm not sure how to make it work in this case.

Is there any function like this?

threshold = 80
for x in range(len(list)):
    # fuzzy.contain is hypothetical: the kind of function I'm hoping exists
    if fuzzy.contain(list[x], string) > threshold:
        print('do a little dance')

I appreciate any help.

Upvotes: 0

Views: 241

Answers (2)

Felipe Gutierrez

Reputation: 695

From the documentation:

from fuzzywuzzy import fuzz

threshold = 80
for x in range(len(list)):
    # WRatio returns a weighted similarity score from 0 to 100
    if fuzz.WRatio(list[x], string) > threshold:
        print('do a little dance')

Disclaimer: I've never used fuzzywuzzy before, but that should work.

Upvotes: 1

marcos

Reputation: 4510

I couldn't find a contains method in the fuzzywuzzy documentation, so I came up with this. You split the phrase into words and then compare each word to the target in a fuzzy way. Depending on your needs, you may want a scoring function other than token_sort_ratio, or a different threshold value. You can find more information on their GitHub page.

from fuzzywuzzy import fuzz

def fuzzy_contains_word(word, phrase, threshold):
    for phrase_word in phrase.split():
        if fuzz.token_sort_ratio(word, phrase_word) > threshold:
            return True
    return False


words = ['Apple','Banana', 'Cucumber']
user_input = 'The other day as I ate a bnana in the park'
threshold = 80

for word in words:
    if fuzzy_contains_word(word, user_input, threshold):
        print(word, 'found in phrase: ', user_input)

>>> Banana found in phrase:  The other day as I ate a bnana in the park

Note: running this gave me a warning recommending that I install the python-Levenshtein package.
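
If installing extra packages is a problem, the same per-word idea can probably be sketched with only the standard library. This swaps fuzzywuzzy for difflib.get_close_matches, so the scores will not be identical to token_sort_ratio. The 0.8 cutoff is my assumption, chosen to roughly mirror the threshold of 80 above, and lowercasing both sides is one possible way to deal with the CAPS issue.

```python
from difflib import get_close_matches


def fuzzy_contains_word(word, phrase, cutoff=0.8):
    """Return True if any whitespace-separated token of phrase resembles word."""
    # Lowercase both sides so 'BANANA' and 'banana' compare as equal.
    tokens = phrase.lower().split()
    # get_close_matches keeps tokens whose similarity ratio is >= cutoff (0.0 to 1.0).
    return bool(get_close_matches(word.lower(), tokens, n=1, cutoff=cutoff))


words = ['Apple', 'Banana', 'Cucumber']
user_input = 'The other day as I ate a bnana in the park'

# Collect every list item that has a close match among the words of the input.
found = [word for word in words if fuzzy_contains_word(word, user_input)]
print(found)
```

Like the version above, this only compares single words, so a multi-word list item would need a different approach.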

Upvotes: 1
