Reputation: 111
list = ['Apple', 'Banana', 'Cucumber']
string = 'The other day as I ate a bnana in the park'
for x in range(len(list)):
    if list[x] in string:
        print('do a little dance')  # placeholder for the real action
This is the gist of my code as it stands now, though my actual string and list are much longer. The string is user-submitted, so I have to expect misspellings, shorthand, and CAPS. Short of filling my list with every misspelling I can think of, or parsing each word of the string, I'm not sure how to solve this problem.
I'm looking for a fuzzy "contains" check to use in an if statement. I've looked through the fuzzywuzzy documentation and I'm not sure how to make it work in this case.
Is there any function like this?
threshold = 80
for x in range(len(list)):
    if fuzzy.contain(list[x], string) > threshold:  # hypothetical function
        print('do a little dance')
I appreciate any help.
Upvotes: 0
Views: 241
Reputation: 695
From the documentation:
from fuzzywuzzy import fuzz

threshold = 80
for x in range(len(list)):
    # WRatio returns a similarity score between 0 and 100
    if fuzz.WRatio(list[x], string) > threshold:
        print('do a little dance')
Disclaimer: I've never used fuzzywuzzy before, but that should work.
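For illustration, here is a rough sketch of what a fuzzy "contains" score could look like using only the standard library's difflib: slide a window the length of the search term over the string and keep the best similarity. The helper name best_window_score is made up for this sketch, and this is not how WRatio is implemented, so the scores will not match fuzzywuzzy's.

```python
from difflib import SequenceMatcher


def best_window_score(needle, text):
    """Best 0-100 score of needle against any same-length slice of text.

    Hypothetical helper for illustration; comparison is case-insensitive.
    """
    needle, text = needle.lower(), text.lower()
    if len(needle) >= len(text):
        # Nothing to slide over: compare the two strings directly.
        return 100 * SequenceMatcher(None, needle, text).ratio()
    width = len(needle)
    return max(
        100 * SequenceMatcher(None, needle, text[i:i + width]).ratio()
        for i in range(len(text) - width + 1)
    )


threshold = 80
if best_window_score('Banana', 'I ate a bnana in the park') > threshold:
    print('do a little dance')
```

Because the window is character-based, this also works for list entries that span several words, at the cost of one comparison per character position in the string.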
Upvotes: 1
Reputation: 4510
I couldn't find a contain method in the fuzzywuzzy documentation, so I came up with this. Split the phrase into words, then compare each word to the search term in a fuzzy way. Depending on your needs, you may want a different scoring method than token_sort_ratio and a different threshold value. You can find more information on their GitHub page.
from fuzzywuzzy import fuzz

def fuzzy_contains_word(word, phrase, threshold):
    # Compare the search word against each whitespace-separated word
    for phrase_word in phrase.split():
        if fuzz.token_sort_ratio(word, phrase_word) > threshold:
            return True
    return False

words = ['Apple', 'Banana', 'Cucumber']
user_input = 'The other day as I ate a bnana in the park'
threshold = 80
for word in words:
    if fuzzy_contains_word(word, user_input, threshold):
        print(word, 'found in phrase:', user_input)

# Output:
# Banana found in phrase: The other day as I ate a bnana in the park
Note: I got a warning when running this, saying that I should install the python-Levenshtein package.
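If installing extra packages is not an option, the same word-by-word idea can be sketched with the standard library alone. This is an assumption-laden variant, not the fuzzywuzzy API: it swaps in difflib.SequenceMatcher for the scorer, the similarity helper is a name made up for this example, and its scores may differ slightly from token_sort_ratio. It lowercases both sides so that CAPS input still matches.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case (hypothetical helper)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_contains_word(word, phrase, threshold=80):
    """True if any whitespace-separated token of phrase resembles word."""
    return any(similarity(word, token) > threshold for token in phrase.split())


words = ['Apple', 'Banana', 'Cucumber']
user_input = 'The other day as I ate a BNANA in the park'
matches = [w for w in words if fuzzy_contains_word(w, user_input)]
print(matches)
```

Note that split() leaves punctuation attached to tokens, so stripping it first would likely be needed for real user input.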
Upvotes: 1