Reputation: 4960
I am trying to search for words in a string, but my output is False because "men" and "shirt" do not match the plural forms in the string. What I am really looking for is to match "men" with "mens" and "shirt" with "shirts". How can I do that? If there is an easy way to accomplish this in Python, please share.
strings = ['get-upto-70-off-on-mens-t-shirts']
words = ['men','shirt']
print map(lambda x: all(map(lambda y:y in x.split(),words)),strings)
Output
[False]
Upvotes: 0
Views: 296
Reputation: 195438
One possibility is to use Python's built-in difflib module. The function get_close_matches() might need some tuning:
import difflib

strings = ['get-upto-70-off-on-mens-t-shirts']
words = ['men','shirt']

for w in words:
    for s in strings:
        s = s.split('-')
        m = difflib.get_close_matches(w, s)
        print('Word: "{}" Close matches: {}'.format(w, m))
Prints:
Word: "men" Close matches: ['mens']
Word: "shirt" Close matches: ['shirts']
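To get back the single True/False per string that the question asks for, the same call can be wrapped in all(). This is only a sketch: the helper name contains_all_words and the cutoff of 0.8 are assumptions chosen for illustration, and the cutoff is the part that would likely need tuning on real data.

```python
import difflib

def contains_all_words(text, words, cutoff=0.8):
    """Return True if every word has a close match among the '-'-separated tokens."""
    tokens = text.split('-')
    # get_close_matches returns a (possibly empty) list; an empty list is falsy
    return all(
        difflib.get_close_matches(w, tokens, n=1, cutoff=cutoff)
        for w in words
    )

strings = ['get-upto-70-off-on-mens-t-shirts']
words = ['men', 'shirt']

results = [contains_all_words(s, words) for s in strings]
print(results)  # [True]
```

A higher cutoff makes the match stricter. With 0.8, "men" still matches "mens" but no longer matches "womens".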
Upvotes: 1
Reputation: 5015
You can either use lemmatization from the NLTK library (which removes 's', 'ing', etc.) or fuzzy string matching with the fuzzywuzzy library.
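To illustrate the idea behind the first option without installing anything, here is a rough standard-library sketch. It is not NLTK's lemmatizer: naive_stem is a hypothetical helper that only strips a few common suffixes, and it will mangle many real words, so treat it as a stand-in for the concept rather than a replacement for a proper library.

```python
def naive_stem(token):
    """Very rough suffix stripping; a stand-in for real lemmatization."""
    for suffix in ('ing', 'es', 's'):
        # Keep at least three characters so short tokens are left alone
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token

def has_all_words(text, words):
    """Compare words and '-'-separated tokens after normalizing both."""
    stems = {naive_stem(t) for t in text.split('-')}
    return all(naive_stem(w) in stems for w in words)

strings = ['get-upto-70-off-on-mens-t-shirts']
words = ['men', 'shirt']

matches = [has_all_words(s, words) for s in strings]
print(matches)  # [True]
```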
Upvotes: 0