Spooknik

Reputation: 13

How can I fuzzy search with a keyword and return the matched substring?

I'd like to find and replace text in a fuzzy way: search a text for an approximate match to a keyword and get back the substring that matched. I'm struggling to find an implementation for this. For example, I would like to do something like this:

text = 'The sunset is a lovely colour this evening'
keyword = 'Color'
desired_result = 'colour'  # the substring a fuzzy search should find
text = text.replace(desired_result, keyword)  # str.replace returns a new string
print(text)
The sunset is a lovely Color this evening

To complicate matters, the phrases that need to be replaced could be more than one word, so splitting the text into words won't work.

I've tried FuzzyWuzzy's process.extractOne function, but it only returns the keyword, not the part of the text that matched. For example:

process.extractOne("This sunset is a lovely colour this evening", "Color")
("Color", 90)

I need the match in the text so I can replace.

The regex module for Python can do fuzzy matching, but performance is a concern and it doesn't seem to work for me with a full phrase.

text = 'The sunset is a lovely colour this evening'
term = 'Color'
r = regex.compile('('+text +'){e<=5}')
print(r.match(term ))
None

Upvotes: 1

Views: 3755

Answers (1)

Sayse

Reputation: 43300

If you're using fuzzysearch, you can use find_near_matches to get the start and end indices of each match, then slice the original string with them in a list comprehension to get the actual substrings that matched.

from fuzzysearch import find_near_matches
my_string = 'aaaPATERNaaa'
matches = find_near_matches('PATTERN', my_string, max_l_dist=1)

print([my_string[m.start:m.end] for m in matches])
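Once you have the start and end of a match, the replacement itself is just slicing. As a rough illustration of the whole find-then-replace flow on the example from the question, here is a minimal sketch that uses only the standard library (difflib.SequenceMatcher over word windows) instead of fuzzysearch, so it is a swapped-in technique and not what find_near_matches does internally. The function names best_fuzzy_span and fuzzy_replace and the min_ratio threshold of 0.8 are assumptions made for this example.

```python
import re
from difflib import SequenceMatcher


def best_fuzzy_span(text, keyword, min_ratio=0.8):
    """Return (start, end, ratio) for the word window most similar to keyword, or None.

    Hypothetical helper: compares the keyword, case-insensitively, against every
    window of consecutive words and keeps the highest-scoring one.
    """
    words = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    n_words = max(1, len(keyword.split()))
    target = keyword.lower()
    best = None
    for i in range(len(words)):
        # Windows of 1 .. n_words + 1 words, so multi-word phrases are covered.
        for j in range(i, min(i + n_words + 1, len(words))):
            start, end = words[i][0], words[j][1]
            ratio = SequenceMatcher(None, target, text[start:end].lower()).ratio()
            if ratio >= min_ratio and (best is None or ratio > best[2]):
                best = (start, end, ratio)
    return best


def fuzzy_replace(text, keyword, min_ratio=0.8):
    """Replace the best fuzzy match of keyword in text; return text unchanged if none."""
    span = best_fuzzy_span(text, keyword, min_ratio)
    if span is None:
        return text
    start, end, _ = span
    return text[:start] + keyword + text[end:]


text = 'The sunset is a lovely colour this evening'
print(fuzzy_replace(text, 'Color'))
# The sunset is a lovely Color this evening
```

This scores every window against the keyword, so it will be slow on long texts; the same slicing step works with the m.start and m.end values returned by find_near_matches.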

Upvotes: 2
