Reputation: 9096
I'm using the Python library fuzzywuzzy to find matches in a list of sentences:
from fuzzywuzzy import fuzz, process

def getMatches(needle):
    return process.extract(needle, bookSentences, scorer=fuzz.token_sort_ratio, limit=3)
I'm trying to print out the match plus the sentences around it:
for match in matches:
    matchIndex = bookSentences.index(match)
    sentenceIndices = range(matchIndex-2,matchIndex+2)
    for index in sentenceIndices:
        print bookSentences[index],
    print '\n\n'
Unfortunately, the script fails to find the match in the original list:
ValueError: (u'Thus, in addition to the twin purposes mentioned above, this book is written for at least two groups: 1.', 59) is not in list
Is there a better way to find the index of the match in the original list? Can fuzzywuzzy somehow give it to me? There doesn't seem to be anything in the readme about it.
How can I get the index in the original list of a match returned by fuzzywuzzy?
Upvotes: 4
Views: 5368
Reputation: 9096
I feel a bit dumb. fuzzywuzzy returns each result as a tuple that includes the score, not just the matched string, so the lookup has to use the first element of the tuple. The solution:
for match in matches:
    # match is a (sentence, score) tuple; match[0] is the sentence itself
    matchIndex = bookSentences.index(match[0])
    sentenceIndices = range(matchIndex-2,matchIndex+2)
    for index in sentenceIndices:
        print bookSentences[index],
    print '\n\n'
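For anyone who wants something reusable, here is a minimal sketch of the same idea as a helper. It assumes the matches are (sentence, score) tuples, as in the error message in the question. The names context_for_matches, before, after and the sample data are made up for illustration, and fuzzywuzzy itself is not called here. Note that the loop above prints two sentences before the match and one after; this sketch defaults to two on each side and clamps the window at the ends of the list, so a match near the start does not wrap around to the end.

```python
def context_for_matches(sentences, matches, before=2, after=2):
    """Return (index, score, surrounding_sentences) for each fuzzy match.

    `matches` is assumed to be a list of (sentence, score) tuples,
    the shape shown in the ValueError in the question.
    """
    results = []
    for sentence, score in matches:
        # Look up the sentence text only, not the whole (sentence, score) tuple.
        index = sentences.index(sentence)
        # Clamp the window so it neither goes negative (which would wrap
        # around to the end of the list) nor runs past the last sentence.
        start = max(index - before, 0)
        stop = min(index + after + 1, len(sentences))
        results.append((index, score, sentences[start:stop]))
    return results


# Hypothetical data standing in for bookSentences and the extract() result.
book_sentences = ["One.", "Two.", "Three.", "Four.", "Five."]
matches = [("Two.", 90), ("Five.", 75)]

for index, score, context in context_for_matches(book_sentences, matches):
    print("%d (score %d): %s" % (index, score, " ".join(context)))
```

One caveat: list.index returns the first occurrence, so if the same sentence appears more than once in the list, this always reports the earliest position.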
Upvotes: 3