Reputation: 5922
In Python 2.7, I want to compare one string against the strings in a list, stopping as soon as a match is found.
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

correctList = ["thanks", "believe", "definitely"]
myString = "thansk"
for correctWord in correctList:
    ratio = similar(correctWord, myString)
    if ratio > 0.9:
        myString = correctWord
        break
print myString
>>> "thanks"
I would like to simplify the for loop into fewer lines, to something like:
if similar(myString, any([correctWord for correctWord in correctList])) > 0.9:
    myString = correctWord
I'm not entirely sure about the correct logic here, but in either case variants of this syntax throw the error:
TypeError: ("'bool' object is not iterable", u'occurred at index 0')
What would be the proper way to achieve this?
Upvotes: 0
Views: 691
Reputation: 29646
any takes an iterable and tests each item for truthiness, so we first need to evaluate similar between myString and every element of correctList. We can use map here alongside a predicate lambda s: similar(s, myString) > 0.9:
any(map(lambda s: similar(s, myString) > 0.9, correctList))
This evaluates to True if at least one element of correctList is 'similar enough' to myString.
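A minimal sketch of the same check, written with a generator expression so that any can stop at the first hit instead of waiting for map to build the whole list (which is what Python 2's map does). One assumption to flag: by my calculation difflib scores "thansk" against "thanks" at 10/12, roughly 0.83, so the sketch uses a threshold of 0.8 to get a hit on the sample data.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Ratio in [0, 1]; 1.0 means the two strings are identical.
    return SequenceMatcher(None, a, b).ratio()

correctList = ["thanks", "believe", "definitely"]
myString = "thansk"
threshold = 0.8  # assumption: lowered from 0.9 so the sample typo qualifies

# any() short-circuits: it stops consuming the generator at the first True.
has_match = any(similar(s, myString) > threshold for s in correctList)
```

This only answers whether a match exists; it does not say which word matched.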
... but you'll notice that we want to determine which elements of correctList are similar to myString, so perhaps we should really be using filter:
candidates = filter(lambda s: similar(s, myString) > 0.9, correctList)
You could just take the first result, in which case an expression built on next would work, but it wouldn't necessarily be the most similar element of correctList.
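For illustration, here is a sketch of the "take the first result" idea. It wraps filter in list so the result can be indexed on Python 3 as well, where filter returns a lazy iterator rather than a list. The 0.8 threshold and the name first are assumptions of this sketch, not part of the question.

```python
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

correctList = ["thanks", "believe", "definitely"]
myString = "thansk"
threshold = 0.8  # assumption: "thansk" scores about 0.83 against "thanks"

# list() is a no-op copy on Python 2 and materialises the iterator on Python 3.
candidates = list(filter(lambda s: similar(s, myString) > threshold, correctList))

# Take the first candidate if there is one, otherwise keep the original string.
first = candidates[0] if candidates else myString
```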
We can, however, use map, filter, and max to accomplish that. Consider:
pairs = map(lambda s: (s, similar(s, myString)), correctList)
returns a list of pairs, each consisting of an element of correctList and its 'degree of similarity' with myString. We can then filter out the candidates with a similarity of 0.9 or below:
pairs = filter(lambda (s, d): d > 0.9, pairs)
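and lastly we choose the candidate with maximum similarity from those remaining (using operator.itemgetter as our key function) or myString if none exists. Note that max raises a ValueError on an empty sequence, so the fallback has to be supplied before the call rather than after it:
from operator import itemgetter
myString = max(pairs or [(myString, 0.0)], key=itemgetter(1))[0]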
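The tuple-unpacking lambda used for the filtering step is Python 2 only, so here is a hedged sketch of the same score, filter, and pick-the-best sequence using list comprehensions, which run on both Python 2.7 and Python 3. The 0.8 threshold is an assumption chosen so that the sample typo produces a match.

```python
from difflib import SequenceMatcher
from operator import itemgetter

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

correctList = ["thanks", "believe", "definitely"]
myString = "thansk"
threshold = 0.8  # assumption: "thansk" scores about 0.83 against "thanks"

# Score every candidate, then keep only the pairs above the threshold.
pairs = [(s, similar(s, myString)) for s in correctList]
pairs = [(s, d) for (s, d) in pairs if d > threshold]

# max() raises ValueError on an empty sequence, so guard before calling it.
if pairs:
    myString = max(pairs, key=itemgetter(1))[0]
```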
Of course, we could also use max without pre-filtering and then disregard the answer if its similarity is insufficient:
pairs = map(lambda s: (s, similar(s, myString)), correctList)
candidate = max(pairs, key = itemgetter(1))
myString = candidate[0] if candidate[1] > 0.9 else myString
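The "max first, check the threshold afterwards" approach could be wrapped in a small helper along these lines. The function name best_match, its signature, and the extra sample word "definite" are hypothetical and only serve the illustration.

```python
from difflib import SequenceMatcher
from operator import itemgetter

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

def best_match(word, candidates, threshold=0.9):
    """Return the most similar candidate, or word itself if none beats threshold."""
    if not candidates:
        return word  # max() would raise ValueError on an empty sequence
    scored = ((c, similar(c, word)) for c in candidates)
    best, score = max(scored, key=itemgetter(1))
    return best if score > threshold else word

correctList = ["thanks", "believe", "definitely"]

# Both words clear 0.8 here, but "definitely" scores higher than "definite".
closest = best_match("definitly", ["definite", "definitely"], threshold=0.8)

# "thansk" scores about 0.83 against "thanks", so the default 0.9 rejects it.
strict = best_match("thansk", correctList)
relaxed = best_match("thansk", correctList, threshold=0.8)
```

Unlike taking the first hit, this always scores every candidate, which is the price of picking the closest one.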
Upvotes: 2
Reputation: 58294
You are stopping as soon as you find the first string with similarity > 0.9, starting with "thansk" as a candidate. So I think this is equivalent:
myString = "thansk"
myString = next((w for w in correctList if similar(w, myString) > 0.9), myString)
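To show that this really does stop early like the original loop, here is a small sketch that records each comparison. The calls list is hypothetical bookkeeping added for the demonstration, and the 0.8 threshold is an assumption so that the sample typo matches.

```python
from difflib import SequenceMatcher

calls = []  # hypothetical bookkeeping: one entry per comparison performed

def similar(a, b):
    calls.append((a, b))
    return SequenceMatcher(None, a, b).ratio()

correctList = ["thanks", "believe", "definitely"]
threshold = 0.8  # assumption: "thansk" scores about 0.83 against "thanks"

# The generator is consumed lazily, so next() stops at the first qualifying word.
found = next((w for w in correctList if similar(w, "thansk") > threshold), "thansk")
comparisons_for_hit = len(calls)

# With no qualifying word, every candidate is compared and the default is returned.
del calls[:]
missed = next((w for w in correctList if similar(w, "qwerty") > threshold), "qwerty")
comparisons_for_miss = len(calls)
```

The second argument to next is what keeps the original string when nothing matches; without it, an exhausted generator raises StopIteration.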
Upvotes: 2