Reputation: 984
I need to check whether the string distance between two strings in Python is greater than 1, where the distance is the minimal number of changes (character removal, addition, and transposition) needed to turn one string into the other.
I could implement it myself, but I suspect an existing package would save me the effort. I wasn't able to find one that I could identify as commonly used. Is there one?
Upvotes: 6
Views: 4718
Reputation: 69
Yes, strsimpy can be used for this: https://pypi.org/project/strsimpy/ I hope this is what you are looking for. Here is a usage example:
from strsimpy.levenshtein import Levenshtein
levenshtein = Levenshtein()
levenshtein.distance('1234', '123') # 1 (deletion/insertion)
levenshtein.distance('1234', '12345') # 1 (deletion/insertion)
levenshtein.distance('1234', '1235') # 1 (substitution)
levenshtein.distance('1234', '1324') # 2 (substitutions)
levenshtein.distance('1234', 'ABCD') # 4 (substitutions)
The package also provides many other string metrics.
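Note that plain Levenshtein distance counts a swap of two adjacent characters as two edits (see the '1234' vs '1324' line above), whereas the question counts a transposition as a single change. If no package is at hand, a minimal pure-Python sketch of the optimal string alignment variant (a restricted form of Damerau-Levenshtein) might look like the following. The names `osa_distance` and `differs_by_more_than_one` are made up for illustration, and the sketch assumes that a substitution also costs 1, as it does in the Levenshtein examples above:

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (illustrative sketch).

    Insertions, deletions, substitutions, and transpositions of two
    adjacent characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "23" <-> "32".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def differs_by_more_than_one(a: str, b: str) -> bool:
    """Hypothetical helper for the check asked about in the question."""
    return osa_distance(a, b) > 1


print(osa_distance('1234', '1324'))              # transposition counted once
print(differs_by_more_than_one('1234', 'ABCD'))
```

This runs in O(len(a) * len(b)) time and memory, which should be fine for short strings. For a pure "is the distance greater than 1" check on long inputs, an early exit when the lengths differ by more than 1 would likely be worth adding.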
Upvotes: 2
Reputation: 255
You can use the NLTK package. Its edit_distance function computes the Levenshtein edit distance, which should be what you're looking for.
Example:
import nltk
s1 = "abc"
s2 = "ebcd"
nltk.edit_distance(s1, s2) # output: 2
Reference: https://tedboy.github.io/nlps/generated/generated/nltk.edit_distance.html
Upvotes: 8
Reputation: 2847
There are many implementations of the algorithm you need. The following one belongs to a well-documented library called NLTK:
https://www.nltk.org/_modules/nltk/metrics/distance.html
Upvotes: 1