Reputation: 3213
is there more accurate algorithm than "Levenshtein distance" algorithm?? http://en.wikipedia.org/wiki/Levenshtein_distance
Upvotes: 0
Views: 744
Reputation: 29562
There is the Damerau–Levenshtein distance, which adds support for character transpositions and providing more coverage for common typos.
To get a similarity percentage for Levenshtein or Damerau-Levenshtein do something like this:
int relative_similarity = 1.0 - 1.0 / ((len(x) + len(y)) / 2) * lev(x, y); //untested
Alternatively you might want to take a look at the longest common subsequence as a metric of similarity.
Next there are
which are phonetic matching algorithms.
While Smith
and its german counterpart Schmidt
would turn up as quite different using the edit distance (a.k.a Levenshtein), Soundex and MEtaphone would consider them phonetically similar or even equivalent.
But without you telling us what's wrong about the pure Levenshtein distance it's hard to guess a better algorithm.
Upvotes: 4