Reputation: 2151
i'm searching for an algorithm for computing Levenshtein edit distance that also supports the case in which two adjacent letters are transposed that is implemented in C#.
for example the word "animals" and "ainmals" : switching between the letters "n" and "i" wont be scored as two replacements -which will make a big distance - but instead on will be scored as a transpose of two letters -much more less distance-
what i reached so far in searching
Upvotes: 1
Views: 4323
Reputation: 81
You need to add the additional condition to make it a "Damerau–Levenshtein distance" algorithm. So, using the example here: http://www.dotnetperls.com/levenshtein you just need to add the following condition right after step 6:
//** Step 7 to make it Damerau–Levenshtein distance
if (i > 1 && j > 1 && (s[i - 1] == t[j - 2]) && (s[i - 2] == t[j - 1]))
{
d[i, j] = Math.Min(
d[i, j],
d[i - 2, j - 2] + cost // transposition
);
}
Upvotes: 1
Reputation: 2579
See the implementation on Wikipedia. You can easily adapt the algorithm to include the case for letter swaps. For example:
//bla bla. I'm just copying the code on the Wikipedia.
d[i, j] := minimum
(
d[i-1, j] + 1, // a deletion
d[i, j-1] + 1, // an insertion
d[i-1, j-1] + 1, // a substitution
)
// This single statement is all you need:
if(s[i-1]==t[j-2] && s[i-2]==t[j-1])
d[i,j] := minimum
(
d[i,j], //cost without swapping
d[i-2,j-2]+something //cost with swapping. probably something=1
);
Upvotes: 6