Reputation: 955
The Levenshtein distance to convert SATURDAY to SUNDAY is 3. One way is, delete A at pos 2, delete T at pos 3, substitute R at pos 5 by N.
If I take position number as the weight for positions, the cost will be:
2 + 3 + 5 = 10
Similarly, if I convert SUNDAY to SATURDAY, it requires 2 insertions at pos 2 and 1 substitution at 3, the cost will be:
2 + 2 + 3 = 7
As you can see, this modified Levenshtein is not symmetric.
Note that the positions are relative to the original string, even after insertions.
Is there any way to take the positions of letters into consideration, and keeping Levenshtein symmetric, so that lev(x,y) = lev(y,x)
?
I saw some questions already posted, but they weren't really helpful: Modifying Levenshtein Distance for positional Bias
Thanks in advance.
Upvotes: 0
Views: 262
Reputation: 178411
One approach to take an asymmetrical property and make it symmetrical is to use the average (or other measure) of them: d'(x,y) = d'(y,x) = (d(x,y) + d(y,x))/2
One familiar usage of such technique is making the KL-Divergence symmetrical by applying the same technique. This is known as the Jensen-Shannon Diversion
Upvotes: 1