Reputation:
I wrote this algorithm to calculate the sum of the number of deletion and insert (so, of the edits) to make the first string equals to the second one. But it doesn't work.
public static int distance (String s1, String s2) {
return distance(s1, s2, 0, 0);
}
private static int distance(String s1, String s2, int i, int j) {
if (i == s1.length) return j;
if (j == s2.length) return i;
if (s1.charAt(i) == s2.charAt(j))
return distance(s1, s2, i + 1, j + 1);
int rep = distance(s1, s2, i + 1, j + 1) + 1;
int del = distance(s1, s2, i, j + 1) + 1;
int ins = distance(s1, s2, i + 1, j) + 1;
return Math.min(del, Math.min(ins, rep));
}
EDIT: Example String 1: "casa" String 2: "cara" edit_distance=2 (1 deletion + 1 insert)
EDIT2: These are the strings that work: String 1: "casa", String 2: "cassa", edit_distance=1; String 1:"pioppo", String 2: "pioppo", edit_distance=0;
These are the one which doesn't work: String 1: "casa", String 2: "cara", edit_distance=2; (in my code=0) String 1: "tassa", String 2: "passato", edit_distance=4; (in my code=2)
Upvotes: 0
Views: 2875
Reputation: 1
It works with me:
private static int distance(String s1, String s2, int i, int j) {
if (i == s1.length() && j == s2.length()) {
return 0;
} else if (i == s1.length()) {
return s2.length() - j;
} else if (j == s2.length()) {
return s1.length() - i;
}
if (s1.charAt(i) == s2.charAt(j)) {
return distance(s1, s2, i + 1, j + 1);
}
// int rep = distance(s1, s2, i + 1, j + 1) + 1;
int del = distance(s1, s2, i, j + 1) + 1;
int ins = distance(s1, s2, i + 1, j) + 1;
// return Math.min(del, Math.min(ins, rep));
return Math.min(del, ins);
}
There is the test and it works too:
/**
* Test of distanceRec method, of class EditDistance.
*/
@Test
public void testDistanceRec() {
System.out.println("distanceRec");
String s1 = "passato";
String s2 = "tassa";
int expResult = 4;
int result = EditDistance.distanceRec(s1, s2);
assertEquals(expResult, result);
// Review the generated test code and remove the default call to fail.
//fail("The test case is a prototype.");
}
In this app you can only use two operations: insert and delete, no other operatios like replace or match. Text of exercise:
Assume that the operations available are only two: deletion and insertion of a character. Examples: - "casa" and "cassa" have edit distance equal to 1 (1 cancellation); - "casa" and "cara" have edit distance equal to 2 (1 cancellation + 1 insertion); - "tax" and "past" have edit distance equal to 4 (3 cancellations + 1 insertion); - "poplar" and "poplar" have edit distance of 0.
Upvotes: 0
Reputation: 14621
I think that the implementation is almost correct and you missed the stop conditions. They should be:
if (j == s2.length()) {
return s1.length() - i;
}
if (i == s1.length()) {
return s2.length() - j;
}
So the full implementation should be:
private static int distance(String s1, String s2, int i, int j) {
if (j == s2.length()) {
return s1.length() - i;
}
if (i == s1.length()) {
return s2.length() - j;
}
if (s1.charAt(i) == s2.charAt(j))
return distance(s1, s2, i + 1, j + 1);
int rep = distance(s1, s2, i + 1, j + 1) + 2; // since Jim Belushi considers replacement to be worth 2.
int del = distance(s1, s2, i, j + 1) + 1;
int ins = distance(s1, s2, i + 1, j) + 1;
return Math.min(del, Math.min(ins, rep));
}
Update
Here is the result for "tassa" and "passato":
Code:
private static int distance(String s1, String s2, int i, int j) {
if (j == s2.length()) {
return s1.length() - i;
}
if (i == s1.length()) {
return s2.length() - j;
}
if (s1.charAt(i) == s2.charAt(j))
return distance(s1, s2, i + 1, j + 1);
int rep = distance(s1, s2, i + 1, j + 1) + 2;
int del = distance(s1, s2, i, j + 1) + 1;
int ins = distance(s1, s2, i + 1, j) + 1;
return Math.min(del, Math.min(ins, rep));
}
public static void main(String[] args) {
int dist = distance("tassa", "passato", 0, 0);
System.out.println(dist);
}
If you run this you get:
4
Upvotes: 1
Reputation: 1498
Two easy changes and your code works:
First:
if (i == s1.length()) return s2.length() - j;
if (j == s2.length()) return s1.length() - i;
instead of
if (i == s1.length()) return j;
if (j == s2.length()) return i;
Next:
int rep = distance(s1, s2, i + 1, j + 1) + 2;
The 2 at the end is important here. If rep means replace, it is a remove AND an insert. Making it two operations, not 1.
Upvotes: 0
Reputation: 4983
you need to specify how to continue when you get to the end of one string but not the other, try this
public static void main(String[] args) {
System.out.println(distance("casa","cassa"));
}
public static int distance (String s1, String s2) {
return distance(s1, s2, 0, 0);
}
private static int distance(String s1, String s2, int i, int j) {
if (i == s1.length() && j==s2.length())
return 0;
else if(i== s1.length())
return s2.length() - j;
else if(j == s2.length())
return s1.length() - i;
if (s1.charAt(i) == s2.charAt(j))
return distance(s1, s2, i + 1, j + 1);
int rep = distance(s1, s2, i + 1, j + 1) + 1;
int del = distance(s1, s2, i, j + 1) + 1;
int ins = distance(s1, s2, i + 1, j) + 1;
return Math.min(del, Math.min(ins, rep));
}
output
1
Note: the first if
is not necessary, just make the code more understandable... remove it in your impl
Upvotes: 0
Reputation: 1277
That should to be what you want
If every edit of char means distance+2 (= delete + add), it also adds count of characters added/removed- but only +1, not +2
//get number of deletions / edits - inc 1 per each
public static void editDistance() {
String s1 = "casa";
String s2 = "cara";
String longer;
String shorter;
if(s1.length() > s2.length()) {
longer = s1;
shorter = s2;
}else {
shorter = s1;
longer = s2;
}
int edits = 0;
for (int i = 0; i < shorter.length(); i++) {
if(shorter.charAt(i) != longer.charAt(i)) {
edits++;
}
}
edits = edits *2; //one delete, one insert you told
edits = edits + Math.abs(s1.length() - s2.length()); //if different length then add counts of added/removed chars
System.out.println("edit count: " + edits);
}
Upvotes: 0