user9039337
user9039337

Reputation:

Edit Distance Java

I wrote this algorithm to calculate the sum of the number of deletion and insert (so, of the edits) to make the first string equals to the second one. But it doesn't work.

public static int distance (String s1, String s2) {
    return distance(s1, s2, 0, 0);
}

private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length) return j;
    if (j == s2.length) return i;
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}

EDIT: Example String 1: "casa" String 2: "cara" edit_distance=2 (1 deletion + 1 insert)

EDIT2: These are the strings that work: String 1: "casa", String 2: "cassa", edit_distance=1; String 1:"pioppo", String 2: "pioppo", edit_distance=0;

These are the one which doesn't work: String 1: "casa", String 2: "cara", edit_distance=2; (in my code=0) String 1: "tassa", String 2: "passato", edit_distance=4; (in my code=2)

Upvotes: 0

Views: 2875

Answers (5)

Polo
Polo

Reputation: 1

It works with me:

private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length() && j == s2.length()) {
        return 0;
    } else if (i == s1.length()) {
        return s2.length() - j;
    } else if (j == s2.length()) {
        return s1.length() - i;
    }

    if (s1.charAt(i) == s2.charAt(j)) {
        return distance(s1, s2, i + 1, j + 1);
    }

    // int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    //  return Math.min(del, Math.min(ins, rep));
    return Math.min(del, ins);
}

There is the test and it works too:

/**
 * Test of distanceRec method, of class EditDistance.
 */
@Test
public void testDistanceRec() {
    System.out.println("distanceRec");
    String s1 = "passato";
    String s2 = "tassa";
    int expResult = 4;
    int result = EditDistance.distanceRec(s1, s2);
    assertEquals(expResult, result);
    // Review the generated test code and remove the default call to fail.
    //fail("The test case is a prototype.");
}

In this app you can only use two operations: insert and delete, no other operatios like replace or match. Text of exercise:

Assume that the operations available are only two: deletion and insertion of a character. Examples: - "casa" and "cassa" have edit distance equal to 1 (1 cancellation); - "casa" and "cara" have edit distance equal to 2 (1 cancellation + 1 insertion); - "tax" and "past" have edit distance equal to 4 (3 cancellations + 1 insertion); - "poplar" and "poplar" have edit distance of 0.

Upvotes: 0

gil.fernandes
gil.fernandes

Reputation: 14621

I think that the implementation is almost correct and you missed the stop conditions. They should be:

if (j == s2.length()) {
    return s1.length() - i;
}
if (i == s1.length()) {
    return s2.length() - j;
}

So the full implementation should be:

private static int distance(String s1, String s2, int i, int j) {
    if (j == s2.length()) {
        return s1.length() - i;
    }
    if (i == s1.length()) {
        return s2.length() - j;
    }
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 2; // since Jim Belushi considers replacement to be worth 2.
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}

Update

Here is the result for "tassa" and "passato":

Code:

private static int distance(String s1, String s2, int i, int j) {
    if (j == s2.length()) {
        return s1.length() - i;
    }
    if (i == s1.length()) {
        return s2.length() - j;
    }
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 2;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}

public static void main(String[] args) {
    int dist = distance("tassa", "passato", 0, 0);
    System.out.println(dist);
}

If you run this you get:

4

Upvotes: 1

Mark
Mark

Reputation: 1498

Two easy changes and your code works:

First:

    if (i == s1.length()) return s2.length() - j;
    if (j == s2.length()) return s1.length() - i;

instead of

    if (i == s1.length()) return j;
    if (j == s2.length()) return i;

Next:

    int rep = distance(s1, s2, i + 1, j + 1) + 2;

The 2 at the end is important here. If rep means replace, it is a remove AND an insert. Making it two operations, not 1.

Upvotes: 0

shahaf
shahaf

Reputation: 4983

you need to specify how to continue when you get to the end of one string but not the other, try this

public static void main(String[] args) {
    System.out.println(distance("casa","cassa"));
}

public static int distance (String s1, String s2) {
    return distance(s1, s2, 0, 0);
}

private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length() && j==s2.length())
        return 0;
    else if(i== s1.length())
        return s2.length() - j;
    else if(j == s2.length())
        return s1.length() - i;

    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);

    int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}

output

1

Note: the first if is not necessary, just make the code more understandable... remove it in your impl

Upvotes: 0

xxxvodnikxxx
xxxvodnikxxx

Reputation: 1277

That should to be what you want

If every edit of char means distance+2 (= delete + add), it also adds count of characters added/removed- but only +1, not +2

//get number of deletions / edits - inc 1 per each
public static void editDistance() {
    String s1 = "casa";
    String s2 = "cara";

    String longer;
    String shorter;
    if(s1.length() > s2.length()) {
        longer = s1;
        shorter = s2;
    }else {
        shorter = s1;
        longer = s2;
    }

    int edits = 0;
    for (int i = 0; i < shorter.length(); i++) {
        if(shorter.charAt(i) != longer.charAt(i)) {
            edits++;
        }
    }

    edits = edits *2; //one delete, one insert you told
    edits = edits + Math.abs(s1.length() - s2.length()); //if different length then add counts of added/removed chars 

    System.out.println("edit count: " + edits);

}

Upvotes: 0

Related Questions