Deepak
Deepak

Reputation: 2102

Java function needed for finding the longest duplicated substring in a string?

Need java function to find the longest duplicate substring in a string

For instance, if the input is “banana”,output should be "ana" and we have count the number of times it has appeared in this case it is 2.

The solution is as below

public class Test{
public static void main(String[] args){
System.out.println(findLongestSubstring("i like ike"));
System.out.println(findLongestSubstring("madam i'm adam"));
System.out.println(findLongestSubstring("When life hands you lemonade, make lemons"));
System.out.println(findLongestSubstring("banana"));
}

public static String findLongestSubstring(String value) {
    String[] strings = new String[value.length()];
    String longestSub = "";

    //strip off a character, add new string to array
    for(int i = 0; i < value.length(); i++){
        strings[i] = new String(value.substring(i));
    }

    //debug/visualization
    //before sort
    for(int i = 0; i < strings.length; i++){
        System.out.println(strings[i]);
    }

    Arrays.sort(strings);
    System.out.println();

    //debug/visualization
    //after sort
    for(int i = 0; i < strings.length; i++){
        System.out.println(strings[i]);
    }

    Vector<String> possibles = new Vector<String>();
    String temp = "";
    int curLength = 0, longestSoFar = 0;

    /*
     * now that the array is sorted compare the letters
     * of the current index to those above, continue until 
     * you no longer have a match, check length and add
     * it to the vector of possibilities
     */ 
    for(int i = 1; i < strings.length; i++){
        for(int j = 0; j < strings[i-1].length(); j++){
            if (strings[i-1].charAt(j) != strings[i].charAt(j)){
                break;
            }
            else{
                temp += strings[i-1].charAt(j);
                curLength++;
            }
        }
        //this could alleviate the need for a vector
        //since only the first and subsequent longest 
        //would be added; vector kept for simplicity
        if (curLength >= longestSoFar){
            longestSoFar = curLength;
            possibles.add(temp);
        }
        temp = "";
        curLength = 0;
    }

    System.out.println("Longest string length from possibles: " + longestSoFar);

    //iterate through the vector to find the longest one
    int max = 0;
    for(int i = 0; i < possibles.size();i++){
        //debug/visualization
        System.out.println(possibles.elementAt(i));
        if (possibles.elementAt(i).length() > max){ 
            max = possibles.elementAt(i).length();
            longestSub = possibles.elementAt(i);
        }
    }
    System.out.println();
    //concerned with whitespace up until this point
    // "lemon" not " lemon" for example
    return longestSub.trim(); 
}

}

Upvotes: 0

Views: 3202

Answers (2)

user166390
user166390

Reputation:

This is a common CS problem with a dynamic programming solution.

Edit (for lijie):

You are technically correct -- this is not the exact same problem. However this does not make the link above irrelevant and the same approach (w.r.t. dynamic programming in particular) can be used if both strings provided are the same -- only one modification needs to be made: don't consider the case along the diagonal. Or, as others have pointed out (e.g. LaGrandMere), use a suffix tree (also found in the above link).

Edit (for Deepak):

A Java implementation of the Longest Common Substring (using dynamic programming) can be found here. Note that you will need to modify it to ignore "the diagonal" (look at the Wikipedia diagram) or the longest common string will be itself!

Upvotes: 5

LaGrandMere
LaGrandMere

Reputation: 10379

In Java : Suffix Tree.

Thanks to the ones that have found how to solve it, I didn't know.

Upvotes: 1

Related Questions