Reputation: 2102
Need java function to find the longest duplicate substring in a string
For instance, if the input is “banana”,output should be "ana" and we have count the number of times it has appeared in this case it is 2.
The solution is as below
public class Test{
public static void main(String[] args){
System.out.println(findLongestSubstring("i like ike"));
System.out.println(findLongestSubstring("madam i'm adam"));
System.out.println(findLongestSubstring("When life hands you lemonade, make lemons"));
System.out.println(findLongestSubstring("banana"));
}
public static String findLongestSubstring(String value) {
String[] strings = new String[value.length()];
String longestSub = "";
//strip off a character, add new string to array
for(int i = 0; i < value.length(); i++){
strings[i] = new String(value.substring(i));
}
//debug/visualization
//before sort
for(int i = 0; i < strings.length; i++){
System.out.println(strings[i]);
}
Arrays.sort(strings);
System.out.println();
//debug/visualization
//after sort
for(int i = 0; i < strings.length; i++){
System.out.println(strings[i]);
}
Vector<String> possibles = new Vector<String>();
String temp = "";
int curLength = 0, longestSoFar = 0;
/*
* now that the array is sorted compare the letters
* of the current index to those above, continue until
* you no longer have a match, check length and add
* it to the vector of possibilities
*/
for(int i = 1; i < strings.length; i++){
for(int j = 0; j < strings[i-1].length(); j++){
if (strings[i-1].charAt(j) != strings[i].charAt(j)){
break;
}
else{
temp += strings[i-1].charAt(j);
curLength++;
}
}
//this could alleviate the need for a vector
//since only the first and subsequent longest
//would be added; vector kept for simplicity
if (curLength >= longestSoFar){
longestSoFar = curLength;
possibles.add(temp);
}
temp = "";
curLength = 0;
}
System.out.println("Longest string length from possibles: " + longestSoFar);
//iterate through the vector to find the longest one
int max = 0;
for(int i = 0; i < possibles.size();i++){
//debug/visualization
System.out.println(possibles.elementAt(i));
if (possibles.elementAt(i).length() > max){
max = possibles.elementAt(i).length();
longestSub = possibles.elementAt(i);
}
}
System.out.println();
//concerned with whitespace up until this point
// "lemon" not " lemon" for example
return longestSub.trim();
}
}
Upvotes: 0
Views: 3202
Reputation:
This is a common CS problem with a dynamic programming solution.
Edit (for lijie):
You are technically correct -- this is not the exact same problem. However this does not make the link above irrelevant and the same approach (w.r.t. dynamic programming in particular) can be used if both strings provided are the same -- only one modification needs to be made: don't consider the case along the diagonal. Or, as others have pointed out (e.g. LaGrandMere), use a suffix tree (also found in the above link).
Edit (for Deepak):
A Java implementation of the Longest Common Substring (using dynamic programming) can be found here. Note that you will need to modify it to ignore "the diagonal" (look at the Wikipedia diagram) or the longest common string will be itself!
Upvotes: 5
Reputation: 10379
In Java : Suffix Tree.
Thanks to the ones that have found how to solve it, I didn't know.
Upvotes: 1