Reputation: 4212
I recently interviewed with a company for software engineering position. I was asked the question of longest unique sub-string in a string. My algorithms was as follows -
Start from the left-most character, and keep storing the character in a hash table with the key as the character
and the value as the index_where_it_last_occurred
. Add the character to the answer string as long as its not present in the hash table. If we encounter a stored character again, I stop and note down the length. I empty the hash table and then start again from the right index of the repeated character. The right index is retrieved from the (index_where_it_last_occurred) flag. If I ever reach the end of the string, I stop and return the longest length.
For example, say the string was, abcdecfg
.
I start with a
, store in hash table. I store b
and so on till e
. Their indexes are stored as well. When I encounter c
again, I stop since it's already hashed and note down the length which is 5. I empty the hash table, and start again from the right index of the repeated character. The repeated character being c
, I start again from the position 3 ie., the character d
. I keep doing this while I don't reach the end of string.
I am interested in knowing what the time complexity of this algorithm will be. IMO, it'll be O(n^2).
This is the code.
import java.util.*;
public class longest
{
static int longest_length = -1;
public static void main(String[] args)
{
Scanner in = new Scanner(System.in);
String str = in.nextLine();
calc(str,0);
System.out.println(longest_length);
}
public static void calc(String str,int index)
{
if(index >= str.length()) return;
int temp_length = 0;
LinkedHashMap<Character,Integer> map = new LinkedHashMap<Character,Integer>();
for (int i = index; i<str.length();i++)
{
if(!map.containsKey(str.charAt(i)))
{
map.put(str.charAt(i),i);
++temp_length;
}
else if(map.containsKey(str.charAt(i)))
{
if(longest_length < temp_length)
{
longest_length = temp_length;
}
int last_index = map.get(str.charAt(i));
// System.out.println(last_index);
calc(str,last_index+1);
break;
}
}
if(longest_length < temp_length)
longest_length = temp_length;
}
}
Upvotes: 0
Views: 116
Reputation: 58211
If the alphabet is of size K, then when you restart counting you jump back at most K-1 places, so you read each character of the string at most K times. So the algorithm is O(nK).
The input string which contains n/K copies of the alphabet exhibits this worst-case behavior. For example if the alphabet is {a, b, c}, strings of the form "abcabcabc...abc" have the property that nearly every character is read 3 times by your algorithm.
You can solve the original problem in O(K+n) time, using O(K) storage space by using dynamic programming.
Let the string be s
, and we'll keep a number M which will be the the length of maximum unique_char string ending at i, P, which stores where each character was previously seen, and best
, the longest unique-char string found so far.
Start:
Set P[c] = -1 for each c in the alphabet.
M = 0
best = 0
Then, for each i:
M = min(M+1, i-P[s[i]])
best = max(best, M)
P[s[i]] = i
This is trivially O(K) in storage, and O(K+n) in running time.
Upvotes: 1