Reputation: 1955
I'm looking for an optimal solution for this problem.
Given a string s of length n
, find a prefix left-to-right equivalent to a suffix right-to-left.
The prefix and suffix can overlap.
Example: given abababa
, prefix is [ababa]ba
, suffix is ab[ababa]
.
I am able to go till this
for each i = 0 to n-1
, take the prefix ending at i and find if we have an appropriate suffix. It's time is O(n^2)
time and O(1)
space.
I came up with an optimization where we index the positions of all the characters. This way, we can eliminate set of sample spaces from 1/. Again, the worst case complexity is O(n^2)
with O(n)
additional space.
Are there any better algorithm for this ?
Upvotes: 3
Views: 6673
Reputation: 31950
Simple implementation in C#:
string S = "azffffaz";
char[] characters = S.ToCharArray();
int[] cumulativeCharMatches = new int[characters.Length];
cumulativeCharMatches[0] = 0;
int prefixIndex = 0;
int matchCount = 0;
// Use KMP type algorithm to determine matches.
// Search for the 1st character of the prefix occurring in a suffix.
// If found, assign count of '1' to the equivalent index in a 2nd array.
// Then, search for the 2nd prefix character.
// If found, assign a count of '2' to the next index in the 2nd array, and so on.
// The highest value in the 2nd array is the length of the largest suffix that's also a prefix.
for (int i = 1; i < characters.Length; i++)
{
if (characters[i] == characters[prefixIndex])
{
matchCount += 1;
prefixIndex += 1;
}
else
{
matchCount = 0;
prefixIndex = 0;
}
cumulativeCharMatches[i] = matchCount;
}
return cumulativeCharMatches.Max();
Upvotes: 3
Reputation: 4319
Make use of the KMP algorithm. The state of the algorithm determines "the longest suffix of the haystack that's still a prefix of the needle". So just take your string as needle and the string without the first character as haystack. Runs in O(N)
time and O(N)
space.
An implementation with some examples :
public static int[] create(String needle) {
int[] backFunc = new int[needle.length() + 1];
backFunc[0] = backFunc[1] = 0;
for (int i = 1; i < needle.length(); ++i) {
int testing = i - 1;
while (backFunc[testing] != testing) {
if (needle.charAt(backFunc[testing]) == needle.charAt(i-1)) {
backFunc[i] = backFunc[testing] + 1;
break;
} else {
testing = backFunc[testing];
}
}
}
return backFunc;
}
public static int find(String needle, String haystack) {
// some unused character to ensure that we always return back and never reach the end of the
// needle
needle = needle + "$";
int[] backFunc = create(needle);
System.out.println(Arrays.toString(backFunc));
int curpos = 0;
for (int i = 0; i < haystack.length(); ++i) {
while (curpos != backFunc[curpos]) {
if (haystack.charAt(i) == needle.charAt(curpos)) {
++curpos;
break;
} else {
curpos = backFunc[curpos];
}
}
if (curpos == 0 && needle.charAt(0) == haystack.charAt(i)) {
++curpos;
}
System.out.println(curpos);
}
return curpos;
}
public static void main(String[] args) {
String[] tests = {"abababa", "tsttst", "acblahac", "aaaaa"};
for (String test : tests) {
System.out.println("Length is : " + find(test, test.substring(1)));
}
}
Upvotes: 3
Reputation: 691
See:
http://algorithmsforcontests.blogspot.com/2012/08/borders-of-string.html
for an O(n) solution
The code actually calculates the index of the last character in the prefix. For the actual prefix/suffix, you will need to extract the substring from 0 to j (both included, length is j+1)
Upvotes: 0