Reputation: 13267
Find the length of the longest prefix string for all the suffixes of the string.
For example suffixes of the string ababaa
are ababaa
, babaa
, abaa
, baa
, aa
and a
. The similarities of each of these strings with the string "ababaa" are 6,0,3,0,1,1 respectively. Thus the answer is 6 + 0 + 3 + 0 + 1 + 1 = 11.
I wrote following code
#include <iostream>
#include <string.h>
#include <stdio.h>
#include <time.h>
int main ( int argc, char **argv) {
size_t T;
std::cin >> T;
char input[100000];
for ( register size_t i = 0; i < T; ++i) {
std::cin >> input;
double t = clock();
size_t len = strlen(input);
char *left = input;
char *right = input + len - 1;
long long sol = 0;
int end_count = 1;
while ( left < right ) {
if ( *right != '\0') {
if ( *left++ == *right++ ) {
sol++;
continue;
}
}
end_count++;
left = input; // reset the left pointer
right = input + len - end_count; // set right to one left.
}
std::cout << sol + len << std::endl;
printf("time= %.3fs\n", (clock() - t) / (double)(CLOCKS_PER_SEC));
}
}
Working fine, but for a string which is 100000
long and having same character i.e. aaaaaaaaaa.......a
, it is taking long time , how can i optimize this one more.
Upvotes: 0
Views: 5142
Reputation: 91
Use Z
algorithm to calculate length of all substrings, which also prefixes in O(n)
and then scan resulting array and sum its values.
Reference: https://www.geeksforgeeks.org/sum-of-similarities-of-string-with-all-of-its-suffixes/
Upvotes: 1
Reputation: 201
Here a java implementation:
// sprefix
String s = "abababa";
Vector<Integer>[] v = new Vector[s.length()];
int sPrefix = s.length();
v[0] = new Vector<Integer>();
v[0].add(new Integer(0));
for(int j = 1; j < s.length(); j++)
{
v[j] = new Vector<Integer>();
v[j].add(new Integer(0));
for(int k = 0; k < v[j - 1].size(); k++)
if(s.charAt(j) == s.charAt(v[j - 1].get(k)))
{
v[j].add(v[j - 1].get(k) + 1);
v[j - 1].set(k, 0);
}
}
for(int j = 0; j < v.length; j++)
for(int k = 0; k < v[j].size(); k++)
sPrefix += v[j].get(k);
System.out.println("Result = " + sPrefix);
Upvotes: 0
Reputation: 18268
Let's say your ababaa
is a pattern P.
I think you could use the following algorithm:
Upvotes: 1
Reputation: 9424
I'm not sure whether a Trie gives you much performance gain.. but I would certainly think about it.
The other idea I had is to try to compress your string. I didn't really think about it, just a crazy idea...
if you have a string like this: ababaa
compress it maybe to: abab2a
. Then you have to come up with a technique where you can use your algorithm with those strings. The advantage is you can then compare long strings 100000a
efficiently with each other. Or more importantly: you can calculate your sum very fast.
But again, I didn't think it through, maybe this is a very bad idea ;)
Upvotes: 0
Reputation: 22124
From what I see, you are using plain array to evaluate the suffix and though it may turn out to be efficient for some data set, it would fail to be efficient for some cases, such as the one you mentioned.
You would need to implement a Prefix-Tree or Trie like Data Structure. The code for those aren't straightforward, so if you are not familiar with them, I would suggest you read a little bit about them.
Upvotes: 0