Reputation: 31
Given a string and array of strings find the longest suffix of string in array.
for example
string = google.com.tr
array = tr, nic.tr, gov.nic.tr, org.tr, com.tr
returns com.tr
I have tried to use binary search with specific comparator, but failed.
C-code would be welcome.
Edit:
I should have said that im looking for a solution where i can do as much work as i can in preparation step (when i only have a array of suffixes, and i can sort it in every way possible, build any data-structure around it etc..), and than for given string find its suffix in this array as fast as possible. Also i know that i can build a trie out of this array, and probably this will give me best performance possible, BUT im very lazy and keeping a trie in raw C in huge peace of tangled enterprise code is no fun at all. So some binsearch-like approach will be very welcome.
Upvotes: 2
Views: 1197
Reputation: 3872
Why don't you use suffix arrays ? It works when you have large number of suffixes.
Complexity, O(n(logn)^2)
, there are O(nlogn)
versions too.
Implementation in c here. You can also try googling suffix arrays.
Upvotes: 0
Reputation: 15399
Assuming constant time addressing of characters within strings this problem is isomorphic to finding the largest prefix.
Let i = 0
.
Let S = null
Let c = prefix[i]
Remove strings a
from A
if a[i] != c
and if A
. Replace S
with a
if a.Length == i + 1
.
Increment i
.
Go to step 3.
Is that what you're looking for?
Example:
prefix = rt.moc.elgoog
array = rt.moc, rt.org, rt.cin.vof, rt.cin, rt
Pass 0: prefix[0]
is 'r'
and array[j][0] == 'r'
for all j
so nothing is removed from the array. i + 1 -> 0 + 1 -> 1
is our target length, but none of the strings have a length of 1, so S
remains null
.
Pass 1: prefix[1]
is 't'
and array[j][1] == 'r'
for all j
so nothing is removed from the array. However there is a string that has length 2, so S
becomes rt
.
Pass 2: prefix[2]
is '.'
and array[j][2] == '.'
for the remaining strings so nothing changes.
Pass 3: prefix[3]
is 'm'
and array[j][3] != 'm'
for rt.org
, rt.cin.vof
, and rt.cin
so those strings are removed.
etc.
Upvotes: 1
Reputation: 2051
If your array of strings is something along the following:
char string[STRINGS][MAX_STRING_LENGTH];
string[0]="google.com.tr";
string[1]="nic.tr";
etc, then you can simply do this:
int x, max = 0;
for (x = 0; x < STRINGS; x++) {
if (strlen(string[x]) > max) {
max = strlen(string[x]);
}
}
x = 0;
while(true) {
if (string[max][x] == ".") {
GOTO out;
}
x++;
}
out:
char output[MAX_STRING_LENGTH];
int y = 0;
while (string[max][x] != NULL) {
output[y++] = string[++x];
}
(The above code may not actually work (errors, etc.), but you should get the general idea.
Upvotes: 0
Reputation: 10585
Naive, pseudo-answer:
Alternatively, you could skip the sorting and just iterate, assigning the biggestString
if the currentString
is bigger than the biggestString
that has matched.
Edit 0:
Maybe you could improve this by looking at your array before hand and considering "minimal" elements that need to be checked.
For instance, if .com
appears in 20 members you could just check .com
against the given string to potentially eliminate 20 candidates.
Edit 1:
On second thought, in order to compare elements in the array you will need to use a string comparison. My feeling is that any gain you get out of an attempt at optimizing the list of strings for comparison might be negated by the expense of comparing them before doing so, if that makes sense. Would appreciate if a CS type could correct me here...
Upvotes: 0
Reputation: 7917
Another naïve, pseudo-answer.
Set boolean "found" to false. While "found" is false, iterate over the array comparing the source string to the strings in the array. If there's a match, set "found" to true and break. If there's no match, use something like strchr()
to get to the segment of the string following the first period. Iterate over the array again. Continue until there's a match, or until the last segment of the source string has been compared to all the strings in the array and failed to match.
Not very efficient....
Upvotes: 0