user2290820
user2290820

Reputation: 2749

Find num of overlapping and non-overlapping substrings in a string

PS: This is not a duplicate of How to find the overlap between 2 sequences, and return it

[Although I ask for solutions in above approach if it could be applied to the following problem]

Q: Although I got it right, it is still not a scalable solution and is definitely not optimized (low on score). Read the following description of the problem and kindly offer better solution.

Question:

For simplicity, we require prefixes and suffixes to be non-empty and shorter than the whole string S. A border of a string S is any string that is both a prefix and a suffix. For example, "cut" is a border of a string "cutletcut", and a string "barbararhubarb" has two borders: "b" and "barb".

class Solution { public int solution(String S); }

that, given a string S consisting of N characters, returns the length of its longest border that has at least three non-overlapping occurrences in the given string. If there is no such border in S, the function should return 0.

For example,

Assume that:

Complexity:

def solution(S):
    S = S.lower()
    presuf = []
    f = l = str()
    rank = []
    wordlen = len(S)
    for i, j in enumerate(S):
        y = -i-1
        f += S[i]
        l = S[y] + l
        if f==l and f != S:
            #print f,l
            new=S[i+1:-i-1]
            mindex = new.find(f)
            if mindex != -1:
                mid = f #new[mindex]
                #print mid
            else:
                mid = None
            presuf.append((f,mid,l,(i,y)))
    #print presuf
    for i,j,k,o in presuf:
        if o[0]<wordlen+o[-1]: #non overlapping
            if i==j:
                rank.append(len(i))
            else:
                rank.append(0)
    if len(rank)==0:
        return 0
    else:
        return max(rank)

My solutions time complexity is: O(N2) or O(N4) Help greatly appreciated.

Upvotes: 4

Views: 5839

Answers (5)

Mitesh Joshi
Mitesh Joshi

Reputation: 185

The Z-Algorithm would be a good solution.

Upvotes: 0

Binesh
Binesh

Reputation: 1

protected int calcBorder(String input) {
    if (null != input) {
        int mean = (input.length() / 3);
        while (mean >= 1) {
            if (input.substring(0, mean).equals(
                    input.substring(input.length() - mean))) {
                String reference = input.substring(0, mean);
                String temp = input
                        .substring(mean, (input.length() - mean));
                int startIndex = 0;
                int endIndex = mean;
                int count = 2;
                while (endIndex <= temp.length()) {
                    if (reference.equals(temp.substring(startIndex,
                            endIndex))) {
                        count++;
                        if (count >= 3) {
                            return reference.length();
                        }
                    }
                    startIndex++;
                    endIndex++;
                }
            }
            mean--;
        }
    }
    return 0;
}

Upvotes: 0

Cong Nguyen
Cong Nguyen

Reputation: 11

My solution is combination between Rabin-Karp and Knuth–Morris–Pratt algorithms. http://codility.com/cert/view/certB6J4FV-W89WX4ZABTDRVAG6/details

Upvotes: 1

Ivor Prebeg
Ivor Prebeg

Reputation: 978

I have a solution with suffix arrays (there actually is algorithm for constructing SA and LCP in linear time or something bit worse than that, but surely not quadratic).

Still not sure if I can go without RMQs ( O(log n) with SegmentTree) which I couldn't make pass my own cases and seems quite complicated, but with RMQs it can (not mentioning approach with for loop instead of RMQ, that would make it quadratic anyway).

Solution is performing quite fast and passing my 21 test cases with various perks I've managed to craft, but still failing on some of their cases. Not sure if that helped you or gave you idea how to approach the problem, but I am sure that naive solution, like @Vicenco said in some of his comments, can't get you better than Silver.

EDIT: managed to fix it all problems, but still to slow. I had to enforce some conditions but had to increase complexity with this, still not sure how to optimize that. Will keep you posted. Good luck!

Upvotes: 0

Vincenzo Melandri
Vincenzo Melandri

Reputation: 81

I have a (Java) solution that performs O(N) or O(N**3), for a resulting 90/100 overall, but I can't figure out how to make it go though 2 different testcases:

almost_all_same_letters aaaaa...aa??aaaa??....aaaaaaa 2.150 s. TIMEOUT ERROR running time: >2.15 sec., time limit: 1.20 sec.

same_letters_on_both_ends 2.120 s. TIMEOUT ERROR running time: >2.12 sec., time limit: 1.24 sec.

Edit: Nailed it! Now I have a solution that perform in O(N) and passes all the checks for a 100/100 result :) I didn't know Codility, but it's a nice tool!

Upvotes: 0

Related Questions