anand
anand

Reputation: 11309

Matching sequences in string

Given three string sequences "abb" "ab" "a".

Now I need to find algorithm to check if a string can be parsed by above sequences .Example

string "abbabab" can be parsed by sequences "abb" "ab" and "ab"

String "abbbaaa" It cannot be parsed as we dont have "b" sequence.

I have written following code, but I feel that its not right algo.Any suggestions.

bool checkIsStringCanBeParsed(std::string S) {

    std::string seqArray[6]={"abb","ab","a","abb","ab","a"};

    int index=0;
    int lastIndex=0;
    for(int i=0;i<6;++i)
    {
        lastIndex =index;
        for(int idx=0;idx<seqArray[i].length();++idx)
        {

            if(index >= S.length())
            {
                index =lastIndex;
                break;
            }
            if(S[index] == seqArray[i][idx])
            {

                ++index;
                continue;
            }
            else
                break;


        }
        if(index == S.length())
            return true;

    }
    return false;
}

Upvotes: 0

Views: 358

Answers (3)

amaurs
amaurs

Reputation: 1632

What you are trying to do is to build a regular expression engine that accepts sentences from the expression (abb|ab|a)*, an option is to use a non deterministic automata to represent that regular expression. Using this tool I was able to generate:

nda for (abb|ab|b)

Here we have a graph with 3 states. When you want to see if a given string is accepted by your rules then it must be accepted by this graph, by accepted it means that reading the string char by char you should be able to navigate through the graph always using valid steps. When a string is parsed you should always start at state 0.

For example string "aaba" will lead us to state 0, state 1, state 1, state 2, state 1, so the string is valid because we where able to parse it completely. The string "abbb" will lead us to state 0, state 1, state 2, state 3 but there is no way to go from state 3 using another 'b' so this string is not valid.

Pseudocode to do this:

boolean accept(word, state)
{
    if(word.length == 0) //either the string is empty or the parsing has ended succesfully
    {
        return true;
    }
    else
    {
        parsingChar = word[0] //first char of string

        switch state
            case 0:
                if (parsingChar == 'a')
                    return accept(substring(word,1),1); //recursive call, we remove the first char and move to state 1
                else
                    return false; // the first char is not an 'a' the word is not accepted
                break;
            case 1:
                if (parsingChar == 'a')
                    return accept(substring(word,1),3); // move to state 3
                else if (parsingChar == 'b')
                    return accept(substring(word,1),2); // move to state 2
                else
                    return false; //
                break;
            case 2:
                if (parsingChar == 'a')
                    return accept(substring(word,1),3); // move to state 3
                else if (parsingChar == 'b')
                    return accept(substring(word,1),1); // move to state 1
                else
                    return false; //
                break;
            case 3:
                if (parsingChar == 'a')
                    return accept(substring(word,1),1); // move to state 1
                else
                    return false;
                break;
    }

}

Upvotes: 1

kraskevich
kraskevich

Reputation: 18546

Dynamic programming should work fine. Let's say dp(i) = true if and only if it is possible to parse prefix with length i with given sequences. Initially, dp(0) = true. Then one can compute dp values for all i the following way: if dp(j) = true and substring from j+1 to i matches one of the sequences then dp(i) = true.

Upvotes: 0

MatthieuBizien
MatthieuBizien

Reputation: 1725

You should use regex. A regex for your sequences is "^((abb)|(ab)|(a))*$". The regex library will optimize it for you.

Upvotes: 0

Related Questions