Reputation: 41
How do I make a recursive descent LL(1) parser (I'm using C++
) for the following grammar:
S -> SS
S -> (S)
S -> epsilon
I've made an attempt using recursive functions, but I think it's completely wrong. I know that the rules need to be implemented in order, but how do I get the first rule to check for other instances of the first rule without going on an infinite loop? In my code I have it only check for subsequent rules.
My code:
bool rule1(tokenizer * tp){ // S -> SS
return rule2(tp) && rule2(tp);
}
bool rule2(tokenizer * tp){ // S -> (S)
bool returnVal = false;
if(tp-> lookahead. front( ). type == tkn_LPAR){
tokenizer tmpTknzr = &tp;
tp-> lookahead. pop_front( );
if(tp-> lookahead. front( ). type == tkn_RPAR){ // S -> epsilon
tp-> lookahead. pop_front( );
returnVal = true;
} else {
if(evalS(tp) && tp-> lookahead. front( ). type == tkn_RPAR){
tp-> lookahead. pop_front( );
returnVal = true;
}
}
}
return returnVal;
}
bool evalS(tokenizer * tp){ // evaluate S
return rule1(tp) || rule2(tp);
}
Upvotes: 1
Views: 769
Reputation: 47954
The 1 in LL(1) means that you must be able to select the current rule to expand by looking ahead at no more than 1 token.
With the grammar as you've written it, you cannot determine which rule to expand when you look at the first token. If it's a '('
, then you don't know if it's rule 2 or rule 1 where the first S expanded to an epsilon. That's why you can get stuck in infinite recursion.
The trick is to transform your grammar into a form where each rule that can apply to non-terminal X begins with a distinct terminal. Chris Dodd's answer showed one way to do that for your grammar.
Upvotes: 1
Reputation: 126193
The problem is that your grammar is not LL(1) -- in fact it is ambiguous, which makes parsing much harder. Try using an LL(1) grammar for parenthesis, like
S -> (S)S
S -> ε
Upvotes: 2