Reputation: 67
I'm trying to create my own analyser/parser.
I have a problem which I understand why it doesn't work but I'm unsure of how to solve it.
This is the code for the problem part of my parser.
void Expression() : {}{
Term() ((<PLUS> | <MINUS>) Term())*
}
void Term() : {}{
Factor()((<MULTIPLY> | <DIVIDE>) Factor())*
}
void Factor() : {}{
(<ID> | <NUMBER> | ((<PLUS> | <MINUS>)?<OPEN_PARENTHESIS> Expression() <CLOSE_PARENTHESIS>))
}
void Condition() : {}{
(
(<NOT> Condition()) |
(<OPEN_PARENTHESIS> Condition() (<AND> | <OR>) Condition() <CLOSE_PARENTHESIS>) |
(Expression() (<EQUAL_CHECK> | <NOT_EQUAL> | <LESS> | <LESS_EQUAL> | <GREATER> | <GREATER_EQUAL>) Expression())
)
}
As you can see, the problem comes within the Condition() method from the last two of the three options in the OR section. This is because Expression() can eventually become "( Expression() )" therefore both the third and second option can begin with a open parenthesis token.
However, I'm unsure how I would solve this problem. I solved a similar problem earlier in the parser however I can't employ the same logic here without it being extremely messy because of the way Expression() --> Term() --> Factor() and the problem code being all the way down in the Factor() method.
Any advice would be greatly appreciated.
Thanks,
Thomas.
EDIT:
For more info, I'll provide to code examples that should work with this parser but will not due to the bug explained above.
fun succesful_method()
start
var i = 1;
if(i > 0 and i < 2)
do
i = 2;
stop
stop
start
successful_method()
stop
The above method would run successfully as it uses the second alternative of the Condition() method.
fun succesful_method()
start
var i = 1;
if(i > 0)
do
i = 2;
stop
stop
start
successful_method()
stop
The above method would fail, as it requires use of the third alternative, however it cannot access this due to the '(' causing the parser to call the second alternative.
Upvotes: 3
Views: 1070
Reputation: 16241
You can solve this with syntactic look ahead.
void CompOp() : {} { <EQUAL_CHECK> | <NOT_EQUAL> | <LESS> | <LESS_EQUAL> | <GREATER> | <GREATER_EQUAL> }
void Condition() : {}{
<NOT> Condition()
|
LOOKAHEAD(Expression() CompOp())
Expression()
CompOp()
Expression()
|
<OPEN_PARENTHESIS>
Condition()
(<AND> | <OR>)
Condition()
<CLOSE_PARENTHESIS>
}
Slightly more efficient is to only lookahead when there is a (
.
void Condition() : {}{
<NOT> Condition()
| LOOKAHEAD( <OPEN_PARENTHESIS> )
(
LOOKAHEAD(Expression() CompOp())
Expression()
CompOp()
Expression()
|
<OPEN_PARENTHESIS>
Condition()
(<AND> | <OR>)
Condition()
<CLOSE_PARENTHESIS>
)
|
Expression()
CompOp()
Expression()
}
Upvotes: 1
Reputation: 1848
Using a single grammar for all expressions and defining precedence for all operators should solve your problem, at the expense of adding semantic checks for the type of expressions.
Expr -> AndExpr (<OR> AndExpr)*
AndExpr -> NotExpr (<AND> NotExpr)*
NotExpr -> <NOT>* RelExpr
RelExpr -> NumExpr () (<RELOP> NumExpr)?
NumExpr -> Term ((<PLUS>|<MINUS>) Term)*
Term -> Factor ((<MULTIPLY>|<DIVIDE>) Factor)*
Factor -> (<PLUS>|<MINUS>)* Atom
Atom -> <ID> | <NUMBER> | <OPEN_PARENTHESIS> Expr <CLOSE_PARENTHESIS>
The token <RELOP>
represents your relational operators.
Note that this grammar let's you mix boolean and numerical expressions, so you should check for errors.
For example for Expr -> AndExpr
the type returned would be the type of AndExpr. But for AndExpr <OR> AndExpr
you should check that both AndExprs are boolean expressions and the type returned by Expr would be Boolean.
Upvotes: 1