Reputation: 21
I have a JavaCC grammar that defines a simple scripting language with simple expressions and conditional statements. I am reviewing it and trying to correct it. It is roughly defined like this:
void Statement() : {}
{
Assignment()
|
IfStatement()
}
void Assignment() : {}
{
RealIdentifier() "=" SimpleExpression()
|
StringIdentifier() "=" StringExpression()
}
void IfStatement() : {}
{
"IF" Expression() "THEN" Block()
(
"ENDIF;"
|
"ELSE" Block() "ENDIF;"
)
}
void Expression() #void : {}
{
SimpleExpression()
(
"<" SimpleExpression() #LTNode(2)
|
">" SimpleExpression() #GTNode(2)
|
"<=" SimpleExpression() #LENode(2)
|
">=" SimpleExpression() #GENode(2)
|
"==" SimpleExpression() #EQNode(2)
|
"!=" SimpleExpression() #NENode(2)
)*
}
void SimpleExpression() #void : {}
{
Term()
(
"+" Term() #AddNode(2)
|
"-" Term() #SubsNode(2)
|
"|" Term() #OrNode(2)
)*
}
void Term() #void : {}
{
Factor()
(
"*" Factor() #MultNode(2)
|
"/" Factor() #DivNode(2)
|
"&" Factor() #AndNode(2)
)*
}
void Factor() #void : {}
{
Real()
|
RealIdentifier()
|
Function()
|
"(" Expression() ")"
|
"!" Factor() #NotNode(1)
|
StringExpression()
}
void Function() :
{
Token t;
int args = 0;
}
{
t = <FUNCTION> { jjtThis.setID(t.image, legacyCharset); } "(" args = ArgumentList() ")"
{ jjtThis.setArgs(args); }
}
int ArgumentList() #void :
{
int args = 0;
}
{
Expression() {args++;} ( "," Expression() {args++;} )*
{ return args; }
}
void StringIdentifier() :
{
Token t;
}
{
t = <STRING_IDENTIFIER>
{
System.out.println("kind="+t.kind+" image="+t.image);
}
}
void RealIdentifier() :
{
Token t;
}
{
t = <REAL_IDENTIFIER>
{
System.out.println("kind="+t.kind+" image="+t.image);
}
}
The first obvious problem is in the way Expression is defined. Since it is used to define IfStatement, I can easily end up with something like this: IF (variable1 < variable2 >= variable3)
I am trying to correct that by separating the logic of conditional expressions from that of expressions in general, like this:
void IfStatement() : {}
{
"IF" ConditionalExpression() "THEN" Block()
(
"ENDIF;"
|
"ELSE" Block() "ENDIF;"
)
}
void ConditionalExpression() #void : {}
{
SimpleExpression()
(
"<" #LTNode(2)
|
">" #GTNode(2)
|
"<=" #LENode(2)
|
">=" #GENode(2)
|
"==" #EQNode(2)
|
"!=" #NENode(2)
)SimpleExpression()
}
void Expression() #void : {}
{
( SimpleExpression() )*
}
When compiling the generated .jj file I got the following warning:
Warning: Choice conflict in (...)* construct at line 210, column 3. Expansion nested within construct and expansion following construct have common prefixes, one of which is: "+" Consider using a lookahead of 2 or more for nested expansion.
The line number refers to a line in the generated .jj file. I assumed the conflict arises when a SimpleExpression is encountered, since the parser cannot tell whether it is parsing a ConditionalExpression or an Expression, so I tried:
void Expression() #void : {}
{
( LOOKAHEAD(2) SimpleExpression() )*
}
and then
void ConditionalExpression() #void : {}
{
( LOOKAHEAD(2)
SimpleExpression()
(
but the warning didn't go away. The line in the .jj file where it reports the choice conflict is:
void Statement() : {/*@bgen(jjtree) Statement */
ASTStatement jjtn000 = new ASTStatement(JJTSTATEMENT);
boolean jjtc000 = true;
jjtree.openNodeScope(jjtn000);
/*@egen*/} // <-------------------------------------- line 210
{/*@bgen(jjtree) Statement */
try {
/*@egen*/
Assignment()
|
The other problem is that operator precedence is wrong: something like IF ( "a" == "a" | "c" == "c" ) results in the | being applied before the second == operator, taking the "c" as its second operand, and that gives a ClassCastException. I concluded that fixing this would require a rewrite of the whole grammar, so I thought of forcing parentheses around the individual conditions of a composite condition, like this: IF ( ("a" == "a") | ("c" == "c") ). I am just unable to figure out how to do it.
Upvotes: 2
Views: 2389
Reputation: 16231
For your second problem
the other problem is that operator precedence is somehow screwed up, something like IF ( "a" == "a" | "c"=="c" ) results in the | being interpreted before the second == operator using the "c" as its second operand and that gives a ClassCastException,
Since you didn't post the grammar rules relevant to the | operator, it's hard to diagnose. The solution you propose should not be required.
I hope you don't mind a comment on the language design: Trying to enforce type correctness using the grammar is usually a poor choice.
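For reference, going by the SimpleExpression and Term rules as they appear in the question, "|" sits at the same level as "+" and "-", and "&" at the same level as "*" and "/", so both bind tighter than the comparison operators. One conventional way to get the expected precedence is to move the logical operators into their own layers above the comparisons. A minimal sketch, assuming "|" and "&" are removed from SimpleExpression() and Term(); the names OrExpression, AndExpression and Comparison are hypothetical and not part of the original grammar:

```
void Expression() #void : {}
{
  OrExpression()
}

// "|" binds loosest, so it is applied last
void OrExpression() #void : {}
{
  AndExpression() ( "|" AndExpression() #OrNode(2) )*
}

// "&" binds tighter than "|" but looser than the comparisons
void AndExpression() #void : {}
{
  Comparison() ( "&" Comparison() #AndNode(2) )*
}

// at most one comparison between two arithmetic expressions
void Comparison() #void : {}
{
  SimpleExpression()
  ( "<"  SimpleExpression() #LTNode(2)
  | ">"  SimpleExpression() #GTNode(2)
  | "<=" SimpleExpression() #LENode(2)
  | ">=" SimpleExpression() #GENode(2)
  | "==" SimpleExpression() #EQNode(2)
  | "!=" SimpleExpression() #NENode(2)
  )?
}
```

With this layering, "a" == "a" | "c" == "c" should group as ("a" == "a") | ("c" == "c") without extra parentheses, and Factor() can keep its "(" Expression() ")" alternative for explicit grouping.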
Upvotes: 0
Reputation: 170158
Instead of the Kleene star, *, use ? to make the right-hand side (including the operator) of your relational expression optional, so that a single SimpleExpression() would also match:
void Expression() #void : {}
{
SimpleExpression()
( "<" SimpleExpression() #LTNode(2)
| ">" SimpleExpression() #GTNode(2)
| "<=" SimpleExpression() #LENode(2)
| ">=" SimpleExpression() #GENode(2)
| "==" SimpleExpression() #EQNode(2)
| "!=" SimpleExpression() #NENode(2)
)?
}
This should not produce any conflicts, AFAIK.
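If the IF condition must contain exactly one comparison, as in the ConditionalExpression attempt from the question, the same shape can be used without the ?. This is only a sketch under that assumption. Note that a JJTree node descriptor such as #LTNode(2) is written after the expansion whose nodes it should collect, so here it follows the second SimpleExpression() rather than the operator token:

```
void ConditionalExpression() #void : {}
{
  SimpleExpression()
  ( "<"  SimpleExpression() #LTNode(2)
  | ">"  SimpleExpression() #GTNode(2)
  | "<=" SimpleExpression() #LENode(2)
  | ">=" SimpleExpression() #GENode(2)
  | "==" SimpleExpression() #EQNode(2)
  | "!=" SimpleExpression() #NENode(2)
  )
}
```

IfStatement() would then call ConditionalExpression() between "IF" and "THEN", while Expression() stays available for arguments and parenthesized subexpressions.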
Upvotes: 2