Reputation: 6240
With the following (subset of a) grammar for a scripting language:
expr
...
| 'regex(' str=expr ',' re=expr ')' #regexExpr
...
an expression like regex('s', 're') parses to the following tree, which makes sense:
regexExpr
  'regex('
  expr: stringLiteral ('s')
  ','
  expr: stringLiteral ('re')
  ')'
I'm now trying to add an optional third argument to my regex function, so I've used this modified rule:
'regex(' str=expr ',' re=expr (',' n=expr )? ')'
This causes regex('s', 're', 1) to be parsed in a way that's unexpected to me:
regexExpr
  'regex('
  expr: listExpression
    expr: stringLiteral ('s')
    ','
    expr: stringLiteral ('re')
  ','
  expr: integerLiteral(1)
  ')'
where listExpression is another alternative of expr (labeled #listExpr), defined below regexExpr:
expr
...
| 'regex(' str=expr ',' re=expr (',' n=expr)? ')' #regexExpr
...
| left=expr ',' right=expr #listExpr
...
I think this listExpr could have been defined better (by defining surrounding tokens), but I've got compatibility concerns with changing it now.
I don't understand the parser rule matching precedence here. Is there a way I can add the optional third arg to regex() without causing the first two args to be parsed as a listExpr?
Upvotes: 1
Views: 32
Reputation: 170158
Try defining the two-argument and three-argument forms as two separate alternatives, both with the same label #regexExpr:
expr
: 'regex' '(' str=expr ',' re=expr ',' n=expr ')' #regexExpr
| 'regex' '(' str=expr ',' re=expr ')' #regexExpr
| left=expr ',' right=expr #listExpr
| ...
;
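To try the suggestion in isolation, here is a minimal sketch of a complete grammar built around those alternatives. The grammar name RegexDemo, the prog entry rule, the #stringLiteral and #integerLiteral alternatives, and the lexer rules are all assumptions made for illustration; they are not taken from the original grammar.

```antlr
// Hypothetical stand-alone grammar for experimenting with the two alternatives.
grammar RegexDemo;

// Assumed entry rule: a single expression followed by end of input.
prog : expr EOF ;

expr
    : 'regex' '(' str=expr ',' re=expr ',' n=expr ')'  #regexExpr      // three-argument form first
    | 'regex' '(' str=expr ',' re=expr ')'             #regexExpr      // two-argument form
    | left=expr ',' right=expr                         #listExpr
    | STRING                                           #stringLiteral  // assumed literal alternatives
    | INT                                              #integerLiteral
    ;

// Assumed lexer rules, just enough to tokenize regex('s', 're', 1).
STRING : '\'' ~['\r\n]* '\'' ;
INT    : [0-9]+ ;
WS     : [ \t\r\n]+ -> skip ;
```

Because both alternatives carry the same label, ANTLR generates a single RegexExprContext for them, and its n field is simply null when the two-argument form matched. Feeding regex('s', 're', 1) to prog with the ANTLR TestRig (for example with its -tree option) is one way to check which alternative is actually chosen.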
Upvotes: 1