Reputation: 33
I know this has been discussed a thousand times, but I still cannot figure out why the following grammar is failing. In the interpreter everything works fine, without any errors or warnings. However, when running the generated code, I'm getting mismatched input errors as shown below.
For this grammar:
grammar xxx;
options {
language = Java;
output = AST;
}
@members {
@Override
public String getErrorMessage(RecognitionException e,
String[] tokenNames)
{
List stack = getRuleInvocationStack(e, this.getClass().getName());
String msg = null;
if ( e instanceof NoViableAltException ) {
NoViableAltException nvae = (NoViableAltException)e;
msg = " no viable alt; token="+e.token+
" (decision="+nvae.decisionNumber+
" state "+nvae.stateNumber+")"+
" decision=<<"+nvae.grammarDecisionDescription+">>";
}
else {
msg = super.getErrorMessage(e, tokenNames);
}
return stack+" "+msg;
}
@Override
public String getTokenErrorDisplay(Token t) {
return t.toString();
}
}
obj
: first=subscription
(COMMA other=subscription)*
;
subscription
: ID
(EQUALS arguments_in_brackets)?
filters
;
arguments_in_brackets
: LOPAREN arguments ROPAREN
;
arguments
: (arguments_body)
;
arguments_body
: argument (arguments_more)?
;
arguments_more
: SEMICOLON arguments_body
;
argument
: id_equals argument_body
;
argument_body
: STRING
| INT
| FLOAT
;
filters
: LSPAREN expression RSPAREN
;
expression
: or
;
or
: first=and
(OR^ second=and)*
;
and : first=atom
(AND^ second=atom)*
;
atom
: filter
| atom_expression
;
atom_expression
: LCPAREN
expression
RCPAREN
;
filter
: id_equals arguments_in_brackets
;
id_equals
: WS* ID WS* EQUALS WS*
;
COMMA: WS* ',' WS*;
LCPAREN : WS* '(' WS*;
RCPAREN : WS* ')' WS*;
LSPAREN : WS* '[' WS*;
RSPAREN : WS* ']' WS*;
LOPAREN : WS* '{' WS*;
ROPAREN : WS* '}' WS*;
AND: WS* 'AND' WS*;
OR: WS* 'OR' WS*;
NOT: WS* 'NOT' WS*;
EQUALS: WS* '=' WS*;
SEMICOLON: WS* ';' WS*;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
INT : '0'..'9'+
;
FLOAT
: ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
| '.' ('0'..'9')+ EXPONENT?
| ('0'..'9')+ EXPONENT
;
// : '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
// : '"' (~'"')* '"'
STRING
: '"' (~'"')* '"'
;
fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment
ESC_SEQ
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UNICODE_ESC
| OCTAL_ESC
;
fragment
OCTAL_ESC
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UNICODE_ESC
: '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
;
NEWLINE: '\r'? '\n' {skip();} ;
WS: (' '|'\t')+ {skip();} ;
And for this input:
status={name="Waiting";val=5}[ownerEmail1={email="[email protected]"} OR internalStatus={status="New"}],comments={type="fds"}[(internalStatus={status="Owned"} AND ownerEmail2={email="[email protected]"}) OR (role={type="Contributor"} AND status={status="Closed"})]
I'm getting:
line 1:67 [obj, subscription, filters, expression, or, and, atom, filter, arguments_in_brackets] mismatched input [@18,67:80='internalStatus',<11>,1:67] expecting ROPAREN
line 1:157 [obj, subscription, filters, expression, or, and, atom, atom_expression, expression, or, and, atom, filter, arguments_in_brackets] mismatched input [@42,157:167='ownerEmail2',<11>,1:157] expecting ROPAREN
Can someone give me any clues as to why this is failing, please? I've tried rewriting it in many ways, but the error is still the same.
Upvotes: 1
Views: 345
Reputation: 170308
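The problem is that you're using the WS rule inside other lexer rules. Because WS calls skip(), any token whose rule actually matches whitespace through WS is skipped as well: the lexer discards that token entirely, so it can never be used in parser rules. That is why the errors appear only where a token is surrounded by spaces, such as the } before OR in your input.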
So, if you have a rule like:
WS : ' ' {skip();};
and then use this rule in NOT:
NOT : WS* 'NOT' WS*;
it causes the NOT token to be skipped as well.
If you're already skipping these WS chars, you don't need to include them in your other lexer rules: simply remove all WS* in other rules:
...
NOT : 'NOT';
...
(Also remove them from parser rules: skipped tokens from the lexer are never available in parser rules anyway!)
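Applied to the grammar in the question, the change might look roughly like the sketch below. It only covers the rules that referenced WS and assumes the rest of the grammar stays as posted; I have not run it against the full input.

```antlr
// Lexer rules from the question with the WS* references removed.
COMMA     : ',';
LCPAREN   : '(';
RCPAREN   : ')';
LSPAREN   : '[';
RSPAREN   : ']';
LOPAREN   : '{';
ROPAREN   : '}';
AND       : 'AND';
OR        : 'OR';
NOT       : 'NOT';
EQUALS    : '=';
SEMICOLON : ';';

// Whitespace is still discarded, but only as a token of its own.
WS        : (' '|'\t')+ {skip();} ;

// Parser rule from the question without the WS* references.
id_equals
    : ID EQUALS
    ;
```

With this, spaces around OR, AND and the brackets are dropped as separate WS tokens, so the punctuation and keyword tokens themselves should still reach the parser.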
Upvotes: 1