Omkar Patil

Reputation: 23

ANTLR grammar discrepancy

grammar g;
// %doc(file = "idocfile.yml")

prog: (decl | expr)+
    ;
decl: VAR '=' '"' STR '"'
    |'%doc' '(' VAR '=' '"' STR '"' ')'
    |'%quiz' '(' VAR '=' '"' STR '"' ')' .*? '%quiz'
    ;
expr:expr '*' expr
    |expr '+' expr
    |expr '-' expr
    |expr '=' expr
    |DOC
    |STR
    ;

// tokens
VAR: [a-zA-Z0-9_]+ ;
STR: [a-zA-Z.]*;
WS: [ \t\n]+ -> skip;

Testing example:

%doc(file = "idocfile.yml")
%quiz(title= "assembler quiz")
test
%quiz

When I test this example with the above grammar, ANTLR reports these errors:

line 2:14 mismatched input 'assembler' expecting STR
line 2:24 mismatched input 'quiz' expecting '='
line 2:28 missing '=' at '"'
line 2:29 mismatched input ')' expecting STR
line 2:36 mismatched input '%quiz' expecting '='
line 4:0 mismatched input '<EOF>' expecting '('

Upvotes: 1

Views: 55

Answers (1)

Bart Kiers

Reputation: 170308

As already mentioned in the comments: a STR should include the quotes in the lexer rule, not in a parser rule. As you have it now:

VAR: [a-zA-Z0-9_]+ ;
STR: [a-zA-Z.]*;

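the input abc will always become a VAR, even when the parser is trying to match '"' STR '"'. Both rules match the same three characters, and on a tie the lexer picks the rule that is defined first. The lexer does not "listen" to what the parser needs.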
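
To make that concrete, here is a minimal Python sketch (not ANTLR itself) of how the lexer chooses a rule for the original grammar. It assumes the usual ANTLR behaviour: the longest match wins, a tie goes to the rule defined first, and literals written in parser rules take precedence over the named lexer rules. The token names T_QUIZ, T_DOC, LPAREN, RPAREN, EQ and QUOTE are made up for illustration.

```python
import re

# Rules in priority order. Assumption: the quoted literals from the parser
# rules come first, followed by VAR, STR and WS as in the question's grammar.
RULES = [
    ("T_QUIZ", re.compile(r"%quiz")),
    ("T_DOC", re.compile(r"%doc")),
    ("LPAREN", re.compile(r"\(")),
    ("RPAREN", re.compile(r"\)")),
    ("EQ", re.compile(r"=")),
    ("QUOTE", re.compile(r'"')),
    ("VAR", re.compile(r"[a-zA-Z0-9_]+")),
    ("STR", re.compile(r"[a-zA-Z.]*")),
    ("WS", re.compile(r"[ \t\n]+")),
]


def tokenize(text):
    """Longest match wins; on equal length, the rule listed first wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie,
            # and ignores the empty match that STR allows.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # WS is skipped, as in the grammar
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    # 'assembler' and 'quiz' tie between VAR and STR, so VAR wins.
    print(tokenize('title= "assembler quiz"'))
    # 'idocfile.yml' contains a dot, so STR matches more characters than VAR.
    print(tokenize('"idocfile.yml"'))
```

Under these assumptions, "idocfile.yml" happens to come out as a STR because the dot makes STR the longer match, which would explain why only line 2 of the test input produces errors.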

Do something like this instead:

STR: '"' ~["\r\n]* '"';

(assuming your strings cannot contain line breaks)

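And if escaped quotes are to be supported, do (excluding the backslash from the first alternative, so that a backslash can only start an escape):

STR: '"' ( ~["\\\r\n] | '\\' [\\"] )* '"';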

And remove the quotes from your parser rule:

decl: VAR '=' STR
    |'%doc' '(' VAR '=' STR ')'
    |'%quiz' '(' VAR '=' STR ')' .*? '%quiz'
    ;

Also be aware that in a parser rule the . in .*? does not match a single character (as it does in a lexer rule) but a single token, so .*? means "zero or more tokens", not "zero or more characters".
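
As a rough illustration of that last point, the following Python sketch tokenizes the test input with the corrected STR rule and then picks out what the wildcard in the '%quiz' alternative would consume. It is a simplification, not ANTLR: the group names (T_QUIZ, T_DOC, LPAREN, RPAREN, EQ) and the helper quiz_body are hypothetical, and the strings are assumed to contain no escapes.

```python
import re

# Assumed token set for the corrected grammar. These alternatives cannot
# start on the same character, so first-match order is enough here.
TOKEN_RE = re.compile(r'''
    (?P<T_QUIZ>%quiz) | (?P<T_DOC>%doc) |
    (?P<LPAREN>\() | (?P<RPAREN>\)) | (?P<EQ>=) |
    (?P<STR>"[^"\r\n]*") |
    (?P<VAR>[a-zA-Z0-9_]+) |
    (?P<WS>[ \t\n]+)
''', re.VERBOSE)


def tokenize(text):
    """Split text into (kind, text) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def quiz_body(tokens):
    """Tokens between the ')' of the opening %quiz(...) and the next %quiz."""
    kinds = [kind for kind, _ in tokens]
    opening = kinds.index("T_QUIZ")
    close_paren = kinds.index("RPAREN", opening)
    closing = kinds.index("T_QUIZ", close_paren)
    return tokens[close_paren + 1:closing]


if __name__ == "__main__":
    source = (
        '%doc(file = "idocfile.yml")\n'
        '%quiz(title= "assembler quiz")\n'
        'test\n'
        '%quiz'
    )
    tokens = tokenize(source)
    print(tokens)
    # The wildcard sees whole tokens, here a single VAR, not characters.
    print(quiz_body(tokens))
```

With the quotes inside STR, "assembler quiz" becomes a single token, and the body of the quiz is whatever tokens the lexer produced between the closing parenthesis and the final '%quiz'.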

Upvotes: 2
