Reputation: 50989
The grammar below does not behave the way I expect.
The grammar is as follows:
program:
(keyword |
string |
WS)*;
keyword: 'print';
string: QUOTE (CH | WS)*? QUOTE;
QUOTE: '\'';
WS : [ \t\r\n]+;
CH: .;
The goal is to have a language with both string literals and keywords.
The input being parsed is as follows:
print 'printed'
It should be parsed as a keyword, then whitespace, then a string literal.
Instead, the parser sees the keyword print inside the string literal. This is because ANTLR has implicitly created a parasitic lexer rule for 'print'.
How can I avoid or work around this?
I don't want to specify that a string literal can contain keywords, because that is logically incorrect.
I also can't use the DOT meta operator, because I don't want to allow every token inside the quotes (a quote must not occur there).
So, what should I do?
Upvotes: 1
Views: 152
Reputation: 99859
If you split your combined grammar into a separate lexer grammar and parser grammar, ANTLR will not allow you to implicitly define lexer rules via literals placed in a parser rule. If you want print to be a keyword, you need to include this lexer rule (otherwise 'print' would not be allowed in a parser rule):
PRINT : 'print';
The next step is to convert string from a parser rule to a lexer rule, such as this:
STRING : QUOTE ~'\''* QUOTE;
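Putting both steps together, the split grammar might look roughly like the sketch below. The names DemoLexer and DemoParser are hypothetical, the two grammars have to live in separate .g4 files, and marking QUOTE as a fragment is my own assumption (it stops a lone quote from being emitted as its own token; the rule also works without it).

```antlr
// ---- file: DemoLexer.g4 (hypothetical name) ----
lexer grammar DemoLexer;

// Keywords must be declared explicitly in a lexer grammar.
PRINT  : 'print';

// The whole literal is a single token: quote, any non-quote characters, quote.
STRING : QUOTE ~'\''* QUOTE;

WS     : [ \t\r\n]+;

// Catch-all for any other single character, kept from the original grammar.
CH     : .;

// Assumption: as a fragment, QUOTE is only a helper and never a token itself.
fragment QUOTE : '\'';

// ---- file: DemoParser.g4 (hypothetical name) ----
parser grammar DemoParser;

// Pull the token types from the lexer grammar above.
options { tokenVocab = DemoLexer; }

program : (keyword | string | WS)*;
keyword : PRINT;
string  : STRING;
```

Because the lexer prefers the longest match at each position, the input print 'printed' should come out as PRINT, WS, STRING: at the opening quote, STRING matches the whole literal, which is longer than anything PRINT or CH could match there, so the keyword is never seen inside the string.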
Upvotes: 2