Reputation: 3148
Fellow ANTLR experts, could you please explain why this warning appears in ANTLRWorks? How should I read this message, and how do I get rid of it in this particular case?
Example of valid input: abc "xyz def". Here abc should be recognized as a keywordExpr and "xyz def" as a phraseExpr.
[14:32:24] warning(200): TestExpr.g:12:4: Decision can match input such as "CHAR" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
[14:32:24] warning(200): /Users/imochurad/Development/antlr3/Grammars/TestExpr.g:12:4: Decision can match input such as "CHAR" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
grammar TestExpr;
options {
output=AST;
ASTLabelType=CommonTree;
}
expr
: kpExpr*;
kpExpr : keywordExpr|phraseExpr;
keywordExpr
: CHAR+;
phraseExpr
: '"' CHAR+ (' ' CHAR+)* '"';
CHAR : ('A'..'Z') | ('a'..'z');
INT : '0'..'9'+;
NEWLINE : '\r'? '\n';
WS : (' '|'\t'|'\n'|'\r')+ {skip();};
Thanks a lot!
Upvotes: 0
Views: 259
Reputation: 3598
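As written, your grammar is ambiguous when parsing unquoted strings. abc could be parsed as one keywordExpr (abc), as three (a, b and c), or even as two (for example ab and c). That is the conflict the warning reports: on seeing another CHAR, the parser has more than one way to continue, so ANTLR keeps alternative 1 and disables alternative 2. I think you expect keywords to be separated by whitespace. However, since you are skipping whitespace in the lexer, the parser cannot tell the difference between abc and a b c.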
I suspect that keywordExpr and phraseExpr should be lexer rules:
KeywordExpr: CHAR+;
PhraseExpr: '"' CHAR+ (' ' CHAR+)* '"';
CHAR should probably also become a fragment, to avoid accidentally generating a CHAR token when you have a single-letter keyword.
With that change, abc is unambiguous in the lexer, since the lexer will use the longest possible match.
Regarding having spaces be treated differently, this works best if it is done in the lexer. The above rule for PhraseExpr will correctly handle a space inside the quotes: the lexer is already in the middle of matching a PhraseExpr when it encounters the space, so the space is consumed by that rule and never reaches the WS rule. Handling it in the parser is much more complicated.
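Putting these suggestions together, the grammar might look roughly like the sketch below. It is untested and assumes ANTLR 3 syntax as in the question. INT, NEWLINE and WS are kept exactly as posted; the only changes are the ones described above (the two rules moved into the lexer under capitalized names, and CHAR marked as a fragment).

```antlr
grammar TestExpr;

options {
    output=AST;
    ASTLabelType=CommonTree;
}

// Parser rules: each keyword or phrase now arrives as a single token.
expr
    : kpExpr*;

kpExpr
    : KeywordExpr
    | PhraseExpr;

// Lexer rules: the lexer takes the longest match, so "abc" is one KeywordExpr.
KeywordExpr
    : CHAR+;

// Spaces between the quotes are consumed here, not by WS.
PhraseExpr
    : '"' CHAR+ (' ' CHAR+)* '"';

// fragment: CHAR only helps build other tokens and is never emitted itself.
fragment CHAR
    : ('A'..'Z') | ('a'..'z');

// Carried over unchanged from the question.
INT : '0'..'9'+;
NEWLINE : '\r'? '\n';
WS : (' '|'\t'|'\n'|'\r')+ {skip();};
```

With this, the input abc "xyz def" should lex as one KeywordExpr followed by one PhraseExpr, and the parser no longer has to choose between alternatives on a per-letter basis.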
Upvotes: 1