Reputation: 289
I use the following Java code to instantiate a parser generated with ANTLR:
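package foo;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class Test1 {
    public static void main(String[] args) throws RecognitionException {
        // Note the trailing space in the input.
        CharStream stream = new ANTLRStringStream("foo ");
        BugLexer lexer = new BugLexer(stream);
        CommonTokenStream tokenStream = new CommonTokenStream(lexer);
        BugParser parser = new BugParser(tokenStream);
        parser.specification();
    }
}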
My grammar:
grammar Bug;

options {
    language = Java;
}

@header {
    package foo;
}

@lexer::header {
    package foo;
}

specification
    : 'foo' EOF
    ;

WS
    : (' ' | '\t' | '\n' | '\r')+ {$channel = HIDDEN;}
    ;

SCOLON
    : (~ ';')+
    ;
And the error I get:
line 1:0 mismatched input 'foo ' expecting 'foo'
I would expect the space in the input to be ignored, but it's not. The ANTLR interpreter in Eclipse says it's fine, so I suppose my Java code is wrong somehow, but I just don't see it...
Note: If I remove the rule for SCOLON, then there is no error for this input.
Upvotes: 1
Views: 301
Reputation: 170138
ANTLR's lexer tries to match as much as possible for each token. Therefore "foo " is being tokenized as a single SCOLON token, and not as a 'foo' token followed by a WS token.
Note that your SCOLON rule:
SCOLON
: (~ ';')+
;
suggests by its name that it matches just a single semicolon, but in fact it matches one or more characters other than a semicolon. Perhaps it should have been this instead?
SCOLON
: ';'
;
Heinrich Ody wrote:
I somehow thought there is a priority (given by order of declaration) on which token ANTLR attempts to match the input. Thanks for your response.
That is correct: whenever two (or more) rules match the same number of characters, the rule defined first will "win". But if a rule defined later matches more characters, it "wins" regardless of the order of declaration.
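The two rules described above (longest match first, declaration order only as a tie-breaker) can be illustrated with a small stand-alone simulation. This is only a sketch: it does not use ANTLR at all, the class `LongestMatchDemo` and its methods `tokenize` and `rules` are hypothetical names made up for this illustration, and the grammar's lexer rules are approximated with `java.util.regex` patterns. It assumes the literal 'foo' behaves like a lexer rule declared before WS and SCOLON.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not ANTLR code: imitates "longest match wins,
// first declared rule wins on a tie" with plain regular expressions.
public class LongestMatchDemo {

    // Splits the input into tokens, always taking the rule that matches the
    // most characters at the current position.
    public static List<String> tokenize(Map<String, Pattern> rules, String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            // LinkedHashMap iterates in insertion (declaration) order.
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly greater: on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = rule.getKey();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            tokens.add(bestName + "(\"" + input.substring(pos, pos + bestLen) + "\")");
            pos += bestLen;
        }
        return tokens;
    }

    // Regex approximations of the rules in the Bug grammar, in declaration order.
    public static Map<String, Pattern> rules(boolean withScolon) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("'foo'", Pattern.compile("foo"));
        rules.put("WS", Pattern.compile("[ \\t\\n\\r]+"));
        if (withScolon) {
            rules.put("SCOLON", Pattern.compile("[^;]+")); // (~ ';')+
        }
        return rules;
    }

    public static void main(String[] args) {
        // SCOLON matches 4 characters, 'foo' only 3: SCOLON wins.
        System.out.println(tokenize(rules(true), "foo "));
        // Without SCOLON the input splits into 'foo' and WS.
        System.out.println(tokenize(rules(false), "foo "));
        // Tie at 3 characters: the rule declared first ('foo') wins.
        System.out.println(tokenize(rules(true), "foo"));
    }
}
```

Running `main` prints `[SCOLON("foo ")]`, then `['foo'("foo"), WS(" ")]`, then `['foo'("foo")]`. The first line corresponds to the "mismatched input 'foo '" error in the question, and the second to the observation that the error disappears once SCOLON is removed.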
Upvotes: 2