I have an ANTLR 4 lexer grammar with a BEGIN lexer rule and an ID lexer rule:
lexer grammar Begin;
BEGIN : 'begin' ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
After generating the lexer and compiling it, I ran the ANTLR TestRig tool with the input 'begin':
grun Begin tokens -tokens
begin
^Z
I got this output:
[@0,0:4='begin',<1>,1:0]
[@1,7:6='<EOF>',<-1>,2:0]
Notice the token type is 1 (as <1> indicates).
I ran it again, this time with the input 'beginning':
grun Begin tokens -tokens
beginning
^Z
I got this output:
[@0,0:8='beginning',<1>,1:0]
[@1,11:10='<EOF>',<-1>,2:0]
Why do I get the same token type? Does that mean the lexer is using the same lexer rule for both inputs?
How do I get TestRig to show me that the lexer uses the rule BEGIN : 'begin' ; to tokenize the input begin, and the rule ID : [a-z]+ ; to tokenize the input beginning?
Upvotes: 0
Views: 1418
I used the following test setup:
grammar Begin;
test: (BEGIN | ID)+;
BEGIN : 'begin' ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
with ANTLRWorks 2.1. It works as expected:
With 'begin':
Arguments: [Begin, test, -tokens, -tree, -gui, C:\ANTLR\Begin.txt]
[@0,0:4='begin',<1>,1:0]
[@1,5:4='<EOF>',<-1>,1:5]
(test begin)
With 'beginning':
Arguments: [Begin, test, -tokens, -tree, -gui, C:\ANTLR\Begin.txt]
[@0,0:8='beginning',<2>,1:0]
[@1,9:8='<EOF>',<-1>,1:9]
(test beginning)
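The two token types differ because of how an ANTLR 4 lexer resolves ambiguity: it picks the rule that matches the most input characters, and when two rules match the same number of characters, the rule defined first in the grammar wins. So 'begin' is a tie between BEGIN and ID and goes to BEGIN (type 1), while 'beginning' is matched further by ID (type 2). Here is a rough Python sketch of that selection logic. It is only an illustration, not how ANTLR is implemented, and the names `RULES`, `SKIP`, and `tokenize` are made up for this example:

```python
import re

# Lexer rules in grammar order; a rule's 1-based position is its token type.
RULES = [
    ("BEGIN", re.compile(r"begin")),
    ("ID", re.compile(r"[a-z]+")),
    ("WS", re.compile(r"[ \t\r\n]+")),
]
SKIP = {"WS"}  # mirrors the `-> skip` action on WS


def tokenize(text):
    """Return (rule_name, token_type, lexeme) tuples for `text`."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (match_length, rule_name, token_type)
        for token_type, (name, pattern) in enumerate(RULES, start=1):
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if m and (best is None or m.end() - pos > best[0]):
                best = (m.end() - pos, name, token_type)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        length, name, token_type = best
        if name not in SKIP:
            tokens.append((name, token_type, text[pos:pos + length]))
        pos += length
    return tokens


print(tokenize("begin"))      # [('BEGIN', 1, 'begin')]
print(tokenize("beginning"))  # [('ID', 2, 'beginning')]
```

TestRig's -tokens output only prints the numeric type in angle brackets. The Begin.tokens file that ANTLR generates next to the lexer should list the name-to-number mapping (BEGIN=1, ID=2, WS=3 for this grammar), which is one way to confirm which rule a given number refers to.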
Upvotes: 1