Korchkidu

Reputation: 4946

Simple grammar not working

I have a simple grammar that is supposed to parse files containing identifiers and keywords between brackets:

grammar Keyword;

// PARSER RULES
//
entry_point :   ('['ID']')*;

// LEXER RULES
//
KEYWORD     :   '[Keyword]';

ID      :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
WS      :   ( ' ' | '\t' | '\r' | '\n' | '\r\n') 
            {
                $channel = HIDDEN;
            };

It works for input:

[Hi]
[Hi]

It returns a NoViableAltException error for input:

[Hi]
[Ki]

If I comment out KEYWORD, then it works fine. It also works if I change my grammar to:

grammar Keyword;

// PARSER RULES
//
entry_point :   ID*;

// LEXER RULES
//
KEYWORD     :   '[Keyword]';

ID      :   '[' ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ']';
WS      :   ( ' ' | '\t' | '\r' | '\n' | '\r\n') 
            {
                $channel = HIDDEN;
            };

Then it works. Could you please help me figure out why?

Best regards.

Upvotes: 1

Views: 69

Answers (1)

Bart Kiers

Reputation: 170158

The 1st grammar fails because whenever the lexer sees "[K", it enters the KEYWORD rule. If it then encounters something other than "eyword]" ("i" in your case), it tries to fall back on some other rule that can match "[K". But there is no other lexer rule that starts with "[K", so it throws an exception. Note that the lexer does not give the "K" back and then try to match again from the "[" alone (the lexer is a dumb machine)!

Your 2nd grammar works because the lexer now has something to fall back on when "[Ki" is not matched by KEYWORD: ID now includes the "[".
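
One possible way to repair the first grammar, offered as an untested sketch: keep the brackets as parser-level literals and reduce KEYWORD to the bare word, so that no lexer rule begins with "[K". ID can match any prefix of "Keyword", so the lexer has a rule to fall back on for input such as "Ki". Referencing KEYWORD in entry_point and the trailing EOF are assumptions about what you want to accept; neither is in your original grammar.

```
grammar Keyword;

// PARSER RULES
//
// Assumption: a keyword is allowed wherever an identifier is.
// Assumption: EOF is added so the parser has to consume the whole input.
entry_point :   ('[' (ID | KEYWORD) ']')* EOF;

// LEXER RULES
//
// No brackets here: '[' and ']' become tokens of their own
// through the literals used in entry_point.
KEYWORD     :   'Keyword';

ID      :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
WS      :   ( ' ' | '\t' | '\r' | '\n' )
            {
                $channel = HIDDEN;
            };
```

One consequence of this layout is that "[ Keyword ]", with spaces inside the brackets, would also be accepted, because WS tokens are put on the hidden channel and the parser never sees them.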

Upvotes: 1
