Antlr MissingTokenException with simple grammar

Question

I have what I thought would be the world's simplest grammar, parsing file system paths of the form dir1/dir2/filename (without a leading /). I've cut out some detail to get a small sample that exhibits the problem.

compilationUnit : relativePath;
identifier: IdentifierStart IdentifierPart*;    
relativePath : identifier (SLASH identifier)*;

SLASH   :   '/';
fragment IdentifierPart : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
fragment IdentifierStart :  'a'..'z' | 'A'..'Z' | '_';

If I feed it something like foo/aa/bb, I get a MissingTokenException. It recognizes an identifier, then reaches the SLASH, and the MissingTokenException is attached to the identifier. I must be missing something fundamental, but what?

Bart Kiers · Accepted Answer

When you put the keyword fragment in front of a lexer rule, you cannot use that rule in parser rules. A fragment can only be used inside other lexer rules. Such rules never become tokens on their own; they can only be used as part of other tokens (other lexer rules).

In other words: remove these fragment keywords from your grammar:

// parser rules
compilationUnit : relativePath;
relativePath    : identifier (SLASH identifier)*;
identifier      : IdentifierStart (IdentifierStart | Digit)*;    

// lexer rules
SLASH           : '/';
IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';   
Digit           : '0'..'9';

However, a relative path could also be made into a single token, in which case you can leave the fragment keywords but must make some parser rules into lexer rules, like this:

// parser rule
compilationUnit : RelativePath;

// lexer rules
RelativePath    : Identifier ('/' Identifier)*;

fragment Identifier      : IdentifierStart IdentifierPart*;
fragment IdentifierPart  : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
fragment IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';

But then no Identifier token will ever be created for the parser, since a RelativePath also matches a single Identifier. Therefore, Identifier should be a fragment as well. So perhaps that is not what you want.
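
A possible middle ground, offered as a sketch rather than as part of the answer itself: keep relativePath as a parser rule, make Identifier a real token, and reserve fragment for the character-class helpers only. The grammar name Path is hypothetical, and the syntax assumes ANTLR 3 (the version that throws MissingTokenException).

```antlr
grammar Path; // hypothetical name; ANTLR 3 expects the file to be named Path.g

// parser rules
compilationUnit : relativePath;
relativePath    : Identifier (SLASH Identifier)*;

// lexer rules: these become tokens the parser can reference
SLASH      : '/';
Identifier : IdentifierStart IdentifierPart*;

// helpers: usable only inside other lexer rules, never tokens on their own
fragment IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';
fragment IdentifierPart  : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
```

With this layout, foo/aa/bb should lex as Identifier SLASH Identifier SLASH Identifier, so the parser still sees each path component as its own token.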
