Reputation: 13
I have what I thought would be the world's simplest grammar, parsing file system paths of the form dir1/dir2/filename (without a leading /). I've cut out some detail to get a small sample that exhibits the problem.
compilationUnit : relativePath;
identifier: IdentifierStart IdentifierPart*;
relativePath : identifier (SLASH identifier)*;
SLASH : '/';
fragment IdentifierPart : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
fragment IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';
If I feed it something like foo/aa/bb, I get a MissingTokenException. It recognizes an identifier, then gets the SLASH, and I get a MissingTokenException hanging off the identifier. I must be missing something fundamental, but what?
Upvotes: 1
Views: 665
Reputation: 170257
When you put the keyword fragment in front of a lexer rule, you cannot use this rule in parser rules. A fragment can only be used inside other lexer rules. Such rules never become tokens on their own; they can only be used as part of other tokens (other lexer rules).
In other words: remove these fragment keywords from your grammar:
// parser rules
compilationUnit : relativePath;
relativePath : identifier (SLASH identifier)*;
identifier : IdentifierStart (IdentifierStart | Digit)*;
// lexer rules
SLASH : '/';
IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';
Digit : '0'..'9';
However, a relative path could also be made into a single token, in which case you can leave the fragment keywords but must make some parser rules into lexer rules, like this:
// parser rule
compilationUnit : RelativePath;
// lexer rules
RelativePath : Identifier ('/' Identifier)*;
fragment Identifier : IdentifierStart IdentifierPart*;
fragment IdentifierPart : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
fragment IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';
But then there will never be an Identifier token created for the parser, since a RelativePath also matches a single Identifier. Therefore, Identifier should also be a fragment. So perhaps that is not what you want.
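A possible middle ground, offered as a sketch rather than something tested against your full grammar: keep the two fragment rules, but promote only identifier to a lexer rule (renamed Identifier here, since lexer rule names start with an upper-case letter) and leave relativePath as a parser rule. The fragments are then referenced only from another lexer rule, which is the one place they are allowed.

```antlr
// parser rules
compilationUnit : relativePath;
relativePath    : Identifier (SLASH Identifier)*;

// lexer rules
SLASH      : '/';
// Identifier is a real token; it is built from the two fragments below
Identifier : IdentifierStart IdentifierPart*;
// fragments never reach the parser; they only help define Identifier
fragment IdentifierPart  : 'a'..'z' | 'A'..'Z' | '_' | '0'..'9';
fragment IdentifierStart : 'a'..'z' | 'A'..'Z' | '_';
```

With this layout, an input such as foo/aa/bb should be tokenized as Identifier SLASH Identifier SLASH Identifier, so the parser still sees each path segment separately.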
Upvotes: 2