How to parse repeated attributes with ANTLR?

Question

I have the following grammar.

meta : '<' TAG attribute* '>';

attribute : NAME '=' VAL;

TAG : [A-Z0-9]+;

NAME : [A-Z_-]+;

VAL : '"'.*?'"';

I want to match the below string.

But I am getting the following error.

ParseError extraneous input 'CONTENT' expecting {'>', NAME}  clj-antlr.common/parse-error (common.clj:146)

I am able to parse with one attribute.

How can I parse repeated attributes? Using attribute* has no effect.

Update: It's actually caused by the lexer. If I combine TAG and NAME then it works.

meta : '<' NAME attribute* '>';
NAME : [A-Z0-9_-]+;

But I don't want NAME to contain numbers. Is there a way to make this work?
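The lexer behaviour described in the update can be reproduced outside ANTLR. The sketch below is a simplified Python model, not ANTLR itself: at each position it tries every rule, keeps the longest match, and breaks ties in favour of the rule listed first, which is how ANTLR lexers resolve overlapping rules. The sample input, the WS rule, and the token names LT, GT and EQ are assumptions made for illustration, since the question does not show the original string or any whitespace rule.

```python
import re

# Simplified model of ANTLR's lexer disambiguation (not ANTLR itself):
# at each position the longest match wins; ties go to the rule listed first.
ORIGINAL_RULES = [
    ("LT", r"<"), ("GT", r">"), ("EQ", r"="),
    ("TAG", r"[A-Z0-9]+"),
    ("NAME", r"[A-Z_-]+"),
    ("VAL", r'"[^"]*"'),   # same effect as the non-greedy '"'.*?'"'
    ("WS", r"\s+"),        # assumed: whitespace is skipped
]

# The combined rule from the update: one NAME rule covers both cases.
MERGED_RULES = [
    ("LT", r"<"), ("GT", r">"), ("EQ", r"="),
    ("NAME", r"[A-Z0-9_-]+"),
    ("VAL", r'"[^"]*"'),
    ("WS", r"\s+"),
]

def tokenize(text, rules):
    """Return (rule name, text) pairs, dropping WS tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        best_len, best_name = 0, None
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict ">" keeps the earlier rule when two matches tie.
            if m and m.end() - pos > best_len:
                best_len, best_name = m.end() - pos, name
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical input, chosen to be consistent with the reported error.
SAMPLE = '<META HTTP-EQUIV="REFRESH" CONTENT="5">'

if __name__ == "__main__":
    for label, rules in (("original", ORIGINAL_RULES), ("merged", MERGED_RULES)):
        print(label, [name for name, _ in tokenize(SAMPLE, rules)])
```

In this model, with the original rule order, CONTENT comes out as TAG because TAG and NAME match the same length and TAG is listed first, which would explain why the parser reports it as extraneous. HTTP-EQUIV comes out as NAME because only NAME can consume the hyphen, so its match is longer. With the merged rule, every word is a NAME and the ambiguity disappears.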

Raven · Accepted Answer

You can define two independent lexer rules and then combine them in parser rules, one for tags and one for names:

ID: [A-Za-z]+ ;
NUMBER: [0-9]+ ;

tag: ID+ tag? | NUMBER+ tag? ;
name: ID+ name? | ('_' | '-')+ name? ;

If ignored whitespace between the elements causes problems, you can send it to a different channel and enable that channel only in the parser rules above. It might even work to define those parser rules as lexer rules, but I'm not sure about that.
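The idea behind the rules above can be sketched in Python as a rough model, not as ANTLR code: the lexer level produces small, non-overlapping pieces, and the decision about what counts as a tag or a name moves up to the parser level. The function names pieces, is_tag and is_name, and the PUNCT group standing in for '_' and '-', are hypothetical and exist only for this illustration.

```python
import re

# Lexer level: small, non-overlapping pieces (mirrors ID, NUMBER, '_' and '-').
PIECE = re.compile(r"(?P<ID>[A-Za-z]+)|(?P<NUMBER>[0-9]+)|(?P<PUNCT>[_-])")

def pieces(word):
    """Split a word into (kind, text) pieces; reject any other character."""
    result, pos = [], 0
    while pos < len(word):
        m = PIECE.match(word, pos)
        if m is None:
            raise ValueError(f"unexpected character {word[pos]!r}")
        result.append((m.lastgroup, m.group()))
        pos = m.end()
    return result

# Parser level: each rule states which kinds of pieces it accepts.
def is_tag(word):
    # tag: any non-empty sequence of ID and NUMBER pieces
    return bool(word) and all(kind in ("ID", "NUMBER") for kind, _ in pieces(word))

def is_name(word):
    # name: any non-empty sequence of ID pieces and '_' / '-'
    return bool(word) and all(kind in ("ID", "PUNCT") for kind, _ in pieces(word))

if __name__ == "__main__":
    for word in ("META", "H1", "HTTP-EQUIV"):
        print(word, "tag:", is_tag(word), "name:", is_name(word))
```

Because no piece can match more than one lexer rule, there is no tie for the lexer to break. A word made only of letters is accepted by both checks, while digits are accepted only in tags and '_' or '-' only in names.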
