ANTLR Token Priority?

Question

I'm trying to parse a javadoc-style syntax in the following format:

/**
 * this is description text
 * this is description text also
 * @name ID
 * @param one
 */

Here's my grammar:

query_comment       :   BEGIN_QDOC (description_text | NOMANSLAND)*
            name_declaration 
            (param_declaration | INNER_WS | NOMANSLAND)* 
            END_QDOC ;

name_declaration    :   NAME_KEY INNER_WS ID;
param_declaration   :   PARAM_KEY INNER_WS ID;
description_text    :   ~('\n')+;


BEGIN_QDOC          :   '/**';

END_QDOC            :   ('*/' | NASTY_GARBAGE '*/');


/*
 * Stupid keywords.
 */
NAME_KEY            :   '@name';
PARAM_KEY           :   '@param'; 

/*
 * Defines what constitutes a valid identifier.
 */
ID          :   ('a'..'z' | 'A'..'Z' | '0'..'9' | '-' | '_' | '?')+ ;

/*
 * White space and garbage definitions.
 */
 NOMANSLAND         :    NASTY_GARBAGE '*';

fragment NASTY_GARBAGE  :   '\r'? '\n' (INNER_WS)?;

INNER_WS            :   (' ' |'\t')+;

What I don't understand is why the description text isn't parsing properly. It appears to be broken up into ID and INNER_WS tokens, which doesn't make sense to me, since ~('\n') ought to have priority and be applied first. Instead, 'this', 'is', 'description', and 'text' are matched as ID tokens, which means the description can't contain punctuation.
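
To make the symptom concrete, here is a rough Python sketch (not ANTLR itself) that reproduces the token stream I seem to be getting for one description line. It rests on an assumption: that the lexer applies only the token rules (ID and INNER_WS here) and never consults the parser rule description_text. The names TOKEN_RULES and tokenize are made up for this illustration, and the longest-match loop only approximates what a generated lexer does.

```python
import re

# Approximations of two token rules from the grammar above.
# Assumption: the parser rule description_text plays no part in lexing,
# so it is deliberately absent from this table.
TOKEN_RULES = [
    ("ID", re.compile(r"[A-Za-z0-9_?-]+")),
    ("INNER_WS", re.compile(r"[ \t]+")),
]


def tokenize(text):
    """Split text into (rule_name, lexeme) pairs.

    At each position the longest match wins; on a tie the rule listed
    first is kept.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no token rule matches {text[pos]!r} at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("this is description text"):
        print(token)
```

Under that assumption, every word comes out as an ID and every gap as an INNER_WS, and a line such as "text, with punctuation" fails at the comma because neither rule matches it.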
