Wanderer

Reputation: 317

How to get an ANTLR grammar to recognize strings with white space?

I am trying to write a grammar in ANTLR4, but I am not able to selectively ignore whitespace in my rules. My grammar is below. A space (a WHITESPACE token) should be allowed when the input matches alphaNumericWSC, but everywhere else I want WHITESPACE to be skipped.

WHITESPACE  :   [ \t\n\r]+ -> skip ;
alphaNumericWSC : AlphaNumeric (',' AlphaNumeric)* 
                | AlphaNumeric ('  ' AlphaNumeric)*
                ;

In other words, whitespace should be kept only inside the alphaNumericWSC rule and ignored everywhere else.

Thanks in advance for the help.

Upvotes: 3

Views: 1995

Answers (1)

GRosenberg

Reputation: 5991

The given lexer whitespace rule discards all whitespace in the lexer, so none of it ever reaches the parser. If whitespace is significant to the parser, the lexer must not skip it.

ANTLR provides lexer modes that can be used to switch between whitespace sensitive and insensitive source regions. Modes do require identifying some unambiguous source features that can be used to switch between modes.

So the question is exactly when AlphaNumeric (' ' AlphaNumeric)* is valid. If specific markers delimit that region, for example a leading := and a trailing ;, define a mode that is entered and left on those markers:

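// Parser rule
alphaNumericWSC : AlphaNumeric (Comma AlphaNumeric)*
                | AlphaNumeric (WS AlphaNumeric)*
                ;

// Lexer rules, default mode: whitespace is skipped.
// Note: ANTLR 4 allows mode sections only in a lexer grammar, so these
// rules belong in a separate lexer grammar, not in a combined grammar.
AlphaNumeric  : AlphaNum ;
Mark          : ':=' -> pushMode(WSS) ;
Semi          : ';' ;
Comma         : ',' ;
WHITESPACE    : [ \t\n\r]+ -> skip ;

// Whitespace-sensitive mode, entered on ':=' and left on ';'.
mode WSS;
WS            : ' '+ ;                               // spaces become tokens here
AlphaNumeric2 : AlphaNum -> type(AlphaNumeric) ;
Semi2         : ';'      -> type(Semi), popMode ;    // popMode takes no parentheses
WHITESPACE2   : [\t\n\r]+ -> skip ;

fragment AlphaNum : .... ;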
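
One practical wrinkle: ANTLR 4 accepts mode sections only in a lexer grammar, so the rules above have to live in a separate lexer grammar that the parser grammar imports through the tokenVocab option. A minimal sketch of that split follows. The grammar names WscLexer and WscParser, the assignment rule, the Comma2 rule, and the body of AlphaNum are assumptions made for illustration; the AlphaNum definition elided above may well differ.

```antlr
// WscLexer.g4 (hypothetical file name)
lexer grammar WscLexer;

// Default mode: whitespace is skipped.
AlphaNumeric  : AlphaNum ;
Mark          : ':=' -> pushMode(WSS) ;
Semi          : ';' ;
Comma         : ',' ;
WHITESPACE    : [ \t\n\r]+ -> skip ;

// Placeholder body, assumed for this sketch only.
fragment AlphaNum : [a-zA-Z0-9]+ ;

// Whitespace-sensitive mode, entered on ':=' and left on ';'.
mode WSS;
WS            : ' '+ ;
AlphaNumeric2 : AlphaNum -> type(AlphaNumeric) ;
Comma2        : ','      -> type(Comma) ;          // assumed: commas allowed inside the region
Semi2         : ';'      -> type(Semi), popMode ;  // popMode is written without parentheses
WHITESPACE2   : [\t\n\r]+ -> skip ;
```

```antlr
// WscParser.g4 (hypothetical file name)
parser grammar WscParser;
options { tokenVocab = WscLexer; }   // reuse the token types defined by the lexer grammar

// Hypothetical enclosing rule: shows where the markers sit around the list.
assignment      : AlphaNumeric Mark WS? alphaNumericWSC WS? Semi ;

alphaNumericWSC : AlphaNumeric (Comma AlphaNumeric)*
                | AlphaNumeric (WS AlphaNumeric)*
                ;
```

With this split, an input such as x := a b c; should reach the parser as AlphaNumeric Mark WS AlphaNumeric WS AlphaNumeric WS AlphaNumeric Semi. The space directly after := is already lexed in WSS mode, which is why the assignment rule tolerates an optional WS on either side of the list.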

Upvotes: 2
