Parsing multiline strings with ANTLR

I have a grammar that looks like this:

a: b c d ;
b: x STRING y ;

where

STRING: '"' (~('"' | '\\' | '\r' | '\n') | '\\' ('"' | '\\'))* '"';

My file contains one 'a' production per line, so I'm currently dropping all newlines. However, I would like to parse multiline strings. How can I do that? It doesn't work if I just allow '\r' and '\n' inside the string.

Upvotes: 0

Answers (1)

GRosenberg

IIUC, you are just looking for a multi-line string lexer rule. The fact that you are dropping newlines really does not affect the construction of the string rule. The newlines that match within the string rule will be consumed there before the lexer ever considers the whitespace rule.

STRING  : DQUOTE ( STR_TEXT | EOL )* DQUOTE ;
WS      : [ \t\r\n] -> skip;

fragment STR_TEXT: ( ~["\r\n\\] | ESC_SEQ )+ ;
fragment ESC_SEQ : '\\' ( [btf"\\] | EOF ) ;
fragment DQUOTE  : '"' ;
fragment EOL     : '\r'? '\n' ;
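
To see why the skipped whitespace does not interfere, here is a rough regex analogue of the rules above in Python. It is only a sketch, not ANTLR: the `WORD` catch-all token and the `tokenize` helper are made up for illustration, and the `EOF` alternative of ESC_SEQ is left out. The string pattern consumes newlines between the quotes itself, so the whitespace pattern only ever sees newlines outside of strings.

```python
import re

# Regex approximations of the lexer rules (assumption: not exact ANTLR semantics).
# STRING: quote, then plain text, escape sequences, or line breaks, then quote.
STRING = r'"(?:[^"\\\r\n]|\\[btf"\\]|\r?\n)*"'
WS = r'[ \t\r\n]+'
WORD = r'[^\s"]+'  # hypothetical stand-in for every other token

TOKEN_RE = re.compile(f'(?P<STRING>{STRING})|(?P<WS>{WS})|(?P<WORD>{WORD})')

def tokenize(text):
    """Return (kind, lexeme) pairs, dropping whitespace like `-> skip`."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

if __name__ == "__main__":
    sample = 'x "line one\nline two" y\nx "b" y'
    for kind, lexeme in tokenize(sample):
        print(kind, repr(lexeme))
```

The newline inside the first string stays part of the STRING lexeme, while the newline between the two productions is skipped as whitespace.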

Upvotes: 2
