darenn

Reputation: 479

Parsing multiline strings with antlr

I have a grammar that looks like this:

a: b c d ;
b: x STRING y ;

where

STRING: '"' (~('"' | '\\' | '\r' | '\n') | '\\' ('"' | '\\'))* '"';

My file contains one 'a' production per line, so I am currently dropping all newlines. However, I would also like to parse multiline strings. How can I do that? Simply allowing '\r' and '\n' inside the string does not work.

Upvotes: 0

Views: 1155

Answers (1)

GRosenberg

Reputation: 6001

If I understand correctly, you are just looking for a multi-line string lexer rule. The fact that you are dropping newlines does not affect how the string rule is constructed: newlines inside a string are matched and consumed by the string rule, so the lexer never applies the whitespace rule to them.

STRING  : DQUOTE ( STR_TEXT | EOL )* DQUOTE ;
WS      : [ \t\r\n] -> skip;

fragment STR_TEXT: ( ~["\r\n\\] | ESC_SEQ )+ ;
fragment ESC_SEQ : '\\' ( [btf"\\] | EOF ) ;
fragment DQUOTE  : '"' ;
fragment EOL     : '\r'? '\n' ;
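
To see how these lexer rules sit next to parser rules, here is a minimal sketch of a complete ANTLR 4 grammar. The grammar name `MultilineDemo`, the `file` and `entry` rules, and the `ID` token are hypothetical stand-ins for the question's `a`, `b`, `x` and `y`, which are not shown in full. The string-related rules follow the ones above.

```antlr
grammar MultilineDemo;

// Hypothetical parser rules: each entry is an identifier, a string,
// and another identifier. Newlines between entries are skipped by WS.
file  : entry* EOF ;
entry : ID STRING ID ;

// A string may contain ordinary text, escape sequences, and line breaks.
STRING : DQUOTE ( STR_TEXT | EOL )* DQUOTE ;

// Hypothetical identifier token, only here so the sketch is complete.
ID     : [a-zA-Z_] [a-zA-Z0-9_]* ;

// Whitespace outside of strings, including newlines, is discarded.
WS     : [ \t\r\n] -> skip ;

fragment STR_TEXT : ( ~["\r\n\\] | ESC_SEQ )+ ;
fragment ESC_SEQ  : '\\' ( [btf"\\] | EOF ) ;
fragment DQUOTE   : '"' ;
fragment EOL      : '\r'? '\n' ;
```

With a grammar along these lines, an input such as `foo "first line`, a line break, then `second line" bar` should lex as three tokens (ID, STRING, ID), with the line break kept inside the STRING token text. If you have the usual `antlr4` and `grun` aliases set up, running the test rig with the `-tokens` option is a quick way to check this.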

Upvotes: 2
