CoffeJunky

Reputation: 1087

ANTLR4 match multiple lines to a stop word, but don't consume it

I want to parse the following text file: one line holds an identifier, and the next line (plus N further lines) holds the data for that identifier.

!ident my.identifier(1)
!data my multi line
string data that can be very long
but does not have a end
!ident my.identifier(2)
!data just one line
!ident my.identifier(3) 

Later I would like to know my.identifier=my multi line\nstring.... (so that I can match each identifier to its value).

The file starts with an identifier, and after that the entries alternate: ident, data, ident, data, ...

I am not sure how to "start", or how to handle the multi-line data.

My approach:

file: identData*;
identData: (ident) (data);
ident: IDENT field;
field: ~NL*;

data: DATA value;
value: //what happens here?


IDENT: '!ident';
DATA: '!data';

NL: '\r' '\n' | '\n' | '\r';

Upvotes: 0

Views: 365

Answers (1)

GRosenberg

Reputation: 5991

If the ! is a proper guard character for the next data or ident statement, then just consume up to that character.

data: DATA value? ;
value: .*? ~[!]   ;

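Basically, this says that value matches any run of input, including none, followed by one more symbol that is not !. Because .*? is non-greedy, it stops as early as the rest of the rule allows instead of taking the longest possible match. Making value optional removes the requirement for data to have a value.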
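Outside ANTLR, the same "match lazily up to a guard, but leave the guard in place" idea can be sketched with a Python regular expression. This is only an analogy and does not reproduce ANTLR's parsing semantics; `VALUE`, `extract_value`, and the sample string are names made up for this illustration, and the sketch assumes `!` never occurs inside the data itself.

```python
import re

# Hypothetical pattern: after "!data", lazily capture everything up to the
# next "!" guard or the end of input. The lookahead (?=...) only checks for
# the guard; it does not consume it.
VALUE = re.compile(r"!data[ \t]*(.*?)(?=!|\Z)", re.DOTALL)

def extract_value(text):
    """Return the text following '!data', or None if there is no '!data'."""
    match = VALUE.search(text)
    return match.group(1) if match else None

sample = "!data just one line\n!ident my.identifier(3)\n"
value = extract_value(sample)

# The guard is still available to whatever reads the input next.
rest = sample[VALUE.search(sample).end():]
```

Here the lazy quantifier plus the lookahead play the role that the non-greedy subrule and the guard play in the grammar: the match ends just before the stop word.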

Update with more details

Complete solution for reading multiple ident / data pairs, in case someone needs it:

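allData: (dataItem)+;
dataItem: ident identData;
ident: IDENT field;
field: ~NL*;            // any tokens other than NL, i.e. the rest of the line

identData: DATA data;
data: .*? ~IDENT;       // non-greedy: any tokens, then one token that is not IDENT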

IDENT: '!ident';
DATA: '!data';

NL: '\r' '\n' | '\n' | '\r';
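To check what a grammar like this should produce, it can help to have a plain reference for the expected identifier / data pairs. The following is a minimal Python sketch that does not use ANTLR at all; `parse_pairs` and `_join` are hypothetical helpers, and it assumes that every `!ident` and `!data` marker sits at the start of a line and that no data line itself begins with one of those markers.

```python
def _join(lines):
    # None means no '!data' line was seen for this identifier.
    return None if lines is None else "\n".join(lines)

def parse_pairs(text):
    """Return a list of (identifier, data) tuples, in file order."""
    pairs = []
    ident = None
    data_lines = None
    for line in text.splitlines():
        if line.startswith("!ident"):
            # A new identifier closes the previous entry, if any.
            if ident is not None:
                pairs.append((ident, _join(data_lines)))
            ident = line[len("!ident"):].strip()
            data_lines = None
        elif line.startswith("!data"):
            data_lines = [line[len("!data"):].strip()]
        elif data_lines is not None:
            # Continuation line of a multi-line data block.
            data_lines.append(line)
    if ident is not None:
        pairs.append((ident, _join(data_lines)))
    return pairs

sample = """!ident my.identifier(1)
!data my multi line
string data that can be very long
but does not have a end
!ident my.identifier(2)
!data just one line
!ident my.identifier(3)
"""
pairs = parse_pairs(sample)
```

With the sample from the question, `pairs[0]` holds `my.identifier(1)` together with its three data lines joined by `\n`, and the last identifier, which has no `!data` line, gets `None`.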

Upvotes: 1
