user2259541

Reputation: 1

regex multiline search pattern

I have struggled to find an answer to this. I'm using C++ with Boost.Regex, but a working expression in any flavor would do, since I can adapt it (though I'll gratefully accept a Boost-specific clue).

I have the following sample text:

----
this is a sample line -> various chars
another sample line (again 'might have different chars]
etc., etc.
----
more data
again anything in here.
more lines of text -> etc
etc. etc.
----
maybe only one line

and the trailing "----" is optional.

I've tried:

^-{4}\s(.*\s)*?(-{4})+

and variations, but I'm only getting the last line in my group 2, whereas I want group 2 to hold all the lines following the four '-' chars, up to (but not including) the next line that starts with four '-' chars.

Upvotes: 0

Views: 218

Answers (1)

Andrew Cheong

Reputation: 30273

A quantified capturing group only retains the text matched by its last iteration. Make that group non-capturing, and wrap the entire quantified expression in a capturing group.

^-{4}\s((?:.*\s)*?)(-{4})+
       ^ ^^       ^
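
To see the difference concretely, here is a minimal sketch using Python's re module as a stand-in for Boost.Regex (an assumption; Boost may treat newlines differently depending on its flags, so this illustrates the idea rather than being a drop-in Boost snippet). The short sample string is made up for illustration, and re.MULTILINE is set so that ^ matches at the start of each line.

```python
import re

# Made-up sample: a delimiter line, two body lines, and a closing delimiter.
text = "----\nline one\nline two\n----\n"

# Original shape: the quantified group (.*\s) is overwritten on every
# repetition, so only the last line survives in the group.
original = re.compile(r"^-{4}\s(.*\s)*?(-{4})+", re.MULTILINE)

# Revised shape: the inner group is non-capturing, and a single capturing
# group wraps the whole repetition.
revised = re.compile(r"^-{4}\s((?:.*\s)*?)(-{4})+", re.MULTILINE)

last_only = original.search(text).group(1)
all_lines = revised.search(text).group(1)

print(repr(last_only))  # 'line two\n'
print(repr(all_lines))  # 'line one\nline two\n'
```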

Also, I'm not sure what the purpose of (-{4})+ is. You may mean this instead:

^-{4}\s((?:.*\s)*?)(?=-{4}|\s*$)
                   ^^^^^^^^^^^^^

The (?= ... ) is a lookahead assertion. It asserts, without consuming any characters, that the current position is immediately followed by either -{4} or the end of the text (after optional whitespace).
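
Because the lookahead does not consume the next "----", that delimiter is still available to start the following match. A minimal sketch of this, again using Python's re module as an assumed stand-in for Boost.Regex (newline handling in Boost depends on its flags), run over the sample text from the question with a trailing newline added:

```python
import re

# The sample from the question; the trailing newline is an addition here.
text = (
    "----\n"
    "this is a sample line -> various chars\n"
    "another sample line (again 'might have different chars]\n"
    "etc., etc.\n"
    "----\n"
    "more data\n"
    "again anything in here.\n"
    "more lines of text -> etc\n"
    "etc. etc.\n"
    "----\n"
    "maybe only one line\n"
)

pattern = re.compile(r"^-{4}\s((?:.*\s)*?)(?=-{4}|\s*$)", re.MULTILINE)

# Each match stops just before the next delimiter, so finditer can begin
# the next match on that same delimiter line.
blocks = [m.group(1).splitlines() for m in pattern.finditer(text)]

for block in blocks:
    print(block)
```

In this sketch the text ends with a newline. Each body line has to be followed by a whitespace character for .*\s to consume it, so a final line with no trailing newline would not be picked up by the pattern as written.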

Finally, you may want to make one tweak...

^-{4}\s+((?:.*\s+)*?)(?=-{4}|\s*$)
       ^        ^

...in case there are blank lines between your text.
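
A small sketch of what the tweak changes, using Python's re module as an assumed stand-in for Boost.Regex (results in Boost may differ with its newline-related flags). The sample string, with a blank line inside the first block, is made up for illustration.

```python
import re

# Made-up sample: the first block contains a blank line between its two lines.
text = "----\nfirst\n\nsecond\n----\nthird\n"

# Single \s: after "first\n" the lookahead's \s*$ is satisfied by the blank
# line (in multiline mode), so the block is cut short there.
single = re.compile(r"^-{4}\s((?:.*\s)*?)(?=-{4}|\s*$)", re.MULTILINE)

# \s+: the run of newlines is consumed together with the preceding line, so
# the match carries on until the next delimiter.
tweaked = re.compile(r"^-{4}\s+((?:.*\s+)*?)(?=-{4}|\s*$)", re.MULTILINE)

cut_short = single.search(text).group(1)
full_block = tweaked.search(text).group(1)

print(repr(cut_short))   # 'first\n'
print(repr(full_block))  # 'first\n\nsecond\n'
```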

Upvotes: 0
