Reputation: 63
I may have something like this:
FIRST|[some text here] (newline)
[insert text here] (newline)
SECOND|A (newline)
FIRST|[some text here] (newline)
[insert text here] (newline)
SECOND|B (newline)
FIRST|[some text here] (newline)
[insert text here] (newline)
SECOND|A (newline)
FIRST|[some text here] (newline)
[insert text here] (newline)
SECOND|B (newline)
FIRST|[some text here] (newline)
[insert text here] (newline)
SECOND|A (newline)
I only want to capture everything from FIRST to SECOND|B and exclude anything from FIRST to SECOND|A.
The order in this post is just an example and may be different in the files I am working with. The text in brackets could be words, digits, special characters, etc. The (newline) marker just indicates that the text that follows is on a different line.
I have tried FIRST[\s\S]+SECOND\|B (https://regex101.com/r/CwzCyz/2), but that matches from the first FIRST to the last SECOND|B.
This works in regex101.com but not in my PowerShell ISE application, which I am guessing is because I have the flavor set to PCRE(PHP).
Upvotes: 4
Views: 660
Reputation:
FIRST\|(?:(?!SECOND\|[^B])[\S\s])*?SECOND\|B
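This will not match a FIRST| that is associated with SECOND|A (or any non-B). The negative lookahead (?!SECOND\|[^B]) keeps the match from running past a SECOND| that is not followed by B, so an attempt that starts at such a FIRST| fails.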
https://regex101.com/r/e0CG9B/1
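To check this outside regex101, here is a minimal Python sketch that runs the pattern against the sample from the question. The bracketed placeholders are replaced with made-up text (`one`, `text 1`, and so on), which is an assumption for illustration only. Python's `re` module is used here rather than PowerShell, so treat it as a check of the pattern logic, not of any particular PowerShell setup.

```python
import re

# Hypothetical stand-in for the question's sample; the placeholder text is invented.
sample = (
    "FIRST|one\ntext 1\nSECOND|A\n"
    "FIRST|two\ntext 2\nSECOND|B\n"
    "FIRST|three\ntext 3\nSECOND|A\n"
    "FIRST|four\ntext 4\nSECOND|B\n"
    "FIRST|five\ntext 5\nSECOND|A\n"
)

# Tempered lazy scan: consume one character at a time, but never step onto
# a "SECOND|" that is followed by anything other than "B".
pattern = re.compile(r"FIRST\|(?:(?!SECOND\|[^B])[\S\s])*?SECOND\|B")

# The pattern has no capturing groups, so findall returns the full matches.
blocks = pattern.findall(sample)

for block in blocks:
    print(repr(block))
```

Only the blocks that end in SECOND|B come back; the ones that end in SECOND|A are skipped because the scan cannot get past that line.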
Expanded
FIRST \|
(?:
(?! SECOND \| [^B] )
[\S\s]
)*?
SECOND \| B
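The expanded form only works when the engine is told to ignore whitespace in the pattern. As a small sketch, assuming Python's `re.VERBOSE` flag as the way to switch that on (other engines have their own option for this), the expanded and the compact patterns can be compared on an invented two-block sample:

```python
import re

# Hypothetical sample data standing in for the question's placeholders.
sample = "FIRST|one\ntext 1\nSECOND|A\nFIRST|two\ntext 2\nSECOND|B\n"

compact = re.compile(r"FIRST\|(?:(?!SECOND\|[^B])[\S\s])*?SECOND\|B")

# Same pattern in free-spacing mode: whitespace outside character classes
# is ignored and "#" starts a comment.
expanded = re.compile(
    r"""
    FIRST \|
    (?:
        (?! SECOND \| [^B] )   # fail here if a SECOND| other than SECOND|B starts
        [\S\s]                 # otherwise take one character, newlines included
    )*?
    SECOND \| B
    """,
    re.VERBOSE,
)

# Both forms should find exactly the same blocks.
same = expanded.findall(sample) == compact.findall(sample)
print(same)
```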
If you need the innermost FIRST / SECOND pair, that has to be done a different way:
FIRST\|(?:(?!(?:FIRST|SECOND)\|)[\S\s])*SECOND\|B
https://regex101.com/r/qoT8U1/1
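The difference between the two patterns only shows up when a FIRST| has no SECOND| of its own before the next FIRST| begins. The following Python sketch uses an invented sample with such a dangling FIRST| line (the names `outer`, `noise`, and `inner` are made up for illustration) and compares both patterns:

```python
import re

# Hypothetical input: the first FIRST| is never closed before the next one starts.
sample = "FIRST|outer\nnoise\nFIRST|inner\ntext\nSECOND|B\n"

# First pattern: only refuses to cross a non-B SECOND|, so it may span two FIRST| lines.
loose = re.compile(r"FIRST\|(?:(?!SECOND\|[^B])[\S\s])*?SECOND\|B")

# Second pattern: refuses to cross any FIRST| or SECOND|, so only the innermost pair matches.
inner = re.compile(r"FIRST\|(?:(?!(?:FIRST|SECOND)\|)[\S\s])*SECOND\|B")

loose_blocks = loose.findall(sample)
inner_blocks = inner.findall(sample)

print(loose_blocks)
print(inner_blocks)
```

With this input the first pattern starts at the outer FIRST|, while the second one starts at the FIRST| closest to SECOND|B.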
Upvotes: 1
Reputation: 163257
If FIRST is at the start of the line and SECOND|A or SECOND|B is at the start of the line, you could match all following lines that do not start with SECOND\|[AB]
^FIRST.*(?:\r?\n(?!SECOND\|[AB]\b).*)*\r?\nSECOND\|B\b.*
In parts
^FIRST.*
  Start of the line, match FIRST and the rest of that line
(?:
  Non capturing group
\r?\n(?!SECOND\|[AB]\b)
  Match a newline, assert that the next line does not start with the SECOND part
.*
  Match 0+ times any char except a newline
)*
  Close the non capturing group and repeat it 0+ times
\r?\n
  Match a newline
SECOND\|B\b.*
  Match the line that starts with SECOND|B
Upvotes: 1
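A short Python sketch of the line-anchored approach above, under two assumptions: multiline mode is switched on so that ^ matches at the start of every line (`re.MULTILINE` here), and the non-capturing group is repeated with `*` so that any number of lines can sit between FIRST and SECOND. The sample text is invented, and one block has two middle lines to exercise the repetition:

```python
import re

# Hypothetical sample; the block that ends in SECOND|B has two lines in between.
sample = (
    "FIRST|one\na\nSECOND|A\n"
    "FIRST|two\nb\nc\nSECOND|B\n"
    "FIRST|three\nd\nSECOND|A\n"
)

# Line by line: start at a FIRST line, take following lines as long as they do
# not start with SECOND|A or SECOND|B, then require a SECOND|B line.
pattern = re.compile(
    r"^FIRST.*(?:\r?\n(?!SECOND\|[AB]\b).*)*\r?\nSECOND\|B\b.*",
    re.MULTILINE,
)

blocks = pattern.findall(sample)
print(blocks)
```

Because the pattern works on whole lines and never uses [\S\s], it does less character-by-character lookahead than the tempered versions, but it depends on FIRST and SECOND really being at the start of a line.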