Reputation: 1739
I have a regex of ((.*)\n)*?stopcondition
The aim of this regex is to match any number of lines up to the stop condition and then replace the stop condition, leaving the preceding lines unchanged.
For example
a
b
stop condition
becomes
a
b
changed condition
Another example:
a
b
c
d
stop condition
becomes
a
b
c
d
changed condition
The issue I'm having is getting a back reference to all of the lines captured before the stop condition, since the capture group is nested inside a quantifier.
I've currently resorted to writing two regexes, one for the case of 2 lines before the stop condition and one for 4 lines before.
Is there some syntactic sugar I'm missing to get a reference to the entire match?
If I use a standard $ back reference in this situation, it only contains the last line found before the stop condition.
Upvotes: 2
Views: 201
Reputation:
Consider what ((.*)\n)*?stopcondition actually says.
It will match anything before stopcondition, no matter what it is!
So the ((.*)\n)*? part is effectively useless, since engines always match the first available (leftmost in the source) occurrence of the literal specified in the regex.
Even if it did contain something that needs to appear before stopcondition, that text is just being put back without any modification.
In that case, since you're using Perl, use the \K construct.
(Note: some other engines that use PCRE or its style also have this construct, along with its siblings (*SKIP)(*FAIL).)
Definition:
\K Keep the stuff left of the \K, don't include it in $&
The stuff is consumed, but is not part of the match.
This ensures you're matching the right stopcondition without including the text matched before it.
Find: ((.*)\n)*?\Kstopcondition
Replace: changedcondition
Now analyze ((.*)\n)*?
( # (1 start)
( .* ) # (2)
\n
)*? # (1 end)
Groups 1 and 2 are overwritten on each pass of the quantified ( )*? so you only ever see what matched on the very last pass.
In this, however,
( # (1 start)
( # (2 start)
( .* ) # (3)
\n
)*? # (2 end)
) # (1 end)
Group 1 is not quantified, so it contains the entire accumulated text matched by groups 2 and 3, which are themselves overwritten on each pass.
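To see the difference between the two group layouts above, here is a minimal sketch. It is written in Python's re module rather than Perl (an assumption made only for illustration; the capture semantics shown are the same), and the sample text and variable names are made up for the example.

```python
import re

# Sample input modeled on the question's second example (illustrative only).
text = "a\nb\nc\nd\nstop condition\n"

# Quantified capture group: each pass overwrites the previous capture.
quantified = re.match(r"((.*)\n)*?stop condition", text)
last_pass = quantified.group(1)

# Extra unquantified group wrapped around the repetition: it keeps everything.
wrapped = re.match(r"(((.*)\n)*?)stop condition", text)
all_lines = wrapped.group(1)

print(repr(last_pass))   # only the final line before the stop condition
print(repr(all_lines))   # every line before the stop condition

# Using the outer group as the back reference in a replacement.
# \g<1> is Python's spelling of Perl's $1.
result = re.sub(r"(((.*)\n)*?)stop condition", r"\g<1>changed condition", text, count=1)
print(result)
```

With the wrapper group in place, a single regex handles 2 lines, 4 lines, or any other count before the stop condition.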
P.S. Get some software that knows how to format, analyze, test and benchmark regex.
regexformat.com
Upvotes: 1
Reputation: 667
How about this:
^((?:.|\n)*\n)stop condition
(replace with: $1changed condition)
This looks for the beginning of a line, followed by any number of characters or newlines, and then a newline and the stop condition. The inner group is a non-capturing group, (?:stuff), because we only care about capturing the whole chunk of stuff that came before.
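As a rough check of that find/replace pair, here is a small sketch using Python's re module instead of Perl (an assumption for illustration). The helper name replace_stop is hypothetical, and the inputs are the two examples from the question.

```python
import re

# Same pattern as above; one capturing group holds everything before the stop line.
PATTERN = re.compile(r"^((?:.|\n)*\n)stop condition")

def replace_stop(text):
    """Hypothetical helper: swap the stop condition, keep the captured lines."""
    # \g<1> is Python's spelling of the $1 back reference.
    return PATTERN.sub(r"\g<1>changed condition", text, count=1)

two_lines = replace_stop("a\nb\nstop condition")
four_lines = replace_stop("a\nb\nc\nd\nstop condition")

print(two_lines)
print(four_lines)
```

Because the repetition sits inside a single capturing group, the same pattern covers both the 2-line and the 4-line case.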
If you don't care about starting at the beginning of a line and the stop condition being on its own line you can use the slightly simpler
((?:.|\n)*)stop condition
Although if stop condition is a unique string that appears nowhere else in the file, you could just do a straight search and replace of stop condition with changed condition.
Upvotes: 2