Reputation: 51
I'm not really good with regex (I've been on this one for hours) and I'm struggling to remove all the empty lines between two delimiters ("{|" and "|}").
My regex looks like this (sorry for your eyes): (\{\|)((?:(?!\|\}).)+)(?:\n\n)((?:(?!\|\}).)+)(\|\})
(\{\|) : the opening sequence "{|"
((?:(?!\|\}).)+) : any characters, as long as none of them starts "|}" (negative lookahead)
(?:\n\n) : the empty line I want to delete
((?:(?!\|\}).)+) : any characters, as long as none of them starts "|}" (negative lookahead)
(\|\}) : the closing sequence "|}"
It works, but it deletes only the last empty line. Can you help me make it work for all the empty lines?
I tried to add a negative lookahead on \n\n with a repeating group around everything, but it did not work.
Upvotes: 1
Views: 457
Reputation: 89557
There are several ways.
A \G-based pattern (only one pattern is needed):
$txt = preg_replace('~ (?: \G (?!\A) | \Q{|\E ) [^|\n]*+ (?s: (?! \Q|}\E | \n\n) . [^|\n]*)*+ \n \K \n+ ~x', '', $txt);
\G matches either at the start of the string or at the position right after the last successful match. This ensures that successive matches are contiguous.
What I call a \G-based pattern can be schematized like this:
(?: \G position after a successful match | first match beginning ) reach the target \K target
The "reach the target" part is designed to never match the closing sequence |}. So once the last target is found, the \G part will fail until the first-match part succeeds again.
~
### The beginning
(?:
\G (?!\A) # contiguous to a successful match
|
\Q{|\E # opening sequence
#; note that you can add `[^{]* (*SKIP)` before it to quickly skip
#; all failing positions
#; note that if you want to check that the opening sequence is followed by
#; a closing sequence (without another opening sequence in between), you can do it
#; here using a lookahead
)
### reaching the target
#; note that this whole part can also be written as `(?s:(?!\|}|\n\n).)*`
#; or `(?s:[^|\n]|(?!\|}|\n\n).)*`, but I chose the unrolled pattern, which is
#; more efficient.
[^|\n]*+ # everything that isn't a pipe or a newline
# possibly followed by a character that isn't the start of |} or \n\n
(?s:
(?! \Q|}\E | \n\n ) # negative lookahead
. # the character
[^|\n]*
)*+
#; adding a `(*SKIP)` here can also be useful if there are no more empty lines
#; before the closing sequence
### The target
\n \K \n+ # the \K is a convenient way to define the start of the returned match
# result; this way, only \n+ is replaced (with nothing)
~x
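To see the \G-based pattern at work, here is a small sketch that applies it to a made-up sample. The function name and the sample text are my own assumptions for illustration; only the pattern itself comes from the answer.

```php
<?php
// Sketch only: the function name and the sample text are hypothetical.
function remove_blank_lines_with_g(string $txt): string
{
    // Same pattern as the one-liner above; each match is the run of extra
    // newlines that follows the first "\n" of a blank-line sequence.
    $pattern = '~ (?: \G (?!\A) | \Q{|\E ) [^|\n]*+ (?s: (?! \Q|}\E | \n\n) . [^|\n]*)*+ \n \K \n+ ~x';
    return preg_replace($pattern, '', $txt);
}

// Blank lines inside {| ... |} should go; those outside should stay.
$sample = "intro\n\n{|\nrow 1\n\n\nrow 2\n\nrow 3\n|}\n\noutro";
echo remove_blank_lines_with_g($sample);
// Result: "intro\n\n{|\nrow 1\nrow 2\nrow 3\n|}\n\noutro"
```

The blank lines before {| and after |} survive because the \G branch can only continue from a previous match, and the first branch needs a literal {| to start a new chain.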
Or with preg_replace_callback (simpler):
$txt = preg_replace_callback('~\Q{|\E .*? \Q|}\E~sx', function ($m) {
    return preg_replace('~\n+~', "\n", $m[0]);
}, $txt);
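A quick way to check the callback version is to wrap it in a function and run it on a sample. The wrapper name and the sample text below are assumptions made for the illustration.

```php
<?php
// Sketch only: the wrapper name and the sample text are hypothetical.
function remove_blank_lines_with_callback(string $txt): string
{
    // The outer pattern isolates each {| ... |} block (lazy match, dot matches
    // newlines); the inner replacement collapses every run of newlines inside
    // that block to a single newline.
    return preg_replace_callback('~\Q{|\E .*? \Q|}\E~sx', function ($m) {
        return preg_replace('~\n+~', "\n", $m[0]);
    }, $txt);
}

$sample = "intro\n\n{|\nrow 1\n\n\nrow 2\n\nrow 3\n|}\n\noutro";
echo remove_blank_lines_with_callback($sample);
// Result: "intro\n\n{|\nrow 1\nrow 2\nrow 3\n|}\n\noutro"
```

This needs two passes, but each pattern is trivial to read, and text outside the delimiters is never touched by the inner replacement.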
Upvotes: 3
Reputation: 106553
You can use a positive lookahead to ensure that a matching blank line is followed by |}, combined with a negative lookahead to ensure that none of the characters between the blank line and the |} starts a {|:
\n{2,}(?=(?:(?!\{\|).)*?\|\})
Demo: https://regex101.com/r/oWfkg1/8
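In PHP this could look like the sketch below. Two things here are my assumptions rather than part of the answer: the s modifier (so that the dot in the lookahead can cross line breaks) and the "\n" replacement (the pattern consumes the whole run of newlines, so one has to be put back). The function name and the sample are also made up.

```php
<?php
// Sketch only: function name, sample text, the "s" modifier and the "\n"
// replacement are assumptions, not taken from the answer.
function remove_blank_lines_with_lookahead(string $txt): string
{
    // \n{2,}            a run of two or more newlines (at least one blank line)
    // (?= ... \|\})     followed, further on, by the closing sequence
    // (?:(?!\{\|).)*?   without crossing an opening sequence on the way
    return preg_replace('~\n{2,}(?=(?:(?!\{\|).)*?\|\})~s', "\n", $txt);
}

$sample = "intro\n\n{|\nrow 1\n\n\nrow 2\n\nrow 3\n|}\n\noutro";
echo remove_blank_lines_with_lookahead($sample);
// Result: "intro\n\n{|\nrow 1\nrow 2\nrow 3\n|}\n\noutro"
```

Note that this only checks what comes after the blank line, so it relies on every |} having a matching {| before it.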
Upvotes: 2
Reputation: 20737
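If you use:
(?<={\|)(\n{2,}|(\r\n){2,}|\s+)(?=\|})
then it will match the newlines and whitespace found between {| and |}. Note that the lookbehind and the lookahead tie the match directly to both delimiters, so it only matches when nothing but whitespace lies between them.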
Upvotes: 1