Reputation: 21
How do you invert a regular expression in PHP?
This is my code:
preg_match("!<div class=\"foo\">.*?</div>!is", $source, $matches);
This searches the $source string for the container and everything inside it, and stores the result in the $matches variable.
But what I want to do is invert the expression, i.e. I want to get everything that is NOT inside the container. I know there is something called negative lookahead, but I am really bad with regular expressions and didn't manage to come up with a working solution.
Simply adding ?! in front:
preg_match("?!<div class=\"foo\">.*?</div>!is", $source, $matches);
does not seem to work.
Thanks!
Upvotes: 1
Views: 241
Reputation: 56809
Since your goal is to remove the matching divs, as mentioned in the comment, using the original regex with preg_split, plus implode, would be the simpler solution:
implode('', preg_split('~<div class="foo">.*?</div>~is', $text))
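A minimal sketch of the split-and-join approach, assuming a made-up sample string (the input and the variable names $parts and $outside are illustrative, not from the question):

```php
<?php
// Hypothetical sample input with two matching divs.
$text = 'before<div class="foo">inner</div>middle<div class="foo">more</div>after';

// Split the string at every matching div; the divs themselves are discarded.
$parts = preg_split('~<div class="foo">.*?</div>~is', $text);

// Glue the remaining pieces back together.
$outside = implode('', $parts);

echo $outside, "\n"; // prints "beforemiddleafter"
```

preg_split keeps empty pieces by default, which is harmless here because they vanish when the pieces are joined.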
I'm not sure whether this is a good idea, but here is my solution:
~(.*?)(?:<div class="foo">.*?</div>|$)~is
The result can be picked out from capturing group 1 of each match.
Note that the last match is always an empty string, and there can also be an empty match between two adjacent matching divs or when the string starts with a matching div. However, you need to concatenate the pieces anyway, so this seems to be a non-issue.
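As a rough sketch of how group 1 could be collected and concatenated, assuming preg_match_all is used to iterate over the matches (the sample input and variable names are illustrative):

```php
<?php
// Hypothetical input: starts with a matching div and has two adjacent divs.
$text = '<div class="foo">a</div>one<div class="foo">b</div><div class="foo">c</div>two';

// $matches[1] holds capturing group 1 of every match, in order.
preg_match_all('~(.*?)(?:<div class="foo">.*?</div>|$)~is', $text, $matches);

// Empty pieces (leading div, adjacent divs, final empty match) disappear on join.
$outside = implode('', $matches[1]);

echo $outside, "\n"; // prints "onetwo"
```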
The idea is to rely on the fact that the lazy quantifier .*? always tries the sequel (whatever comes after it) first before advancing itself. The effect is similar to a look-ahead assertion that makes sure that whatever is matched by .*? is not inside <div class="foo">.*?</div>.
The div tag is matched along in each match in order to advance the cursor past the closing tag. $ is used to match the text after the last matching div.
The s flag makes . match any character, including line separators.
Revision: I had to change .+? to .*?, since .+? does not correctly handle strings with two matching divs next to each other or strings that start with a matching div.
Anyway, it's not a good idea to modify HTML with regular expressions. Use a parser instead.
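One possible shape of the parser route, sketched with PHP's bundled DOM extension. This is an assumption about what "a parser" could look like here, not something taken from the question; the sample markup is made up, and the XPath test only matches elements whose class attribute is exactly "foo":

```php
<?php
// Hypothetical sample markup.
$html = '<p>before</p><div class="foo">inner</div><p>after</p>';

$doc = new DOMDocument();
// loadHTML wraps the fragment in <html><body>...</body></html>.
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
// Copy the node list to an array first, then remove each matching div.
foreach (iterator_to_array($xpath->query('//div[@class="foo"]')) as $div) {
    $div->parentNode->removeChild($div);
}

// Serialized document without the removed divs.
$result = $doc->saveHTML();
echo $result;
```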
Upvotes: 1
Reputation: 67968
<div class=\"foo\">.*?</div>\K|.
You can do this simply by using \K.
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match.
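A small sketch of how this pattern might be applied, assuming preg_match_all is used and the individual matches are joined afterwards. The i and s flags are carried over from the question's pattern (an assumption, since the answer does not list flags), and the sample input is made up:

```php
<?php
// Hypothetical sample input.
$text = 'one<div class="foo">x</div>two';

// First alternative: consume a whole div, then \K discards it from the
// reported match (leaving an empty match). Second alternative: any single
// character outside a div.
preg_match_all('~<div class="foo">.*?</div>\K|.~is', $text, $matches);

// Join the one-character (and empty) matches back into a string.
$outside = implode('', $matches[0]);

echo $outside, "\n"; // prints "onetwo"
```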
Upvotes: 0