Reputation: 1435
I find that preg_match_all and preg_replace do not find the same matches based on the same pattern.
My pattern is:
/<(title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span)(.*?)><\/(\1)>/
When I run this against a snippet containing the likes of
<span class="blue"></span>
with preg_match_all I get 17 matches.
When I use the same pattern in preg_replace I get 0 matches. Replacing the \1 with the selection list does find the matches, but of course that won't work as a solution, because it no longer ensures that the closing tag is the same type as the opening tag.
The overall goal is to find instances of tags with no content that should not be present without content...a holy crusade, I assure you.
To test whether the regex works, I have also tried it in the PHP CLI. Here is the output:
Interactive shell
php > $str = 'abc<span class="blue"></span>def';
php > $pattern = "/<(title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span)(.*?)><\/(\1)>/";
php > $final = preg_replace($pattern, '', $str);
php > print $final;
abc<span class="blue"></span>def
Upvotes: 0
Views: 193
Reputation: 8374
$str = 'abc<span class="blue"></span>def';
$pattern = "/<(title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span)(.*?)><\/(\\1)>/";
// added a second backslash before the 1: \\1 instead of \1
$final = preg_replace($pattern, '', $str);
print $final;
// prints 'abcdef'
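Explanation:
"\1" // <-- octal escape: a single character with code 1
is very different from
'\1' // <-- a backslash followed by 1
because in a double-quoted string the first is an escape sequence that PHP resolves before the regex engine ever sees the pattern, so no backreference is left in it. This is also the reason I almost exclusively use single-quoted strings. See http://php.net/string#language.types.string.syntax.double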
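To see the difference directly, here is a small sketch that inspects both string literals and then runs the original pattern written in single quotes. The variable names $double and $single are just illustrative; the pattern and test string are the ones from the question.

```php
<?php
// Double-quoted: PHP turns \1 into the single byte 0x01 before PCRE sees it.
$double = "\1";
// Single-quoted: the backslash and the digit 1 stay as two separate characters.
$single = '\1';

var_dump(bin2hex($double)); // string(2) "01"
var_dump(bin2hex($single)); // string(4) "5c31"

// With single quotes the pattern reaches PCRE unchanged, so \1 is a backreference.
$str = 'abc<span class="blue"></span>def';
$pattern = '/<(title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span)(.*?)><\/(\1)>/';
$final = preg_replace($pattern, '', $str);

print $final . "\n"; // abcdef
```

Single quotes only treat \' and \\ specially, so the rest of the pattern, including \/ and \1, is passed through as written.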
Upvotes: 1