Reputation: 24502
I'm trying to write a regex that will remove HTML tags around a placeholder text, so that this:
<p>
Blah</p>
<p>
{{{body}}}</p>
<p>
Blah</p>
Becomes this:
<p>
Blah</p>
{{{body}}}
<p>
Blah</p>
My current regex is /<.+>.*\{\{\{body\}\}\}<\/.+>/msU. However, it also removes the tag preceding the placeholder, together with its contents, resulting in:
{{{body}}}
<p>
Blah</p>
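A minimal way to reproduce this in PHP, assuming the pattern is applied with preg_replace (the variable names here are illustrative and not taken from the real code):

```php
<?php
// Sample input, with the newline after each opening tag as in the example above.
$html = "<p>\nBlah</p>\n<p>\n{{{body}}}</p>\n<p>\nBlah</p>";

// The pattern from the question: s = dotall, m = multiline, U = ungreedy.
$pattern = '/<.+>.*\{\{\{body\}\}\}<\/.+>/msU';

// The match starts at the FIRST "<" in the input, because ".*" with the
// s modifier is allowed to run across the first paragraph to reach the placeholder.
$result = preg_replace($pattern, '{{{body}}}', $html);

echo $result; // the first <p>Blah</p> paragraph is gone
```

The ungreedy flag only makes each quantifier stop as early as it can; it does not move the start of the match closer to the placeholder, which is why the first paragraph is swallowed.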
I can't assume that users will always place the placeholder inside <p>, so I would like it to remove any pair of tags immediately around the placeholder. I would appreciate some help with correcting my regex.
[EDIT]
I think it's important to note that the input may or may not be processed by CKEditor. It adds newlines and tabs to the opening tags, thus the regex needs to go with the /sm (dotall + multiline) modifiers.
Upvotes: 1
Views: 4626
Reputation: 9794
Doesn't PHP's strip_tags work for your case?
http://php.net/manual/en/function.strip-tags.php
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
Upvotes: 1
Reputation: 219910
Try this:
<[^>]+>\s*\{{3}body\}{3}\s*<\/[^>]+>
See it here in action: http://regexr.com?30s4o
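As a rough sketch of how the pattern might be used from PHP, assuming preg_replace and a hypothetical helper named unwrapPlaceholder (neither comes from the question):

```php
<?php
// Hypothetical helper: removes the one pair of tags directly wrapping {{{body}}}.
function unwrapPlaceholder(string $html): string
{
    // No s or m modifier is needed here: [^>] and \s already match
    // newlines and tabs, and the pattern contains no "." or anchors.
    $pattern = '/<[^>]+>\s*\{{3}body\}{3}\s*<\/[^>]+>/';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '{{{body}}}', $html) ?? $html;
}

// Input resembling CKEditor output: newline and tab after the opening tag.
$input = "<p>\nBlah</p>\n<p>\n\t{{{body}}}</p>\n<p>\nBlah</p>";
$output = unwrapPlaceholder($input);

echo $output;
```

Note that the pattern does not check that the closing tag matches the opening one, so something like <p>{{{body}}}</div> would be unwrapped as well.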
Here's the breakdown:
<[^>]+> matches an opening HTML tag, and only that tag.
\s* matches any whitespace (roughly equivalent to [ \t\r\n]*).
\{{3} matches a { exactly 3 times.
body matches the string literally.
\}{3} matches a } exactly 3 times.
\s* again matches any whitespace.
<\/[^>]+> matches a closing HTML tag.
Upvotes: 5