Reputation: 1368
I have this sample code:
<?php
$string='Left text from tag <div title="hello world" class="CSS">What is <b>going on</b> here?<br> Calm up <em>right now</em>.</div> Right text. Possible another <div title="" class="DD">tag..</div> but not always.';
echo strip_tags($string);
?>
The result of this code is:
Left text from tag What is going on here? Calm up right now. Right text. Possible another tag.. but not always.
However, my goal is to remove everything between the tags that strip_tags strips, including the tags themselves. I.e., the result should be:
Left text from tag Right text. Possible another but not always.
I know it can be done with preg_replace, but that is too slow, so maybe there is a faster solution (not necessarily related to the strip_tags function).
Upvotes: 2
Views: 2580
Reputation: 10975
How about a DOMDocument approach?
<?php
$string='Left text from tag <div title="hello world" class="CSS">What is <b>going on</b> here?<br> Calm up <em>right now</em>.</div> Right text. Possible another <div title="" class="DD">tag..</div> but not always.';
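$dom = new DOMDocument();
// Wrap the fragment in <body> so its top-level nodes become direct children of body.
$dom->loadHTML('<body>' . $string . '</body>');
$stripped = '';
$els = $dom->getElementsByTagName('body')->item(0)->childNodes;
foreach ($els as $child) {
    // Keep only top-level text nodes; elements and everything inside them are skipped.
    if ($child->nodeType === XML_TEXT_NODE)
        $stripped .= ' ' . trim($child->nodeValue);
}
// Drop the leading space added before the first chunk.
$stripped = substr($stripped, 1);
echo $stripped;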
Left text from tag Right text. Possible another but not always.
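For reuse, the same idea can be wrapped in a function. This is only a sketch: the name top_level_text is made up for illustration, character encoding is not handled (the sample is plain ASCII), and libxml_use_internal_errors is used on the assumption that parser warnings about sloppy markup should be silenced rather than reported.

```php
<?php
// Hypothetical helper (name is illustrative): returns only the text that
// sits outside every element of an HTML fragment.
function top_level_text(string $html): string
{
    // Assumption: warnings about malformed markup should be collected, not printed.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrapping in <body> makes the fragment's top-level nodes direct children of body.
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $parts = [];
    $body = $dom->getElementsByTagName('body')->item(0);
    foreach ($body->childNodes as $child) {
        // Text nodes are kept; element nodes and their content are dropped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $parts[] = $text;
            }
        }
    }
    return implode(' ', $parts);
}
```

Collecting the chunks in an array and joining them with implode avoids the leading-space cleanup and skips whitespace-only text nodes.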
Upvotes: 1
Reputation: 8618
A regex is the best and most compact solution, in my opinion. Try this:
echo preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $string);
If you don't want to call preg_replace directly, use the custom function strip_tags_content() from the user notes in the PHP manual (it still relies on preg_replace internally):
function strip_tags_content($text, $tags = '', $invert = FALSE) {
    preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
    $tags = array_unique($tags[1]);
    if(is_array($tags) AND count($tags) > 0) {
        if($invert == FALSE) {
            return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
        } else {
            return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
        }
    } elseif($invert == FALSE) {
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    return $text;
}
echo strip_tags_content($string);
Note: I don't think your desired output can be achieved with built-in PHP functions alone. You need to use a regex one way or another.
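One detail about the one-line regex in this answer: it leaves a double space wherever an element was removed, and it cannot pair up a lone tag such as <br>. A minimal sketch of a wrapper that tidies both follows. The function name remove_tagged_content is hypothetical, and the extra strip_tags and whitespace-collapsing steps are additions, not part of the manual's function.

```php
<?php
// Hypothetical wrapper (name is illustrative) around the one-line regex.
function remove_tagged_content(string $html): string
{
    // Remove each <tag ...>...</tag> pair together with its content.
    // Caveat: an element nested inside another element of the same name
    // ends the match early, so such markup is not handled correctly.
    $text = preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $html);
    if ($text === null) {
        // preg_replace returns null on failure; fall back to the input.
        $text = $html;
    }
    // Strip tags the regex could not pair up, such as a lone <br>.
    $text = strip_tags($text);
    // Collapse the whitespace runs left behind where elements were removed.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

$string = 'Left text from tag <div title="hello world" class="CSS">What is <b>going on</b> here?<br> Calm up <em>right now</em>.</div> Right text. Possible another <div title="" class="DD">tag..</div> but not always.';
echo remove_tagged_content($string);
// Left text from tag Right text. Possible another but not always.
```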
Upvotes: 3