Reputation: 11
I have a string, like this:
<p>Ple ple ple BLA xo xo xo <span class="tooltip-content"><span class="tooltip-text">uuu BLA pla</span></span> he he he ha ha ha.</p>
How can I replace the word BLA, but not the occurrence(s) within span.tooltip-content? Only the one(s) "outside" the span tag should be replaced.
Upvotes: -1
Views: 76
Reputation: 47764
Ideally, you'd parse the valid HTML with a legitimate parser, then use regex to replace the search term only where it sits between word boundaries (so that BLASPHEMY isn't accidentally corrupted).
I'll even throw in a regex conditional expression to trim either the leading or the trailing whitespace along with the found term (but it won't remove whitespace on both ends; that would be bad).
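To see the conditional on its own, here is a minimal sketch on plain strings. The helper name `removeWord` is hypothetical, and `preg_quote()` is an added precaution in case the search term ever contains regex metacharacters.

```php
<?php
// Hypothetical helper: removes whole-word occurrences of $find from plain text.
function removeWord(string $text, string $find): string
{
    // Escape regex metacharacters in the search term (added precaution).
    $word = preg_quote($find, '/');

    // (\s)?       optionally captures ONE leading whitespace character.
    // \b...\b     requires word boundaries around the term.
    // (?(1)|\s*)  conditional: if group 1 matched, consume nothing more;
    //             otherwise consume any trailing whitespace instead.
    return preg_replace("/(\s)?\b{$word}\b(?(1)|\s*)/", '', $text);
}

echo removeWord('he BLA he', 'BLA'), "\n";   // he he        (leading space removed, trailing kept)
echo removeWord('BLA he', 'BLA'), "\n";      // he           (no leading space, so trailing one removed)
echo removeWord('he BLAZE he', 'BLA'), "\n"; // he BLAZE he  (word boundary fails, nothing removed)
```

Because only one side's whitespace is consumed, the neighbouring words stay separated by a single space instead of being glued together.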
Code:
$html = <<<HTML
<p>Ple ple ple BLA xo xo xo <span class="tooltip-content"><span class="tooltip-text">uuu BLA pla</span></span> he BLAZE he BLA he ha ha ha.</p>
HTML;
$find = 'BLA';
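// Parse the fragment without adding <html>/<body> wrappers or a doctype.
$doc = new DOMDocument();
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($doc);
// Visit only text nodes that have no ancestor span.tooltip-content.
foreach ($xpath->query('//text()[not(ancestor::span[contains(@class, "tooltip-content")])]') as $node) {
    // Remove the whole word plus whitespace on one side only.
    $node->nodeValue = preg_replace("/(\s)?\b$find\b(?(1)|\s*)/", '', $node->nodeValue);
}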
echo $doc->saveHTML();
Output:
<p>Ple ple ple xo xo xo <span class="tooltip-content"><span class="tooltip-text">uuu BLA pla</span></span> he BLAZE he he ha ha ha.</p>
Building a pure regex solution would require too much convolution to accommodate nested tags, varying quotation, optional attributes, text not inside of opening tags, and other such nasties. It is much cleaner to use XPath to exclude every text node that has an ancestor span with the class tooltip-content, however deeply it is nested.
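One caveat: `contains(@class, "tooltip-content")` is a substring test, so it would also exclude text inside a span with a class such as `tooltip-content-wide`. If that matters, a stricter variant can match the class as a whole token. The following is only a sketch; the function name `removeOutsideClass` and its parameters are hypothetical, and it assumes `$class` contains no quote characters.

```php
<?php
// Hypothetical helper: strips $find from every text node that is NOT inside
// a span carrying $class as a complete class token.
function removeOutsideClass(string $html, string $find, string $class): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($doc);

    // Pad the normalized class attribute with spaces so that only a whole
    // token matches (assumes $class contains no quote characters).
    $query = sprintf(
        '//text()[not(ancestor::span[contains(concat(" ", normalize-space(@class), " "), " %s ")])]',
        $class
    );

    // Same pattern as above, with the search term escaped as a precaution.
    $pattern = '/(\s)?\b' . preg_quote($find, '/') . '\b(?(1)|\s*)/';

    foreach ($xpath->query($query) as $node) {
        $node->nodeValue = preg_replace($pattern, '', $node->nodeValue);
    }

    return $doc->saveHTML();
}

$html = '<p>BLA a <span class="tooltip-content-wide">BLA b</span> '
      . '<span class="x tooltip-content"><b>BLA c</b></span> d BLA.</p>';

// Only the occurrence inside the real tooltip-content span should survive.
echo removeOutsideClass($html, 'BLA', 'tooltip-content');
```

The `concat(" ", normalize-space(@class), " ")` idiom is the usual XPath 1.0 way to test for a single class among several.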
Upvotes: 0
Reputation: 1
Not elegant, but the fastest solution that comes to my mind is:
then
Upvotes: 0