Reputation: 2072
I want to create an output text filter that replaces all the <img> elements in the DOM with the text "no images allowed".
For example, if the user creates this HTML markup:
<p><img src="/image.jpg" /></p>
the following HTML is rendered:
<p>no images allowed</p>
Please note that I cannot use preg_replace. The question is simplified, and I need to parse the DOM to find which images to disallow.
Thanks to this answer, I found that getElementsByTagName() returns a "live" list, so two steps are needed. This is what I have so far:
foreach ($elements as $element) {
    $domArray[] = $element;
    $src = $element->getAttribute('src');
    $frag = $dom->createElement('p');
    $frag->nodeValue = 'no images allowed';
    $element->parentNode->appendChild($frag);
}
// loop through the array and delete each node
$nodes = iterator_to_array($dom->getElementsByTagName('img'));
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}
$newtext = $dom->saveHTML();
It almost does what I want, but I get this:
<p><p>no images allowed</p></p>
Upvotes: 2
Views: 118
Reputation: 19224
I would fetch the elements with XPath, then replace each one with a newly created text node.
$xp = new DOMXPath($dom);
$elements = $xp->query('//img');
foreach ($elements as $element) {
    // A text node avoids the nested <p> produced by createElement('p')
    $frag = $dom->createTextNode('no images allowed');
    // Insert the text where the image was, then drop the image
    $element->parentNode->insertBefore($frag, $element);
    $element->parentNode->removeChild($element);
}
echo $dom->saveHTML();
Demo here: http://codepad.org/w9uj0ez9
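The same idea can be packaged as a small function. This is only a minimal sketch under a few assumptions: the function name replace_images_with_text is made up for illustration, the input is a fragment with a single root element (the LIBXML_HTML_NOIMPLIED flag tends to misbehave otherwise), and DOMNode::replaceChild() is used in place of the insertBefore()/removeChild() pair shown above.

```php
<?php
// Hypothetical helper, not part of the original answer.
function replace_images_with_text(string $html, string $text = 'no images allowed'): string
{
    $dom = new DOMDocument();
    // Assumption: $html has one root element, so no <html>/<body> wrapper
    // and no doctype are wanted in the output.
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // getElementsByTagName() returns a live list, so copy it to an array
    // before modifying the tree.
    $images = iterator_to_array($dom->getElementsByTagName('img'));

    foreach ($images as $img) {
        // Swap each <img> for a plain text node in the same position.
        $img->parentNode->replaceChild($dom->createTextNode($text), $img);
    }

    // saveHTML() appends a trailing newline; trim it for easier comparison.
    return trim($dom->saveHTML());
}
```

Calling replace_images_with_text('<p><img src="/image.jpg" /></p>') should give back the paragraph with the text in place of the image. Any extra check on which images to disallow (for example, inspecting the src attribute) would go inside the foreach loop.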
Upvotes: 2
Reputation: 646
To remove a self-closing HTML img tag, you can use a simple regular expression:
<?php
function no_images_allowed($text) {
    return preg_replace('/<img[^>]*>/', 'no images allowed', $text);
}
print no_images_allowed('<p><img src="/image.jpg" /></p>');
It is simpler and should be much more efficient: you do not need to traverse every DOM element, you just process plain text.
The regex in the example above only works for a standalone img tag with no separate closing tag:
<img src="..."/>
<img src="...">
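Please note that it will not work correctly in cases like the following, where the tag name is uppercase (the pattern is case-sensitive) or a separate </img> closing tag would be left behind: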
<img src="..."></img>
<IMG SRC="..."/>
<img src="...">invalid content</img>
If you want to cover every possible case (even invalid ones), the proposed regex would need to be modified.
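As a rough sketch of one such modification (the function name no_images_allowed_ci is hypothetical), the pattern below adds the i flag for case-insensitive matching and optionally consumes plain text followed by a closing </img>. It still assumes that attribute values contain no ">" character, and it is not a substitute for real HTML parsing.

```php
<?php
// Hypothetical variant of no_images_allowed(); a sketch, not a full HTML parser.
function no_images_allowed_ci($text) {
    // <img\b[^>]*>          the opening tag, any letter case (i flag)
    // (?:[^<]*</img\s*>)?   optional plain text followed by a closing </img>
    return preg_replace('~<img\b[^>]*>(?:[^<]*</img\s*>)?~i', 'no images allowed', $text);
}
```

With this pattern, <img src="..."></img>, <IMG SRC="..."/> and <img src="...">invalid content</img> are each replaced by "no images allowed" as a whole, and the original <p><img src="/image.jpg" /></p> case still yields <p>no images allowed</p>.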
Upvotes: 2