Reputation: 1619
I have this html in my database:
<p>some text 1</p>
<img src=\"http://www.example.com/images/some_image_1.jpg\">
<p>some text 2</p>
<p>some text 3</p>
<img src=\"http://www.example.com/images/some_image_2.jpg\">
<p>some text 4</p>
<p>some text 5</p>
<img src=\"http://www.example.com/images/some_image_3.jpg\">
Conditionally, I need to remove some specific <img> tags. I don't want to remove all <img> tags, only specific ones.
I have tried this, but it does not remove the right <img> tags:
$dom = new \DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);
$nodes = $dom->getElementsByTagName("img");
for($i = 0; $i < $nodes->length; $i++) {
if ($i == 1) {
continue;
}
$image = $nodes->item($i);
$image->parentNode->removeChild($image);
}
return $dom->saveHTML();
Can someone help me with this? In this HTML example, let's say I want to remove the first and third images in the text but keep the second one.
Also, I have noticed that the saveHTML() method adds <html><body> tags to my HTML, and I do not want that. I don't see any option to turn this off. Any help there too?
Thanks in advance, I've been stuck on this for hours.
Upvotes: 1
Views: 4581
Reputation: 863
The answers above didn't work for me. A comment in the documentation mentions:
You can't remove DOMNodes from a DOMNodeList as you're iterating over them in a foreach loop
https://www.php.net/manual/en/domnode.removechild
$domNodeList = $domDocument->getElementsByTagName('p');
$domElemsToRemove = array();
foreach ($domNodeList as $domElement) {
    // ...do stuff with $domElement...
    $domElemsToRemove[] = $domElement;
}
foreach ($domElemsToRemove as $domElement) {
    $domElement->parentNode->removeChild($domElement);
}
That approach (collect the nodes first, then remove them in a second loop) worked for me for removing some tags.
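Applied to the question's markup, the collect-then-remove idea might look like the sketch below. It assumes the stored HTML uses plain double quotes (the backslashes from the question are left out), and it uses iterator_to_array() to take the snapshot instead of a hand-written first loop. The variable names are only illustrative.

```php
<?php
// Sample markup from the question: three images, the second one should stay.
$html = '<p>some text 1</p>'
      . '<img src="http://www.example.com/images/some_image_1.jpg">'
      . '<p>some text 2</p>'
      . '<p>some text 3</p>'
      . '<img src="http://www.example.com/images/some_image_2.jpg">'
      . '<p>some text 4</p>'
      . '<p>some text 5</p>'
      . '<img src="http://www.example.com/images/some_image_3.jpg">';

$dom = new DOMDocument;
$dom->loadHTML($html);

// getElementsByTagName() returns a live list: it shrinks as nodes are removed,
// so indexes taken from it go stale during a removal loop.
$liveList = $dom->getElementsByTagName('img');

// Copy the nodes into a plain array so the indexes stay stable.
$snapshot = iterator_to_array($liveList);

foreach ($snapshot as $index => $image) {
    if ($index === 1) {
        continue; // keep the second image (zero-based index 1)
    }
    $image->parentNode->removeChild($image);
}

// Re-query the document to see what is left.
$remaining = $dom->getElementsByTagName('img');
echo $remaining->length, "\n";
echo $remaining->item(0)->getAttribute('src'), "\n";
```

Because the loop walks the array copy rather than the live list, removing a node does not change which node belongs to which index.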
Upvotes: 0
Reputation: 1305
There are options that stop the html and body tags from being added when you load an HTML file or string:
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
// These two flags ask libxml not to add implied <html>/<body> tags or a default doctype
@$dom->loadHTML(file_get_contents('file.html'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
//@$dom->loadHTMLFile('file.html'); // Adds html and body tags if they are missing
// Copy the live node list into an array so removals don't shift the indexes
$nodes = iterator_to_array($dom->getElementsByTagName("img"));
foreach ($nodes as $i => $node) {
    if ($i == 1) {
        continue; // keep the second image
    }
    $node->parentNode->removeChild($node);
}
return $dom->saveHTML();
//$dom->saveHTMLFile('file.html');
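If the libxml flags are not an option, or if they rearrange a fragment that has several top-level elements (which may happen, since the parser then treats the first element as the root), another way to avoid the wrapper tags is to load the HTML normally and serialise only the children of <body>. The following is a minimal sketch of that idea; bodyInnerHtml() is a hypothetical helper name, not part of the DOM extension.

```php
<?php
// Hypothetical helper: serialise only the children of <body>,
// so the wrapper tags added by loadHTML() are not part of the result.
function bodyInnerHtml(DOMDocument $dom): string
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // nothing was parsed into a body element
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        // saveHTML() accepts a node and returns the markup for that node only.
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$dom = new DOMDocument;
$dom->loadHTML(
    '<p>some text 1</p>'
    . '<img src="http://www.example.com/images/some_image_1.jpg">'
    . '<p>some text 2</p>'
);

$fragment = bodyInnerHtml($dom);
echo $fragment, "\n";
```

The removal loop can run before the call to bodyInnerHtml(), so the returned fragment already reflects the removed images.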
Upvotes: 1
Reputation: 123
You can do this by using an array of the indexes to keep. I modified your code so that it does not remove the second img tag.
$dom = new \DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);
// Zero-based indexes of the images to keep (1 = second image)
$remainImages = array(1);
$nodes = $dom->getElementsByTagName("img");
// Walk the live list backwards so removals don't shift the indexes still to visit
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    if (!in_array($i, $remainImages)) {
        $image = $nodes->item($i);
        $image->parentNode->removeChild($image);
    }
}
return $dom->saveHTML();
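Positions can change when the stored HTML is edited, so it may be more robust to pick the images by their src attribute instead of by index. The sketch below assumes the URLs to remove are known in advance; removeImagesBySrc() is a hypothetical helper, and the sample markup is a shortened version of the question's HTML.

```php
<?php
// Hypothetical helper: remove every <img> whose src is listed in $srcsToRemove.
// Returns the number of images that were removed.
function removeImagesBySrc(DOMDocument $dom, array $srcsToRemove): int
{
    $removed = 0;
    // Snapshot the live list before removing anything from the document.
    foreach (iterator_to_array($dom->getElementsByTagName('img')) as $image) {
        if (in_array($image->getAttribute('src'), $srcsToRemove, true)) {
            $image->parentNode->removeChild($image);
            $removed++;
        }
    }
    return $removed;
}

$dom = new DOMDocument;
$dom->loadHTML(
    '<p>some text 1</p>'
    . '<img src="http://www.example.com/images/some_image_1.jpg">'
    . '<p>some text 2</p>'
    . '<img src="http://www.example.com/images/some_image_2.jpg">'
    . '<p>some text 4</p>'
    . '<img src="http://www.example.com/images/some_image_3.jpg">'
);

// Remove the first and third images, identified by URL rather than position.
$removedCount = removeImagesBySrc($dom, array(
    'http://www.example.com/images/some_image_1.jpg',
    'http://www.example.com/images/some_image_3.jpg',
));

$left = $dom->getElementsByTagName('img');
echo $removedCount, ' removed, ', $left->length, " left\n";
```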
Upvotes: 1