I have an HTML document (as a string) that contains a div with the class "foo":
<html>
<head>
...
</head>
<body>
<div class="whatever">Blabla</div>
<div>
<span>Text</span>
</div>
<table>
<tr><td><div class="foo">GARBAGE</div></td></tr>
</table>
</body>
</html>
I would like to remove only the divs with the class "foo". This is what I have so far:
$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$xpath = new DOMXpath($doc);
$all = $xpath->query("/html");
$result = remove_elements_with_class('foo', $all);
What would the remove_elements_with_class function look like?
Reputation: 59699
After:
$xpath = new DOMXpath($doc);
You need to:
1. Find the nodes you want to remove.
2. Call DOMNode::removeChild() on those nodes.
So, to accomplish the first task, you can issue an XPath query that finds all of the <div> nodes whose class is foo. That query would look like:
//div[contains(concat(' ', @class, ' '), ' foo ')]
Note that this handles the case where an element has more than one class, e.g. foo bar baz and baz foo bar. If this is undesirable, and you only want to match elements whose class attribute is exactly foo, the query becomes:
//div[@class = 'foo']
And, in PHP, this becomes:
$nodes = $xpath->query( "//div[contains(concat(' ', @class, ' '), ' foo ')]");
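To make the difference between the two queries concrete, here is a minimal sketch. The sample markup and the variable names $html, $tokenMatches and $exactMatches are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample markup (not from the question): three divs with
// different class attributes.
$html = '<html><body>'
      . '<div class="foo">a</div>'
      . '<div class="foo bar">b</div>'
      . '<div class="foobar">c</div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Token match: "foo" must appear as a whole word in the class list.
$tokenMatches = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");

// Exact match: the class attribute must be exactly "foo".
$exactMatches = $xpath->query("//div[@class = 'foo']");

echo $tokenMatches->length, "\n"; // 2 ("foo" and "foo bar", but not "foobar")
echo $exactMatches->length, "\n"; // 1 (only "foo")
```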
At this point, $nodes holds all the nodes you want to remove, so just iterate over them and remove each one from the document by grabbing the <div>'s parent node and removing its child node:
foreach( $nodes as $node) {
    $node->parentNode->removeChild( $node);
}
That's all it takes! You can see it working in this demo.
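Putting the pieces together, one possible shape for remove_elements_with_class is sketched below. Note the assumptions: unlike the call in the question, this version takes the DOMDocument itself as its second argument rather than a node list, it returns the number of removed elements, and it assumes $class contains no quotes or whitespace, since the name is interpolated directly into the XPath expression.

```php
<?php
/**
 * Remove every <div> whose class list contains $class.
 *
 * Assumptions (not from the question): the second parameter is the
 * DOMDocument, and $class contains no quotes or whitespace.
 * Returns the number of elements removed.
 */
function remove_elements_with_class(string $class, DOMDocument $doc): int
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        "//div[contains(concat(' ', @class, ' '), ' $class ')]"
    );

    $removed = 0;
    foreach ($nodes as $node) {
        // Detach the node from whatever parent currently holds it.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

// Usage with a shortened version of the markup from the question.
$myhtml = '<html><body><div class="whatever">Blabla</div>'
        . '<table><tr><td><div class="foo">GARBAGE</div></td></tr></table>'
        . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$removed = remove_elements_with_class('foo', $doc);
$result = $doc->saveHTML(); // serialized document without the "foo" div
```

Passing the document instead of the result of query("/html") keeps the function free to build its own XPath context; if you need the original signature, you could read the document from the first node's ownerDocument property.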
Edit: To keep the <div> and just remove its contents, set the node's nodeValue property to an empty string:
foreach( $nodes as $node) {
    $node->nodeValue = '';
}
You can see it working in this updated demo. You could also replace the <div> with a newly created <div>, as that approach seems more bulletproof, but this should work for your use-case.
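The replacement approach could be sketched roughly as follows, using DOMDocument::createElement() and DOMNode::replaceChild(). The sample markup and the choice to copy the old element's attributes onto the new one are assumptions for illustration; drop the attribute loop if you want a completely bare <div>.

```php
<?php
// Hypothetical sample markup with nested content inside the target div.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div class="foo">GARBAGE <b>nested</b></div></body></html>');

$xpath = new DOMXPath($doc);
$nodes = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");

foreach ($nodes as $node) {
    // Build an empty replacement <div>.
    $empty = $doc->createElement('div');

    // Assumption: we want to keep the original attributes (class, id, ...).
    foreach ($node->attributes as $attr) {
        $empty->setAttribute($attr->nodeName, $attr->nodeValue);
    }

    // replaceChild() takes the new node first, then the node to replace.
    $node->parentNode->replaceChild($empty, $node);
}

// Re-query to inspect the result: the div is still there, but empty.
$after = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");
```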