Xaver

Reputation: 11682

Remove a DomNode with a certain class in PHP

I have an HTML document (as a string) that contains a div with the class "foo":

<html>
<head>
  ...
</head>
<body>
<div class="whatever">Blabla</div>
<div>
   <span>Text</span>
</div>
<table>
   <tr><td><div class="foo">GARBAGE</div></td></tr>
</table>
</body>
</html>

I would like to remove only the divs with the class "foo", and this is what I have so far:

$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$xpath = new DOMXpath($doc);
$all = $xpath->query("/html");

$result = remove_elements_with_class('foo', $all);

What should the remove_elements_with_class function look like?

Upvotes: 0

Views: 1758

Answers (1)

nickb

Reputation: 59699

After:

$xpath = new DOMXpath($doc);

You need to:

  1. Select all the nodes that you want to remove
  2. Call DOMNode::removeChild() on each node's parent, passing the node to remove

So, to accomplish the first task, you can issue an XPath query that finds all of the <div> nodes whose class is foo. That query would look like:

//div[contains(concat(' ', @class, ' '), ' foo ')]

Note that this also matches elements that have more than one class, e.g. foo bar baz or baz foo bar, while not matching class names that merely contain foo, such as foobar. If this is undesirable and you only want to match elements whose class attribute is exactly foo, the query becomes:

//div[@class = 'foo']

And, in PHP, this becomes:

$nodes = $xpath->query( "//div[contains(concat(' ', @class, ' '), ' foo ')]");
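To make the difference between the two queries concrete, here is a small sketch. The sample markup and the variable names $tokenMatch and $exactMatch are made up for illustration and are not part of the question's document:

```php
<?php
// Illustrative markup only: three divs whose class attributes differ slightly.
$html = '<div class="foo">a</div><div class="foo bar">b</div><div class="foobar">c</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Matches "foo" as one of several space-separated class names.
$tokenMatch = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");

// Matches only when the whole class attribute is exactly "foo".
$exactMatch = $xpath->query("//div[@class = 'foo']");

echo $tokenMatch->length, "\n"; // 2: class="foo" and class="foo bar"
echo $exactMatch->length, "\n"; // 1: class="foo" only
```

The token query assumes class names are separated by plain spaces, which is the usual case.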

From here, you have all the nodes you want to remove in $nodes, so just iterate over them and remove each one from the document by taking the <div>'s parent node and removing the <div> from it:

foreach( $nodes as $node) {
    $node->parentNode->removeChild( $node);
}

That's all it takes! You can see it working in this demo.
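Putting the pieces together, one possible shape for the remove_elements_with_class function from the question is sketched below. The signature is an assumption: it takes the DOMDocument rather than the node list used in the question, and it returns the number of removed nodes. The shortened $myhtml sample is illustrative.

```php
<?php
/**
 * Removes every <div> that carries the given class from $doc.
 * Assumption: $class is a plain class name without quotes or whitespace,
 * because it is interpolated directly into the XPath expression.
 */
function remove_elements_with_class(string $class, DOMDocument $doc): int
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//div[contains(concat(' ', @class, ' '), ' $class ')]");

    $removed = 0;
    foreach ($nodes as $node) {
        // A node is removed through its parent.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

// Usage with a shortened version of the question's document.
$myhtml = '<html><body><div class="whatever">Blabla</div>'
        . '<table><tr><td><div class="foo">GARBAGE</div></td></tr></table>'
        . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($myhtml);

$removed = remove_elements_with_class('foo', $doc);
$result  = $doc->saveHTML();

echo $removed, "\n";
echo $result;
```

Because the document is modified in place, calling $doc->saveHTML() afterwards gives the cleaned markup as a string.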

Edit: To keep the <div> and just remove its contents, set the node's nodeValue property to an empty string:

foreach( $nodes as $node) {
    $node->nodeValue = '';
}

You can see it working in this updated demo. You could also replace each <div> with a newly created, empty <div>, as that approach seems more bulletproof, but setting nodeValue should work for your use case.
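The replacement approach could look roughly like the following sketch. The sample markup is illustrative, and copying only the class attribute onto the new element is an assumption; other attributes would need to be copied as well if they matter.

```php
<?php
// Illustrative markup: a "foo" div containing both text and a child element.
$doc = new DOMDocument();
$doc->loadHTML('<table><tr><td><div class="foo">GARBAGE <b>more</b></div></td></tr></table>');

$xpath = new DOMXPath($doc);
$nodes = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");

foreach ($nodes as $node) {
    // Build a fresh, empty <div> that keeps the original class attribute.
    $replacement = $doc->createElement('div');
    $replacement->setAttribute('class', $node->getAttribute('class'));

    // Swap the old <div> (and everything inside it) for the empty one.
    $node->parentNode->replaceChild($replacement, $node);
}

// Re-query to inspect the result.
$after = $xpath->query("//div[contains(concat(' ', @class, ' '), ' foo ')]");
$html  = $doc->saveHTML();

echo $html;
```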

Upvotes: 4
