Run

Reputation: 57176

DOM parser: remove tags with empty text nodes, but keep <br/>

I posted earlier about removing HTML tags that have an empty text node.

$dom = new DOMDocument();
$dom->loadHtml(
    '<p><strong><a href="http://xx.org.uk/dartmoor-arts">test</a></strong></p>
    <p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
    <p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>
    <p>this line has a <br/>break</p>
    '
);

$xpath = new DOMXPath($dom);


while(($nodeList = $xpath->query('//*[not(text()) and not(node())]')) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}


echo $dom->saveHtml();

It works perfectly, but I don't want it to remove the <br/> tag. How can I keep it?

Upvotes: 1

Views: 2188

Answers (3)

Kirill Polishchuk

Reputation: 56162

Use this XPath (it excludes br nodes):

//*[not(text() or node() or self::br)]
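As a minimal sketch, this is how the expression could slot into the loop from the question. The markup is shortened, and `$html`, `$query` and `$result` are illustrative variable names, not part of the original code. Note that `not(node())` is already true only for elements with no children at all, so it covers the `text()` test too; the expression is kept as written above.

```php
<?php
// Shortened sample markup: one empty link chain and one line with a <br/>.
$html = '<p><strong><a href="http://xx.org.uk/depw"></a></strong></p>'
      . '<p>this line has a <br/>break</p>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select elements without any child nodes, skipping <br>.
$query = '//*[not(text() or node() or self::br)]';

// Repeat until no empty elements are left: removing <a> empties <strong>,
// which in turn empties <p>.
while (($nodeList = $xpath->query($query)) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}

$result = $dom->saveHTML();
echo $result;
```

With this input the empty link paragraph should disappear while the paragraph containing the break is kept.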

Upvotes: 7

Nate

Reputation: 2881

Try replacing your <br/> tags with a placeholder such as [br/] before parsing, then restore them afterwards.

Easy enough trick :)

Upvotes: -3

Yoshi

Reputation: 54649

Just test the $node before removal, like:

if (!in_array($node->nodeName, array('br'))) {  // add further nodes to keep
  $node->parentNode->removeChild($node);
}
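One caveat when combining this check with the `while` loop from the question: the query keeps returning the `<br>` that was skipped, so `$nodeList->length` never drops to 0 and the loop does not terminate. A rough sketch of one way around that, assuming a hypothetical `$removed` counter that ends the loop once a pass removes nothing (`$keep` and `$result` are also illustrative names):

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML(
    '<p><strong><a href="http://xx.org.uk/depw"></a></strong></p>'
    . '<p>this line has a <br/>break</p>'
);
$xpath = new DOMXPath($dom);

$keep = array('br');  // add further nodes to keep

do {
    $removed = 0;
    // Childless elements, including the ones we want to keep.
    foreach ($xpath->query('//*[not(node())]') as $node) {
        if (!in_array($node->nodeName, $keep)) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
} while ($removed > 0);  // stop once a pass removes nothing

$result = $dom->saveHTML();
echo $result;
```

Keeping the list of protected tags in PHP rather than in the XPath makes it easy to extend, at the cost of re-querying the kept nodes on every pass.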

Upvotes: 3
