Grad van Horck

Reputation: 4506

Get HTML-tags by namespace in PHP XPath Query

Let's say I have the following HTML snippet:

<div abc:section="section1">
  <p>Content...</p>
</div>
<div abc:section="section2">
  <p>Another section</p>
</div>

How can I get a DOMNodeList (in PHP) containing a DOMNode for each <div> that has the abc:section attribute set?

Currently I have the following code:

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('abc', 'http://xml.example.com/AbcDocument');

The following XPath queries don't work:

$xpath->query('//@abc:section');
$xpath->query('//*[@abc:section]');

The loaded HTML is always just a snippet. I'm transforming it with the DOMDocument functions and feeding the result to the template.

Upvotes: 0

Views: 1345

Answers (1)

Gordon

Reputation: 316989

The loadHTML method uses the HTML parser module of libxml. As far as I know, the resulting HTML tree does not contain namespaces, so querying by a registered namespace prefix with XPath won't work here. Instead, you can read the attribute by its literal name:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($dom->getElementsByTagName('div') as $node) {
    echo $node->getAttribute('abc:section');
}
echo $dom->saveHTML();

As an alternative, you can use //div/@* to fetch all attributes, which includes the prefixed ones. You cannot use a colon in the query's name test, though, because that requires the namespace prefix to be registered and, as pointed out above, that doesn't work for an HTML tree.

Yet another alternative would be to use //@*[starts-with(name(), "abc:section")].
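
Since the question asks for the <div> elements rather than the attribute nodes, the same name() test can be moved into a predicate on the element. The following is only a minimal sketch, assuming the HTML parser keeps the attribute name as the literal string "abc:section" (which is what the queries above rely on). The sample markup, including the third <div> without the attribute, and the $sections variable are made up for illustration.

```php
<?php
// Sample snippet; the last <div> has no abc:section attribute on purpose.
$html = <<<HTML
<div abc:section="section1">
  <p>Content...</p>
</div>
<div abc:section="section2">
  <p>Another section</p>
</div>
<div>
  <p>No section attribute</p>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// No registerNamespace() call: the attribute is matched by its literal
// name, colon included, via name() inside a predicate on the element.
$divs = $xpath->query('//div[@*[name() = "abc:section"]]');

// Collect the attribute values of the matched <div> elements.
$sections = [];
foreach ($divs as $div) {
    $sections[] = $div->getAttribute('abc:section');
}
```

Here $divs is a DOMNodeList of elements, so it can be passed on or modified like any other node list. Using an exact comparison instead of starts-with() avoids also matching attributes such as abc:sectionTitle.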

Upvotes: 1
