Reputation: 1275
Here's my basic structure:
<div id="PrimaryContentBlock">
<form>
......
I'm trying to select elements from within the form, but XPath isn't finding anything past the PrimaryContentBlock div.
The first query finds the parent node, but the second query finds nothing.
$dom->query('//*[@id="PrimaryContentBlock"]');
$dom->query('//*[@id="PrimaryContentBlock"]/form');
Any idea why XPath would be acting so strange? I've been seeing a lot of inconsistent behavior when working with DOMXPath queries.
Upvotes: 0
Views: 1293
Reputation: 79813
One way this could happen is if you have an XHTML document (with an xmlns declaration on the root html element) and you are parsing it as XML. In such a document all the elements are part of the http://www.w3.org/1999/xhtml namespace, and you need to specify this when querying.
Your first query, //*[@id="PrimaryContentBlock"], will find any element with a matching id attribute, including those in the XHTML namespace (that’s what the * means). The second query, //*[@id="PrimaryContentBlock"]/form, is looking for form elements that are not in any namespace. This fails to match the document since all form elements are in the default XHTML namespace.
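The mismatch can be reproduced with a small sketch. The sample document below is hypothetical (the question does not show the full markup) and assumes the root html element carries the XHTML xmlns declaration and that the document is loaded with loadXML:

```php
<?php
// Hypothetical sample document; assumed, not taken from the question.
$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="PrimaryContentBlock">
      <form></form>
    </div>
  </body>
</html>
XML;

$doc = new DOMDocument();
$doc->loadXML($xhtml);   // parsed as XML, so the default namespace is kept
$xpath = new DOMXPath($doc);

// * matches elements in any namespace, so the div is found.
$divCount = $xpath->query('//*[@id="PrimaryContentBlock"]')->length;

// An unprefixed name test only matches elements in no namespace.
$formCount = $xpath->query('//*[@id="PrimaryContentBlock"]/form')->length;

echo "div matches: $divCount, form matches: $formCount\n";
// prints "div matches: 1, form matches: 0"
```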
The simplest way to fix this, if this is an XHTML document, is to parse it as HTML. If you currently are doing something like:
$domdocument->loadXML(...);
change it to use loadHTML:
$domdocument->loadHTML(...);
If you want to parse the document as XML, then you need to specify the namespace in your query. First register the namespace URI, together with the prefix you are going to use, on the DOMXPath instance, then change your query to include the new prefix:
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('xhtml', "http://www.w3.org/1999/xhtml");
$result = $xpath->query('//*[@id="PrimaryContentBlock"]/xhtml:form');
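For reference, here is a self-contained sketch of the namespace-aware query. The sample markup is again an assumption made for illustration; only the registerNamespace call and the prefixed query come from the lines above:

```php
<?php
// Hypothetical XHTML input; the real document is not shown in the question.
$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="PrimaryContentBlock">
      <form></form>
    </div>
  </body>
</html>
XML;

$doc = new DOMDocument();
$doc->loadXML($xhtml);

$xpath = new DOMXPath($doc);
// The prefix "xhtml" is arbitrary; only the URI has to match the document.
$xpath->registerNamespace('xhtml', 'http://www.w3.org/1999/xhtml');

$forms = $xpath->query('//*[@id="PrimaryContentBlock"]/xhtml:form');

foreach ($forms as $form) {
    // localName is the element name without any namespace prefix.
    echo $form->localName, "\n";
}
```

The prefix only exists inside the XPath expression, so the document itself does not need to use it.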
Upvotes: 1
Reputation: 158280
Given that you have the above structure, and you are sure that the document is well-formed, both of your queries WILL work:
$xml = <<<EOF
<div id="PrimaryContentBlock">
<form></form>
</div>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($xml);
$selector = new DOMXPath($doc);
foreach ($selector->query('//*[@id="PrimaryContentBlock"]/form') as $element) {
    echo $element->nodeName;
}
Output:
form
If the following sentence is true for you:
I've been seeing a lot of inconsistent behavior when working with DOMXPath queries.
... then either you don't have enough experience with XPath, or your input data isn't well-formed. At least one of those two reasons applies to me whenever I have problems with a particular query.
Upvotes: 0