Reputation: 63
I am trying to use XPath (Java) to select every element, whatever its name, that has any attribute whose value starts with a specific string. For some reason, it does not return an element whose matching attribute is named value. I also tested at www.freeformatter.com/xpath-tester.html and got the same result. Here is what I have:
XML:
<div>
<object data="/v1/assets/mp4Video-1" type="video/mp4">
<param name="webmSource" value="/v1/assets/webmVideo-1" type="REF"/>
</object>
</div>
XPath Expression:
//*[starts-with(@*, '/v1/assets/')]
Results: this returns the <object> element, but not the <param> element.
Now, if I change the XPath expression to
//*[starts-with(@*, '/v1/assets/') or starts-with(@value, '/v1/assets/')]
it returns both as expected.
I guess my question is: what is it about the value attribute that causes XPath to not properly recognize it as an attribute, or to not return the element when the value attribute contains the value I am querying for?
Upvotes: 4
Views: 1730
Reputation: 22617
The reason why your original path expression:
//*[starts-with(@*, '/v1/assets/')]
does not work has to do with how functions in XPath 1.0 cope with more nodes than expected. The starts-with() function expects a string as each of its two arguments; an argument of any other type is first converted to a string.
But in the expression above, starts-with() is handed a set of attribute nodes, @*, as its first argument. When a node-set is converted to a string, only the string value of its first node in document order is used. All other nodes in the set are ignored. Since the relative order of attribute nodes is implementation-dependent in XPath 1.0, the XPath engine is free to treat any attribute as the first one. But your specific XPath engine (and many others) appears to consistently use the first attribute node, in order of appearance.
To illustrate this (and to prove it), change your input document to
<div>
<object data="other" type="/v1/assets/mp4Video-1">
<param name="/v1/assets/webmVideo-1" value="other" type="REF"/>
</object>
</div>
As you can see, I have swapped the attribute values: the value starting with /v1/assets/ now sits in the second attribute of the object element and in the first attribute of the param element. Using this input document, your original XPath expression will only return the param element.
Again, this behaviour is not necessarily consistent between different XPath engines! Using other implementations of XPath might yield different results.
The XPath expression that does what you need is
//*[@*[starts-with(., '/v1/assets/')]]
In plain English, it says:
select elements anywhere in the document, but only if, among all attribute nodes of an element, there is an attribute whose value starts with "/v1/assets/".
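Since the question mentions Java, here is a minimal sketch of how the corrected expression could be checked with the JDK's built-in javax.xml.xpath API. The class name AttributePrefixDemo and the helper matchingElementNames are made up for illustration, and the XML is the document from the question written with single-quoted attributes so it fits in a Java string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributePrefixDemo {
    // The document from the question (single quotes are valid XML attribute delimiters).
    static final String XML =
        "<div>"
        + "<object data='/v1/assets/mp4Video-1' type='video/mp4'>"
        + "<param name='webmSource' value='/v1/assets/webmVideo-1' type='REF'/>"
        + "</object>"
        + "</div>";

    // Hypothetical helper: evaluates an XPath expression against an XML string
    // and returns the names of the selected nodes in the order they are returned.
    static List<String> matchingElementNames(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The inner predicate tests every attribute on its own, so the result
        // does not depend on which attribute the engine considers "first".
        System.out.println(matchingElementNames(XML, "//*[@*[starts-with(., '/v1/assets/')]]"));
    }
}
```

With the document from the question this should list both object and param. Passing the original expression to the same helper is an easy way to see which attribute your particular engine picks.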
Upvotes: 3
Reputation: 3903
Try
//@*[starts-with(., '/v1/assets/')]
Returns all the attributes whose value starts with /v1/assets/
//*[@*[starts-with(., '/v1/assets/')]]
Returns all the elements that have at least one such attribute
Both expressions test every attribute of every element.
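As a rough illustration of the attribute-selecting variant in Java, the sketch below evaluates it with javax.xml.xpath and reports each matching attribute together with its owner element. The class MatchingAttributesDemo and the method describeMatchingAttributes are hypothetical names, and the sketch assumes the JDK's default DOM-backed XPath implementation, which returns org.w3c.dom.Attr nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MatchingAttributesDemo {
    // The document from the question, with single-quoted attributes.
    static final String XML =
        "<div>"
        + "<object data='/v1/assets/mp4Video-1' type='video/mp4'>"
        + "<param name='webmSource' value='/v1/assets/webmVideo-1' type='REF'/>"
        + "</object>"
        + "</div>";

    // Hypothetical helper: selects the matching attribute nodes themselves and
    // describes each one as "element/@attribute=value".
    static List<String> describeMatchingAttributes(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
            "//@*[starts-with(., '/v1/assets/')]",
            new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> described = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            // Assumption: the result nodes are DOM Attr instances.
            Attr attr = (Attr) attrs.item(i);
            described.add(attr.getOwnerElement().getNodeName()
                + "/@" + attr.getName() + "=" + attr.getValue());
        }
        return described;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeMatchingAttributes(XML)) {
            System.out.println(line);
        }
    }
}
```

Selecting the attributes rather than the elements is handy when you need to know which attribute matched, for example to rewrite the asset paths.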
Upvotes: 2