Reputation: 7594
I am trying to parse a document using Dom4J. This document comes from various providers, and sometimes comes with namespaces and sometimes without.
For example:
<book>
<author>john</author>
<publisher>
<name>John Q</name>
</publisher>
</book>
or
<book xmlns="http://schemas.xml.com/XMLSchemaInstance">
<author>john</author>
<publisher>
<name>John Q</name>
</publisher>
</book>
or
<book xmlns:i="http://schemas.xml.com/XMLSchemaInstance">
<i:author>john</i:author>
<i:publisher>
<i:name>John Q</i:name>
</i:publisher>
</book>
I have a list of XPaths. I parse the document into a Document object, and then search it using those XPaths.
Document doc = parseDocument(documentFile);
// List is an interface, so the concrete type is java.util.ArrayList
List<String> xmlPaths = new ArrayList<String>();
xmlPaths.add("book/author");
xmlPaths.add("book/publisher/name");
for (String searchPath : xmlPaths)
{
    // every path is expected to match a node, whichever provider sent the document
    Node currentNode = doc.selectSingleNode(searchPath);
    assert currentNode != null;
}
This code does not work on the last document, the one that is using namespace prefixes.
I tried these techniques, but none of them seem to work.
1) Changing the last element in the XPath to be namespace-neutral:
/book/:author
/book/[local-name()='author']
/[local-name()='book']/[local-name()='author']
All of these throw an exception saying that the XPath format is not correct.
2) Adding namespace URIs to the XPath after creating it with DocumentHelper.createXPath().
Any idea what I am doing wrong?
FYI I am using dom4j version 1.5
Upvotes: 1
Views: 151
Reputation: 4238
Your XPath does not contain a tag name. The general syntax in your case would be
/TAGNAMEPARENT[CONDITION_PARENT]/TAGNAMECHILD[CONDITION_CHILD]
The important aspect is that the tag names are mandatory while the conditions are optional. If you do not want to specify a tag name, you have to use * for "any tag". There may be performance implications for large XML files, since the processor always has to iterate over a node set instead of using an index lookup. Maybe @MichaelKay can comment on this.
Try this instead:
/*[local-name()='book']/*[local-name()='author']
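To check that this form matches all three document variants from the question, here is a minimal sketch. It deliberately uses the JDK's built-in javax.xml.xpath rather than dom4j so that it runs without extra dependencies. The class name LocalNameXPathDemo and the helper selectText are made up for illustration. Since the expression is plain XPath 1.0, passing the same string to dom4j's selectSingleNode should behave the same way, but I have not verified that against dom4j 1.5.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of dom4j or of the question's code.
final class LocalNameXPathDemo {

    // Namespace-neutral paths: every step is "*" filtered by local-name().
    static final String AUTHOR =
            "/*[local-name()='book']/*[local-name()='author']";
    static final String PUBLISHER_NAME =
            "/*[local-name()='book']/*[local-name()='publisher']/*[local-name()='name']";

    // Parses the XML string and returns the text of the first node matched by xpath
    // (an empty string if nothing matches).
    static String selectText(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is needed so local-name() sees "author", not "i:author".
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://schemas.xml.com/XMLSchemaInstance";
        String plain = "<book><author>john</author>"
                + "<publisher><name>John Q</name></publisher></book>";
        String defaultNs = "<book xmlns='" + ns + "'><author>john</author>"
                + "<publisher><name>John Q</name></publisher></book>";
        String prefixed = "<book xmlns:i='" + ns + "'><i:author>john</i:author>"
                + "<i:publisher><i:name>John Q</i:name></i:publisher></book>";

        for (String xml : new String[] {plain, defaultNs, prefixed}) {
            System.out.println(selectText(xml, AUTHOR) + " / " + selectText(xml, PUBLISHER_NAME));
        }
    }
}
```

Each of the three documents should yield the author and the publisher name. The trade-off is that local-name() ignores the namespace entirely, so elements with the same local name from unrelated namespaces would match as well.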
Upvotes: 2