Reputation: 11703
I want to check to see if an XML document contains a 'person' element anywhere inside. I can check all the first-generation elements very simply:
NodeList nodeList = root.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
    Node childNode = nodeList.item(i);
    // compare string contents with equals(); == only compares references
    if ("person".equals(childNode.getNodeName())) {
        // do something with it
    }
}
I can add more loops to go into subelements, but I would have to know how many nested loops to put in to determine how far into the document to drill. I could nest 10 loops and still miss a person element nested 12 elements deep in a given document. I need to be able to pull out the element no matter how deeply nested it is.
Is there a way to harvest elements from an entire document? For example, returning the text values of all tags as an array, or iterating over them?
Something akin to the 'findall' method of Python's ElementTree, perhaps:
for person in tree.findall('.//person'):
    personlist.append(person)
Upvotes: 7
Views: 41170
Reputation: 1568
As mmyers states, you could use recursion for this problem.
doSomethingWithAll(root.getChildNodes());
void doSomethingWithAll(NodeList nodeList)
{
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node childNode = nodeList.item(i);
        if (childNode.getNodeName().equals("person")) {
            // do something with it
        }
        NodeList children = childNode.getChildNodes();
        if (children != null)
        {
            doSomethingWithAll(children);
        }
    }
}
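For anyone who wants to try this out end to end, here is a minimal, self-contained sketch of the same recursive idea that collects the matches into a list instead of handling them in place. The class name RecursiveFinder, the parse helper, and the sample XML are all made up for illustration; only the standard JDK DOM classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RecursiveFinder {
    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walks the subtree below 'node' depth-first and appends every element
    // whose tag name equals 'name' to 'found', in document order.
    static void collect(Node node, String name, List<Element> found) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && name.equals(child.getNodeName())) {
                found.add((Element) child);
            }
            // Recurse regardless of a match, so nested matches are found too.
            collect(child, name, found);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): one shallow and one deeply nested person.
        Document doc = parse(
                "<root><person>Ann</person>"
                + "<group><team><person>Bob</person></team></group></root>");
        List<Element> found = new ArrayList<>();
        collect(doc.getDocumentElement(), "person", found);
        for (Element person : found) {
            System.out.println(person.getTextContent());
        }
    }
}
```

Note that starting from the document element, as above, only inspects its descendants; pass the Document itself if the root element could also be a match.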
Upvotes: 10
Reputation: 21
Here is the formatted version:
Element root = xmlData.getDocumentElement();
NodeList children = root.getChildNodes();
public void doSomethingWithAllToConsole(NodeList nodeList, String tabs)
{
    for (int i = 0; i < nodeList.getLength(); i++) {
        // print current node & values
        Node childNode = nodeList.item(i);
        if (childNode.getNodeType() == Node.ELEMENT_NODE) {
            System.out.print(tabs + childNode.getNodeName());
            if (childNode.getFirstChild() != null
                    && childNode.getFirstChild().getNodeType() == Node.TEXT_NODE
                    && !StringUtil.isNullOrEmpty(childNode.getFirstChild().getNodeValue())) {
                System.out.print(" = " + childNode.getFirstChild().getNodeValue());
            }
            System.out.println();
        }
        // recursively iterate through child nodes
        NodeList children = childNode.getChildNodes();
        if (children != null)
        {
            doSomethingWithAllToConsole(children, tabs + "\t");
        }
    }
}
Upvotes: 2
Reputation: 220977
Apart from Document.getElementsByTagName() or XPath, you could also use jOOX, a library that I have created for simpler XML access and manipulation. jOOX wraps standard Java APIs and adds jQuery-like utility methods. Your Python code snippet would then translate to this Java code:
// Just looking for tag names
for (Element person : $(tree).find("person")) {
    personlist.add(person); // java.util.List uses add(), not append()
}
// Use XPath for more elaborate queries
for (Element person : $(tree).xpath("//person")) {
    personlist.add(person);
}
Upvotes: 0
Reputation: 26291
I see three possibilities (two of which others have answered):
If you have the document as a Document (that is, if root is a Document), you can use Document.getElementsByTagName.
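As a rough sketch of that last option, assuming the XML has already been parsed into an org.w3c.dom.Document: getElementsByTagName returns every descendant element with the given tag name, in document order, however deeply nested. The class name TagNameLookup, the parse helper, and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagNameLookup {
    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): person elements at different depths.
        Document doc = parse(
                "<root><person>Ann</person>"
                + "<a><b><person>Bob</person></b></a></root>");

        // One call finds matches at any depth; no explicit recursion needed.
        NodeList people = doc.getElementsByTagName("person");
        for (int i = 0; i < people.getLength(); i++) {
            System.out.println(people.item(i).getTextContent());
        }
    }
}
```

If root is an Element rather than a Document, org.w3c.dom.Element offers a getElementsByTagName method as well, which searches that element's descendants.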
Upvotes: 10
Reputation: 39606
That's what XPath is for. To get all elements named "person", here's the expression:
//person
It can be painful to use the JDK's XPath APIs directly. I prefer the wrappers that I wrote in the Practical XML library: http://practicalxml.sourceforge.net/
And here's a tutorial that I wrote (on JDK XPath in general, but mentions XPathWrapper): http://www.kdgregory.com/index.php?page=xml.xpath
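For comparison, here is a minimal sketch of evaluating that expression with the JDK's own javax.xml.xpath API, without any wrapper library. The class name XPathLookup, the findPeople helper, and the sample XML are made up for illustration; this is not the Practical XML API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLookup {
    // Hypothetical helper: parses the XML string and returns all 'person'
    // elements found anywhere in the document.
    static NodeList findPeople(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // NODESET asks for the result as an org.w3c.dom.NodeList.
            return (NodeList) xpath.evaluate("//person", doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): person elements at different depths.
        NodeList people = findPeople(
                "<root><person>Ann</person>"
                + "<a><b><person>Bob</person></b></a></root>");
        for (int i = 0; i < people.getLength(); i++) {
            System.out.println(people.item(i).getTextContent());
        }
    }
}
```

The cast and the checked exceptions are part of what makes the raw API feel clumsy; for a plain tag-name search they are tolerable, and the same pattern extends to more selective expressions.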
Upvotes: 4