directedition

Reputation: 11703

Iterate all XML node generations in java DOM

I want to check to see if an XML document contains a 'person' element anywhere inside. I can check all the first-generation elements very simply:

NodeList nodeList = root.getChildNodes();
for(int i=0; i<nodeList.getLength(); i++){
  Node childNode = nodeList.item(i);
  if ("person".equals(childNode.getNodeName())) { // compare strings with equals(), not ==
     //do something with it
  }
}

I can add more loops to go into subelements, but I would have to know in advance how many nested loops are needed to reach the deepest level of the document. I could nest 10 loops and still miss a person element nested 12 levels deep in a given document. I need to be able to pull out the element no matter how deeply nested it is.

Is there a way to harvest elements from an entire document? For example, to return the text values of all matching tags as an array, or to iterate over them?

Something akin to python's elementtree 'findall' method perhaps:

for person in tree.findall('//person'):
   personlist.append(person)

Upvotes: 7

Views: 41170

Answers (5)

user125661

Reputation: 1568

As mmyers states, you could use recursion for this problem.

doSomethingWithAll(root.getChildNodes());

void doSomethingWithAll(NodeList nodeList)
{
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node childNode = nodeList.item(i);
        if (childNode.getNodeName().equals("person")) {
            //do something with it
        }

        NodeList children = childNode.getChildNodes();
        if (children != null)
        {
            doSomethingWithAll(children);
        }
    }
}

Upvotes: 10

parser

Reputation: 21

Here is the formatted version:

Element root = xmlData.getDocumentElement();  
NodeList children = root.getChildNodes(); 

public void doSomethingWithAllToConsole(NodeList nodeList, String tabs)
{
    for(int i=0; i<nodeList.getLength(); i++){

      //print current node & values
      Node childNode = nodeList.item(i);
      if(childNode.getNodeType()==Node.ELEMENT_NODE){
          System.out.print(tabs + childNode.getNodeName());
          if(childNode.getFirstChild()!=null 
                  && childNode.getFirstChild().getNodeType()==Node.TEXT_NODE
                  && !StringUtil.isNullOrEmpty(childNode.getFirstChild().getNodeValue()) ){
              System.out.print(" = " + childNode.getFirstChild().getNodeValue());
          }
          System.out.println();
      }

      //recursively iterate through child nodes
      NodeList children = childNode.getChildNodes();
      if (children != null)
      {
          doSomethingWithAllToConsole(children, tabs+"\t");
      }
    }
}

Upvotes: 2

Lukas Eder

Reputation: 220977

Apart from Document.getElementsByTagName() or XPath, you could also use jOOX, a library that I have created for simpler XML access and manipulation. jOOX wraps the standard Java APIs and adds jQuery-like utility methods. Your Python code snippet would then translate to this Java code:

// Just looking for tag names
for (Element person : $(tree).find("person")) {
  personlist.add(person); // java.util.List uses add(), not append()
}

// Use XPath for more elaborate queries
for (Element person : $(tree).xpath("//person")) {
  personlist.add(person);
}

Upvotes: 0

Kathy Van Stone

Reputation: 26291

I see three possibilities (two of which others have answered):

  1. Use recursion.
  2. Use XPath (might be a bit overkill for this problem, but if you have a lot of queries like this it is definitely something to explore). Use kdgregory's help on that; a quick look at the API indicated that it is a bit painful to use directly.
  3. If what you have is in fact a Document (that is, if root is a Document), you can use Document.getElementsByTagName.
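
Option 3 can be sketched as follows. This is a minimal example, not a definitive implementation: the class name TagNameSearch, the helpers parse and findAll, and the sample XML are all made up for illustration. Note that org.w3c.dom.Element also declares getElementsByTagName, so the same approach should work when root is an Element rather than a Document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagNameSearch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Collects every element with the given tag name, at any nesting depth.
    static List<Element> findAll(Document doc, String tagName) {
        NodeList matches = doc.getElementsByTagName(tagName);
        List<Element> result = new ArrayList<>();
        for (int i = 0; i < matches.getLength(); i++) {
            // getElementsByTagName only returns element nodes, so the cast is safe.
            result.add((Element) matches.item(i));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative document: one shallow and one deeply nested <person>.
        String xml = "<root><person>Ann</person>"
                   + "<group><team><person>Bob</person></team></group></root>";
        for (Element person : findAll(parse(xml), "person")) {
            System.out.println(person.getTextContent());
        }
    }
}
```

Both person elements are found regardless of depth, in document order, without any explicit recursion in the calling code.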

Upvotes: 10

kdgregory

Reputation: 39606

That's what XPath is for. To get all elements named "person", here's the expression:

//person

It can be painful to use the JDK's XPath APIs directly. I prefer the wrappers that I wrote in the Practical XML library: http://practicalxml.sourceforge.net/

And here's a tutorial that I wrote (on JDK XPath in general, but mentions XPathWrapper): http://www.kdgregory.com/index.php?page=xml.xpath
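
For comparison, here is roughly what evaluating //person looks like with the plain JDK API (javax.xml.xpath), without any wrapper library. Treat it as a sketch under stated assumptions: the class name XPathSearch, the helpers parse and select, and the sample XML are invented for this example and are not part of Practical XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSearch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates an XPath expression against the document and returns the node-set.
    static NodeList select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate() returns Object; with NODESET the result is a NodeList.
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Illustrative document: one shallow and one deeply nested <person>.
        String xml = "<root><person>Ann</person>"
                   + "<group><team><person>Bob</person></team></group></root>";
        NodeList people = select(parse(xml), "//person");
        for (int i = 0; i < people.getLength(); i++) {
            System.out.println(people.item(i).getTextContent());
        }
    }
}
```

The cast from Object and the checked exceptions are the kind of friction a wrapper is meant to hide. If the same expression is evaluated many times, compiling it once with XPath.compile may be worth considering.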

Upvotes: 4
