DT7

Reputation: 1609

getElementsByTagName searching down all levels of XML nodes

I have this XML file:

<root>
    <node1>
        <name>A</name>
        <node2>
            <name>B</name>
            <node3>
                <name>C</name>
                <number>001</number>
            </node3>
        </node2>
    </node1>
</root>

I am parsing the file to get the name of each node and, if it exists, the corresponding number.

I use:

String number = eElement.getElementsByTagName("number").item(0).getTextContent();

This should give me something like:

Name | Number
A    | 
B    |
C    | 001

But I get:

Name | Number
A    | 001
B    | 001
C    | 001

So I think getElementsByTagName("number") is searching for a number node among all descendants of the element, not just its direct children. I don't want that. Does anybody know a workaround?

I thought of using XPath instead of the above method, but I would really like to know whether there is an existing way to do this. Thanks.

Upvotes: 1

Views: 8641

Answers (3)

bdoughan

Reputation: 149037

You can use the javax.xml.xpath APIs in the JDK/JRE to get much more control over which nodes are returned than getElementsByTagName gives you.

import java.io.File;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document document = docBuilder.parse(new File("filename.xml"));

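        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        // Select one specific element by its path; unlike getElementsByTagName,
        // the expression states exactly where the element must sit in the tree.
        Element element = (Element) xpath.evaluate("//node3/name", document, XPathConstants.NODE);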
    }

}
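To produce the Name | Number table from the question, one option is to evaluate a relative expression against each element: a path such as number, with no leading //, only matches direct children of the context node. Here is a minimal sketch, assuming the XML is inlined as a string instead of read from a file. The class name XPathDirectChildDemo and the rows helper are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathDirectChildDemo {

    // The sample document from the question, inlined so the sketch runs without a file.
    static final String XML =
        "<root><node1><name>A</name><node2><name>B</name>"
        + "<node3><name>C</name><number>001</number></node3>"
        + "</node2></node1></root>";

    // Returns one "name|number" row per element that has a direct <name> child.
    static List<String> rows(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // "//*[name]" selects every element, at any depth, that has a <name> child.
            NodeList elements = (NodeList) xpath.evaluate(
                    "//*[name]", document, XPathConstants.NODESET);

            List<String> rows = new ArrayList<>();
            for (int i = 0; i < elements.getLength(); i++) {
                Node element = elements.item(i);
                // A relative path without "//" only looks at direct children.
                String name = xpath.evaluate("name", element);
                // Evaluates to the empty string when there is no direct <number> child.
                String number = xpath.evaluate("number", element);
                rows.add(name + "|" + number);
            }
            return rows;
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        rows(XML).forEach(System.out::println);
    }
}
```

For the sample document, only the row for C should carry a number, because node1 and node2 have no number element as a direct child.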

Upvotes: 2

Chriki

Reputation: 16378

Assuming that your eElement variable is always one of the <node1/>, <node2/>, … elements in question, the following code should work as a replacement for the snippet in your question:

String number = null;
NodeList childNodes = eElement.getChildNodes();
for (int i = 0; i < childNodes.getLength(); i++) {
  Node node = childNodes.item(i);
  if (node.getNodeType() == Node.ELEMENT_NODE
      && node.getNodeName().equals("number")) {
    number = node.getTextContent();
    break;
  }
}

The number variable will be null when there is no <number/> child; it will contain the number you need otherwise.
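If it helps, the same direct-child check can be pulled into a small helper and used while walking the whole tree. This is only a sketch under a few assumptions: the XML is inlined as a string, and the names DirectChildDemo, directChildText, collect and table are hypothetical, not part of the DOM API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DirectChildDemo {

    // The sample document from the question, inlined so the sketch runs without a file.
    static final String XML =
        "<root><node1><name>A</name><node2><name>B</name>"
        + "<node3><name>C</name><number>001</number></node3>"
        + "</node2></node1></root>";

    // Text of the first direct child element with the given tag name, or null if none.
    static String directChildText(Element parent, String tagName) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().equals(tagName)) {
                return child.getTextContent();
            }
        }
        return null;
    }

    // Depth-first walk: appends "name | number" for every element with a direct <name> child.
    static void collect(Element element, StringBuilder out) {
        String name = directChildText(element, "name");
        if (name != null) {
            String number = directChildText(element, "number");
            out.append(name).append(" | ")
               .append(number == null ? "" : number).append('\n');
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, out);
            }
        }
    }

    // Parses the XML string and returns the table as text, one row per line.
    static String table(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            collect(document.getDocumentElement(), out);
            return out.toString();
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(table(XML));
    }
}
```

Because the helper never descends below the direct children, the rows for A and B should come out with an empty number column.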

Upvotes: 0

Naveen

Reputation: 367

Hope this helps,

import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;


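public class XML {

    public static void main(String[] args) throws IOException {
        File input = new File("D:\\sample.xml");
        Document doc = Jsoup.parse(input, "UTF-8");

        // Print the combined text of <root> and everything below it.
        Elements allElements = doc.select("root");
        for (Element value : allElements) {
            System.out.println(value.text());
        }

        // "node3 > number" matches only <number> elements that are direct children of <node3>.
        // (Elements.tagName("number") would rename the matched elements instead of selecting them.)
        String node3Num = doc.select("node3 > number").text();
        System.out.println(node3Num);
    }

}

Output:

A B C 001
001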

I used the jsoup-1.7.2 jar (you can download it from jsoup.org).

Upvotes: 1
