Pale Blue Dot

Reputation: 591

Read only a few XML elements efficiently

I want to read only a few XML tag values. I have written the code below. The real XML is big and a bit complex, but I have simplified it for this example. Is there a more efficient way to do this? I am using Java 8.

DocumentBuilderFactory dbfaFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = dbfaFactory.newDocumentBuilder();
Document doc = documentBuilder.parse("xml_val.xml");

System.out.println(doc.getElementsByTagName("date_added").item(0).getTextContent());



<item_list id="item_list01">
   <numitems_intial>5</numitems_intial>
   <item>
      <date_added>1/1/2014</date_added>
      <added_by person="person01" />
   </item>
   <item>
      <date_added>1/6/2014</date_added>
      <added_by person="person05" />
   </item>
   <numitems_current>7</numitems_current>
   <manager person="person48" />
</item_list>

Upvotes: 1

Views: 428

Answers (2)

Michael Kay

Reputation: 163418

Some suggestions.

Firstly, don't use DOM. There's a wide range of DOM-like XML tree representations available in Java; DOM is the first and the worst. Later third-party models like JDOM2 and XOM are much better designed.

Secondly, consider doing the whole thing in an XML-oriented language like XSLT or XQuery rather than in Java. In XQuery, using Saxon's XQuery API, this would be:

Processor proc = new Processor(false);
XQueryCompiler comp = proc.newXQueryCompiler();
XQueryExecutable exec = comp.compile("//date_added");
XQueryEvaluator eval = exec.load();
eval.setSource(new StreamSource(new File("/home/luis/tmp/test.xml")));
for (XdmItem item : eval.evaluate()) {
  System.out.println(item.getStringValue());
}

But since the query is so simple, Saxon also has a direct map/reduce style API to access the tree. This would be:

// Requires: import static net.sf.saxon.s9api.streams.Steps.descendant;
Processor proc = new Processor(false);
XdmNode doc = proc.newDocumentBuilder().build(
  new StreamSource(new File("/home/luis/tmp/test.xml")));
for (XdmItem item : doc.select(descendant("date_added")).asList()) {
  System.out.println(item.getStringValue());
}
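If even a lightweight tree is more than you need, a streaming parser is another option. The following is only a sketch of that different technique, using the JDK's built-in StAX API rather than Saxon; the class name `StaxDateReader`, the helper `readFirst` and the inlined `SAMPLE` document are made up for illustration. It stops reading as soon as it has collected the requested number of values, so the rest of a large file is never parsed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxDateReader {

    // Hypothetical sample, cut down from the XML in the question.
    static final String SAMPLE = "<item_list id=\"item_list01\">"
            + "<item><date_added>1/1/2014</date_added><added_by person=\"person01\"/></item>"
            + "<item><date_added>1/6/2014</date_added><added_by person=\"person05\"/></item>"
            + "</item_list>";

    // Collects the text of at most `limit` elements named `tag`, then stops reading.
    static List<String> readFirst(String xml, String tag, int limit) {
        List<String> values = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (values.size() < limit && reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && tag.equals(reader.getLocalName())) {
                        // getElementText() is only valid for text-only elements
                        values.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readFirst(SAMPLE, "date_added", 1)); // prints [1/1/2014]
    }
}
```

For a real file you would pass a `FileInputStream` to `createXMLStreamReader` instead of a `StringReader`. The trade-off is that you lose XPath-style navigation and have to track context yourself.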

A suggestion that has nothing to do with efficiency: please use international standard dates. 1/6/2014 could be 1st June or 6th January. Writing it as 2014-06-01 (or 2014-01-06 if that's what you intended) not only avoids the kind of dangerous bugs that arise if you use an ambiguous format, it also means you can use standard date-and-time processing libraries, such as the XPath 2.0+ function library.
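As a rough illustration of the date point, here is a minimal Java 8 sketch using `java.time`. The class name `IsoDateExample` and the helper `toIso` are made up for this example, and which of the two source formats actually applies to your data is an assumption only you can confirm.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class IsoDateExample {

    // The two plausible readings of a value such as "1/6/2014".
    static final DateTimeFormatter DAY_FIRST = DateTimeFormatter.ofPattern("d/M/uuuu");
    static final DateTimeFormatter MONTH_FIRST = DateTimeFormatter.ofPattern("M/d/uuuu");

    // Parses the raw value with the given source format and returns it as ISO 8601 (yyyy-MM-dd).
    static String toIso(String raw, DateTimeFormatter sourceFormat) {
        return LocalDate.parse(raw, sourceFormat).toString();
    }

    public static void main(String[] args) {
        System.out.println(toIso("1/6/2014", DAY_FIRST));   // prints 2014-06-01
        System.out.println(toIso("1/6/2014", MONTH_FIRST)); // prints 2014-01-06
    }
}
```

The same input yields two different dates depending on the assumed format, which is exactly the ambiguity an ISO date in the XML would remove.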

Upvotes: 1

LMC

Reputation: 12767

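Using XPath and passing a specific expression to get the desired element:

import java.io.FileInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class MainJaxbXpath {

    public static void main(String[] args) {
        // try-with-resources closes the stream even if parsing fails
        try (FileInputStream fileIS = new FileInputStream("/home/luis/tmp/test.xml")) {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document xmlDocument = builder.parse(fileIS);

            XPath xPath = XPathFactory.newInstance().newXPath();
            String expression = "//item_list[@id=\"item_list01\"]//date_added[1]";
            // With XPathConstants.STRING, the result is the text of the first matching node
            String dateAdded = (String) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.STRING);
            System.out.println(dateAdded);
        } catch (SAXException | IOException | ParserConfigurationException | XPathExpressionException e3) {
            e3.printStackTrace();
        }
    }

}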

Result:

1/1/2014

To look for more than one element in the same operation:

        // Parentheses make [n] index the whole result instead of applying per parent element
        String expression01 = "(//item_list[@id=\"item_list01\"]//date_added)[1]";
        String expression02 = "(//item_list[@id=\"item_list02\"]//date_added)[2]";
        // "|" is the XPath union operator: nodes matched by either expression are returned
        String expression = String.format("%s | %s", expression01, expression02);
        NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node currentNode = nodeList.item(i);
            if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
                System.out.println(currentNode.getTextContent());
            }
        }
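One detail worth checking when adapting expressions like these: in XPath 1.0 a positional predicate such as `date_added[1]` is evaluated relative to each parent element, so `//date_added[1]` can match one node per `item`, whereas `(//date_added)[1]` indexes the whole result in document order. The sketch below tries both forms against an inlined copy of the question's sample; the class name `XPathPositionDemo` and the helper `select` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathPositionDemo {

    // Hypothetical sample, cut down from the XML in the question.
    static final String SAMPLE = "<item_list id=\"item_list01\">"
            + "<item><date_added>1/1/2014</date_added></item>"
            + "<item><date_added>1/6/2014</date_added></item>"
            + "</item_list>";

    // Evaluates the expression against SAMPLE and returns the text of each matched node.
    static List<String> select(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//date_added[1]"));   // prints [1/1/2014, 1/6/2014]
        System.out.println(select("(//date_added)[1]")); // prints [1/1/2014]
        System.out.println(select("//date_added[2]"));   // prints []
        System.out.println(select("(//date_added)[2]")); // prints [1/6/2014]
    }
}
```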

Upvotes: 2
