Iterating over XML DOM documents in standard Java

Question

Is there a direct substitute in standard Java (J2SE) 6 (no JDOM or other third party libraries) for the Apache Xerces XML DOM document parsing and iterating classes? Using the Sun J2SE XML implementation, i.e.,

com.sun.org.apache.xerces.internal.dom.DocumentImpl
com.sun.org.apache.xerces.internal.dom.NodeIteratorImpl
com.sun.org.apache.xerces.internal.parsers.DOMParser

causes warnings by the compiler that each use of these classes "is Sun proprietary API and may be removed in a future release".

The code I currently use for instantiating a generic NodeIterator object is

org.w3c.dom.Document document = ...; // Get an XML DOM Document object

org.w3c.dom.traversal.NodeIterator iterator = new NodeIteratorImpl(new DocumentImpl(
    document.getDoctype()), document.getDocumentElement(),
    NodeFilter.SHOW_ALL, null, true);

The NodeIterator visits the nodes of an XML DOM document in depth-first order, which is essential for my application, so the above code needs to be replaced with something functionally equivalent.

Also, would the best (= standard and fastest) way of creating a new org.w3c.dom.Document object be something like

Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);

where source is a properly instantiated org.xml.sax.InputSource object?

Ian Roberts · Accepted Answer

NodeIterator is part of the DOM Level 2 Traversal specification and lives in the org.w3c.dom.traversal package. You can check whether a particular Document supports traversal by testing whether it is an instanceof org.w3c.dom.traversal.DocumentTraversal (a Document produced by the standard implementation in the JRE should be). If so, you can cast it and do

NodeIterator ni = ((DocumentTraversal)document).createNodeIterator(
      document.getDocumentElement(), NodeFilter.SHOW_ALL, null, true);
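
For reference, here is a minimal, self-contained sketch of the whole round trip using only the standard javax.xml.parsers and org.w3c.dom packages. The class name DomTraversalSketch and the helpers parse and elementNames are made up for illustration, and the sketch uses NodeFilter.SHOW_ELEMENT rather than SHOW_ALL only to keep the output short. It assumes the parser behind DocumentBuilderFactory returns a Document that implements DocumentTraversal, and fails loudly if it does not.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

public class DomTraversalSketch {

    // Hypothetical helper: parse an XML string with the standard JAXP factory.
    // Checked parser exceptions are wrapped so callers need no throws clause.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: collect element names in document order
    // (depth-first, pre-order), starting at the document element.
    public static List<String> elementNames(String xml) {
        Document document = parse(xml);

        // Traversal is an optional DOM module, so check before casting.
        if (!(document instanceof DocumentTraversal)) {
            throw new IllegalStateException("DOM implementation does not support traversal");
        }

        NodeIterator iterator = ((DocumentTraversal) document).createNodeIterator(
                document.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

        List<String> names = new ArrayList<String>();
        // nextNode() returns null once the iterator is exhausted.
        for (Node node = iterator.nextNode(); node != null; node = iterator.nextNode()) {
            names.add(node.getNodeName());
        }
        iterator.detach(); // release the iterator when done
        return names;
    }

    public static void main(String[] args) {
        // prints [a, b, c, d]
        System.out.println(elementNames("<a><b/><c><d/></c></a>"));
    }
}
```

Passing NodeFilter.SHOW_ALL instead would also return text, comment and other node types in the same document order, which matches the iterator in the question.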
