SANTHOSH_L

Reputation: 173

Obtaining the XPath of all nodes in an XML document

I have an XML document and want to print the complete XPath of every node in it using Java. I'm trying to use a DOM parser.

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

File stocks = new File("File Name");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(stocks);
System.out.println("Parsed successfully");
System.out.println("root of xml file : " + doc.getDocumentElement().getNodeName());

I'm able to print the root node, but not its children.

Upvotes: 1

Views: 2095

Answers (1)

ug_

Reputation: 11440

Strangely enough, I just wrote a method that can be used for this. Be warned, though: it isn't fully namespace aware, and it only works for ELEMENT nodes.


For this method to work, the document must be parsed by a namespace-aware factory: call dbFactory.setNamespaceAware(true) before creating the DocumentBuilder. If you can't make it namespace aware, replace every getLocalName() call with getTagName().
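The reason for that requirement can be seen in a small sketch. The class name NamespaceAwareDemo, the helper parseRoot, and the sample XML are made up for illustration; the behavior shown is what the JDK's built-in parser does, and other DOM implementations may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class NamespaceAwareDemo {
    // Hypothetical helper: parses an XML string and returns its root element.
    public static Element parseRoot(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Must be set before the DocumentBuilder is created.
        factory.setNamespaceAware(namespaceAware);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<s:stocks xmlns:s=\"urn:example\"><s:stock/></s:stocks>";

        Element aware = parseRoot(xml, true);
        System.out.println(aware.getLocalName());   // stocks
        System.out.println(aware.getTagName());     // s:stocks

        Element unaware = parseRoot(xml, false);
        // Without namespace awareness the local name is not populated.
        System.out.println(unaware.getLocalName()); // null
        System.out.println(unaware.getTagName());   // s:stocks
    }
}
```

A null local name is why the methods in this answer would print "null[1]" steps on a document parsed without namespace awareness, and why getTagName() is the fallback.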


try {
    XPath xpath = XPathFactory.newInstance().newXPath();
    // Select every element in the document.
    NodeList nList = (NodeList) xpath.evaluate("//*", doc.getDocumentElement(), XPathConstants.NODESET);

    for (int i = 0; i < nList.getLength(); i++) {
        // "//*" only matches elements, but check the type before casting anyway.
        if (nList.item(i).getNodeType() == Node.ELEMENT_NODE) {
            // Paths are relative to the root element here; pass null as the
            // second argument to get paths that start at the document root.
            System.out.println(getElementXPath((Element) nList.item(i), doc.getDocumentElement()));
        }
    }
} catch (XPathExpressionException e) {
    e.printStackTrace();
}


/**
 * Builds the XPath of elt relative to the given ancestor.
 * @param elt the element whose path is wanted
 * @param relativeTo an ancestor of elt; if it isn't one (or is null), the path from the document root is returned
 * @return the path, with a 1-based index on every step
 */
public static String getElementXPath(Element elt, Element relativeTo) {
    String path = ""; 

    do {
        String xname = elt.getLocalName() + "[" + getElementIndex(elt) + "]";
        path = "/" + xname + path;

        if (elt.getParentNode() != null && elt.getParentNode().getNodeType() == Node.ELEMENT_NODE)
            elt = (Element) elt.getParentNode();
        else
            elt = null;
    } while(elt != null && !elt.equals(relativeTo));

    return path;                            
}

/**
 * @param original the element to index
 * @return the 1-based position of this element among its preceding siblings that share
 *         its local name and namespace URI; used as the XPath index
 */
private static int getElementIndex(Element original) {
    int count = 1;

    for (Node node = original.getPreviousSibling(); node != null; node = node.getPreviousSibling()) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            // Objects.equals (import java.util.Objects) is null-safe, so a missing namespace on both sides still matches.
            if (element.getLocalName().equals(original.getLocalName()) &&
                    Objects.equals(element.getNamespaceURI(), original.getNamespaceURI())) {
                count++;
            }
        }
    }

    return count;
}
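
If all that is needed is the complete path of every element, a single top-down walk is an alternative to climbing up from each node. The following is a minimal sketch of that different technique, not the method above: the class XPathLister, its methods, and the sample XML are invented for illustration. It uses getTagName(), so it also runs on a document parsed without namespace awareness, but prefixed names would still need their prefixes bound before the paths could be evaluated.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class XPathLister {
    /** Returns the absolute, indexed XPath of every element, in document order. */
    public static List<String> listPaths(Document doc) {
        List<String> paths = new ArrayList<>();
        Element root = doc.getDocumentElement();
        collect(root, "/" + root.getTagName() + "[1]", paths);
        return paths;
    }

    // Records the path of elt, then recurses into its element children.
    private static void collect(Element elt, String path, List<String> out) {
        out.add(path);
        // Counts how many children with each tag name have been seen so far.
        Map<String, Integer> seen = new HashMap<>();
        for (Node child = elt.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip text, comments, and other non-element nodes
            }
            String name = ((Element) child).getTagName();
            int index = seen.merge(name, 1, Integer::sum);
            collect((Element) child, path + "/" + name + "[" + index + "]", out);
        }
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<stocks><stock><name/></stock><stock/></stocks>");
        // Prints:
        // /stocks[1]
        // /stocks[1]/stock[1]
        // /stocks[1]/stock[1]/name[1]
        // /stocks[1]/stock[2]
        listPaths(doc).forEach(System.out::println);
    }
}
```

Each element is visited once and its index comes from a per-parent counter, so no sibling list is rescanned.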

Upvotes: 3
