Polyphase29

Reputation: 503

Using Java to get specific child nodes of XML tag

I'm trying to parse data from the dependencies in a WildFly POM file. I only want the dependencies listed in the <dependencyManagement> tag, for example:

<dependencyManagement>
<dependencies>
<!--  Modules in this project  -->
<dependency>
<groupId>org.wildfly</groupId>
<artifactId>wildfly-appclient</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.wildfly</groupId>
<artifactId>wildfly-arquillian-common</artifactId>
<version>${project.version}</version>
</dependency>

I know I can use the following to get the dependencyManagement:

        final NodeList dependenciesList = doc.getElementsByTagName("dependencyManagement");

But I would like to avoid having to use many for loops to then get the dependencies child, then loop through that to get each individual dependency. Is there a way to achieve this? Or would I need to rely on loops to go through dependencies and then each dependency?

Edit: I'm attempting something like this, but it doesn't seem to give any results when I try to iterate through my dependencies:

        final Node dependencyManagement = doc.getElementsByTagName("dependencyManagement").item(0);
        final Node deps = dependencyManagement.getFirstChild();
        final NodeList dependenciesList = deps.getChildNodes();

Upvotes: 1

Views: 3998

Answers (2)

Anton Kumpan

Reputation: 344

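Take a look at XPath.

It is a widely used way to extract data from XML documents.

With the XPath query language you can navigate directly to the elements you want.

In your case, start the XPath expression at 'dependencyManagement' and it will only consider elements under that section. Because 'dependencyManagement' sits below the root 'project' element in a POM, write '//dependencyManagement' (or '/project/dependencyManagement'); a bare '/dependencyManagement' only matches when it is the root element of the file.

Code to visit the 'artifactId' of every 'dependency' in that section: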

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.File;
import java.io.IOException;
import java.net.URL;


public class Test {
    public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException, XPathExpressionException {
        URL url = Test.class.getClassLoader().getResource("testfile.xml");

        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
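        // Parse by URI rather than new File(url.getFile()), which fails when the path contains
        // URL-encoded characters such as %20.
        Document xmlDocument = builder.parse(url.toExternalForm());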
        XPath xPath = XPathFactory.newInstance().newXPath();
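        // "//dependencyManagement" matches the element anywhere in the document; a leading
        // "/dependencyManagement" would only match if it were the root (in a POM the root is <project>).
        String expression = "//dependencyManagement/dependencies/dependency/artifactId";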
        NodeList nodes = (NodeList) xPath.evaluate(expression, xmlDocument, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); ++i) {
            Element e = (Element) nodes.item(i);
            System.out.println(e.getTextContent());
        }
    }
}  

The code above prints the text content of each matching 'artifactId' element, one per line.

Upvotes: 2

Andreas

Reputation: 159086

There are 2 ways of finding XML elements by name.

You're using the getElementsByTagName() method of the Document object, which searches the entire XML document.

The Element object also has a getElementsByTagName() method, which only searches the subtree of that element.
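As a rough sketch of the Element-scoped search described above, using DOM calls only: the class name ScopedDomSearch, the helper managedArtifactIds, and the inline POM string (including the made-up artifact some-other-artifact) are illustrative assumptions, not part of the question. It assumes every dependency has an artifactId child.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ScopedDomSearch {

    // Hypothetical helper: artifactIds of the dependencies under <dependencyManagement> only.
    public static List<String> managedArtifactIds(String pomXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));

        List<String> ids = new ArrayList<>();
        // Document-wide search, used once to locate the section.
        NodeList sections = doc.getElementsByTagName("dependencyManagement");
        if (sections.getLength() == 0) {
            return ids; // no dependencyManagement section in this file
        }
        Element management = (Element) sections.item(0);

        // Element-scoped search: only descendants of <dependencyManagement> are considered.
        NodeList deps = management.getElementsByTagName("dependency");
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            // Assumes each dependency has an <artifactId>; the first one in
            // document order is the dependency's own.
            ids.add(dep.getElementsByTagName("artifactId").item(0).getTextContent().trim());
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Inline sample POM (an assumption for illustration).
        String pom =
                "<project>"
              + "<dependencyManagement><dependencies>"
              + "<dependency><groupId>org.wildfly</groupId><artifactId>wildfly-appclient</artifactId></dependency>"
              + "<dependency><groupId>org.wildfly</groupId><artifactId>wildfly-arquillian-common</artifactId></dependency>"
              + "</dependencies></dependencyManagement>"
              + "<dependencies>"
              + "<dependency><groupId>org.example</groupId><artifactId>some-other-artifact</artifactId></dependency>"
              + "</dependencies>"
              + "</project>";
        System.out.println(managedArtifactIds(pom));
    }
}
```

This still needs one loop over the dependency elements, but no nested loops to walk down through dependencies first.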

You can also use XPath for more advanced expressions.

Here is an example using both:

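import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathNodes;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// The default factory is not namespace-aware, so the unprefixed names below match despite the POM's xmlns.
DocumentBuilder domBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = domBuilder.parse(new File("pom.xml"));

// XPath selects only the <dependency> elements under <dependencyManagement> (XPathNodes needs Java 9+).
XPath xPath = XPathFactory.newInstance().newXPath();
String expr = "/project/dependencyManagement/dependencies/dependency";
XPathNodes result = xPath.evaluateExpression(expr, document, XPathNodes.class);

for (Node node : result) {
    Element elem = (Element) node;
    // Element.getElementsByTagName() searches only this dependency's subtree.
    Node artifactIdNode = elem.getElementsByTagName("artifactId").item(0);
    String artifactId = artifactIdNode.getTextContent();
    System.out.println(artifactId);
}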

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>test</groupId>
    <artifactId>test</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-databind</artifactId>
                <version>2.10.3</version>
            </dependency>
            <dependency>
                <groupId>com.fasterxml.jackson.datatype</groupId>
                <artifactId>jackson-datatype-jdk8</artifactId>
                <version>2.10.3</version>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.fasterxml.jackson.datatype</groupId>
            <artifactId>jackson-datatype-jsr310</artifactId>
            <version>2.10.3</version>
        </dependency>
    </dependencies>
</project>

Output

jackson-databind
jackson-datatype-jdk8

As you can see, jackson-datatype-jsr310 is not included in the result, because the XPath expression only selects dependencies under dependencyManagement, not those under the top-level dependencies element.
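As for the attempt in the question's edit: getFirstChild() returns the first child node of any type, and in an indented POM that is usually a whitespace text node rather than the dependencies element, which would explain the empty result. A minimal sketch of skipping to the first element child follows; the class FirstElementChild and its helpers are hypothetical names introduced here for illustration.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class FirstElementChild {

    // Hypothetical helper: first child that is an element, or null if there is none.
    public static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    // Hypothetical helper: parses an XML string with the default (non-validating) parser.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String xml =
                "<dependencyManagement>\n"
              + "  <dependencies>\n"
              + "    <dependency/>\n"
              + "  </dependencies>\n"
              + "</dependencyManagement>";
        Node management = parse(xml).getDocumentElement();

        // The first child is the line break and indentation before <dependencies>.
        System.out.println(management.getFirstChild().getNodeType() == Node.TEXT_NODE);

        // Skipping non-element nodes reaches <dependencies>.
        System.out.println(firstElementChild(management).getTagName());
    }
}
```

The same filtering applies when iterating getChildNodes(): check getNodeType() before casting to Element, since text and comment nodes appear in that list too.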

Upvotes: 2
