Ethan Zhang

Reputation: 59

Scripting: How to use XMLStarlet to extract values in pom.xml files

I am trying to accomplish the following task using Cygwin:

Extract the versions of the parent and of the dependency whose artifactId values are "IC_Maven_AB_Parent" and "IC_Maven_DE_Parent", respectively.

(This is the core of my task, which I will apply recursively to thousands of pom.xml files.)

The input:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/maven-v4_0_0.xsd">

<modelVersion>4.0.0</modelVersion>
<groupId>as.ic.ecommerce</groupId>
<artifactId>ASB_Ecommerce</artifactId>
<packaging>pom</packaging>
<version>1.5.8-SNAPSHOT</version>
<parent>
    <groupId>as.ic.maven</groupId>
    <artifactId>IC_Maven_AB_Parent</artifactId>
    <version>7.2</version>
</parent>
<dependencyManagement>
    <dependency>
        <groupId>as.ic.maven</groupId>
        <artifactId>IC_Maven_External_Dependencies</artifactId>
        <version>${ic.external.dep.version}</version>
    </dependency>
    <dependency>
        <groupId>as.ic.maven</groupId>
        <artifactId>IC_Maven_DE_Parent</artifactId>
        <version>6.2</version>
    </dependency>
</dependencyManagement>
</project>

The script I use:

#!/bin/bash
grep -q 'IC_Maven_AB_Parent\|IC_Maven_DE_Parent' pom.xml
if [ "$?" -eq 0 ]; then
xmlstarlet sel -N x=http://maven.apache.org/POM/4.0.0 -t -v /x:project/x:parent/x:artifactId -o "......" -v /x:project/x:parent/x:version pom.xml
fi

The output I get, which does not include the "IC_Maven_DE_Parent" dependency:

IC_Maven_AB_Parent......7.2

And here is what I expect to have as output:

IC_Maven_AB_Parent......7.2
IC_Maven_DE_Parent......6.2

So the key challenge is that I am having trouble selecting the second dependency. Any suggestions?

Upvotes: 2

Views: 1915

Answers (1)

Birei

Reputation: 36272

Try the following xmlstarlet expression. It uses two XPath expressions, one to reach each <artifactId> element, and concatenates the element's text with the text of its following sibling <version>:

xmlstarlet sel \
    -N x=http://maven.apache.org/POM/4.0.0 -t \
    -m '/x:project/x:parent/x:artifactId[text() = "IC_Maven_AB_Parent"]' \
      -v 'concat(text(), "......", ./following-sibling::x:version)' \
      -n \
    -m '/x:project/x:dependencyManagement/x:dependency/x:artifactId[text() = "IC_Maven_DE_Parent"]' \
      -v 'concat(text(), "......", ./following-sibling::x:version)' \
      -n \
xmlfile

It yields:

IC_Maven_AB_Parent......7.2
IC_Maven_DE_Parent......6.2
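
Since the question mentions running this over thousands of pom.xml files, here is a minimal sketch of how the same idea could be wrapped for a directory tree. It is an illustration, not a tested solution for your repository: the function names extract_versions and scan_poms are hypothetical, and it assumes bash plus a find that supports -print0 (as GNU findutils on Cygwin does). It uses a single -m with an XPath union, so a POM that contains only one of the two artifacts still produces its line.

```shell
#!/bin/bash
# Sketch only: extract_versions and scan_poms are hypothetical helper names.

# Print "artifactId......version" for each of the two artifacts found in one POM.
extract_versions() {
    xmlstarlet sel \
        -N x=http://maven.apache.org/POM/4.0.0 -t \
        -m '/x:project/x:parent/x:artifactId[. = "IC_Maven_AB_Parent"] | /x:project/x:dependencyManagement/x:dependency/x:artifactId[. = "IC_Maven_DE_Parent"]' \
          -v 'concat(., "......", ../x:version)' \
          -n \
        "$1"
}

# Apply extract_versions to every pom.xml below the given directory (default: .).
# -print0 / read -d '' keep paths containing spaces or newlines intact.
scan_poms() {
    find "${1:-.}" -type f -name pom.xml -print0 |
    while IFS= read -r -d '' pom; do
        extract_versions "$pom"
    done
}
```

Calling scan_poms with the root of your checkout (or with no argument to start from the current directory) prints one line per matching artifact. If you also need to know which file each line came from, you could echo "$pom" inside the loop before calling extract_versions.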

Upvotes: 4
