KIC

Reputation: 6121

How to get all the sub tags as a list using XmlSlurper

I want to read all the variable names from an XML file. However, while I would expect some sort of List, I get all the "Names" glued together. How can I get all the Names in a List?

Xml:

<DataSet>
    <Version>1.0</Version>
    <DataSupplier>
    </DataSupplier>
    <Media>
        <Name />
        <Table>
            <URL>Sachkontenstamm.csv</URL>
            <Name>Sachkontenplan</Name>
            <DecimalSymbol>,</DecimalSymbol>
            <DigitGroupingSymbol />
            <VariableLength>
                <VariableColumn>
                    <Name>Ktonr</Name>
                    <Description>Kontonummer des Kontos</Description>
                    <Numeric />
                </VariableColumn>
                <VariableColumn>
                    <Name>Text</Name>
                    <Description>Beschriftung</Description>
                    <AlphaNumeric />
                    <MaxLength>40</MaxLength>
                </VariableColumn>
                ...
            </VariableLength>
        </Table>
    </Media>
</DataSet>

groovy:

def indexFile = new XmlSlurper().parse(new File("src/main/resources/index.xml"))

indexFile
        .'**'
        .findAll { it?.URL == "Sachkontenstamm.csv" }
        .VariableLength
        .VariableColumn
        .Name

Upvotes: 1

Views: 1260

Answers (2)

chriopp

Reputation: 957

I take it there is only one table for the queried URL. If that is the case, you can get the names using find like this:

def names = new XmlSlurper().parseText(xml)
    .'**'
    .find { it?.URL == "Sachkontenstamm.csv" }.VariableLength.VariableColumn
    .collect { it.Name }

// Result: [Ktonr, Text]

If several tables can match the given URL, stick with findAll, which gives you one list of names per matching table:

names = new XmlSlurper().parseText(xml)
    .'**'
    .findAll { it?.URL == "Sachkontenstamm.csv" }
    .collect { it.VariableLength.VariableColumn.collect { it.Name } }

// Result: [[Ktonr, Text]]

If you do not need to keep the names grouped per table, you can apply flatten to the result:

names.flatten()

// Result: [Ktonr, Text]
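
Note that collect { it.Name } returns node objects that merely print like strings. If you need real String values (for example to compare against a list of strings or to serialize them), calling text() on each node should do it. A minimal sketch, assuming a trimmed-down version of the XML from the question; the variable names sample, table and nameStrings are only for illustration:

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is imported by default)

// Trimmed-down version of the XML from the question
def sample = '''<DataSet>
  <Media>
    <Table>
      <URL>Sachkontenstamm.csv</URL>
      <Name>Sachkontenplan</Name>
      <VariableLength>
        <VariableColumn><Name>Ktonr</Name></VariableColumn>
        <VariableColumn><Name>Text</Name></VariableColumn>
      </VariableLength>
    </Table>
  </Media>
</DataSet>'''

// find returns the first node whose direct <URL> child has the given text
def table = new XmlSlurper().parseText(sample)
    .'**'
    .find { it.URL == 'Sachkontenstamm.csv' }

// text() turns each <Name> node into a plain String
List<String> nameStrings = table.VariableLength.VariableColumn.collect { it.Name.text() }

assert nameStrings == ['Ktonr', 'Text']
assert nameStrings.every { it instanceof String }
```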

Upvotes: 1

Szymon Stepniak

Reputation: 42184

If you want to extract VariableColumn.Name correctly, you need to collect the child nodes of VariableLength. In the example you have shown, findAll returns a list, and the property path that follows it yields a single result per matching table, so all the Name values end up printed as one concatenated string. You can fix it by adding a collect operation over the VariableLength children and extracting the information you are interested in. Consider the following example:

def xml = '''<DataSet>
    <Version>1.0</Version>
    <DataSupplier>
    </DataSupplier>
    <Media>
        <Name />
        <Table>
            <URL>Sachkontenstamm.csv</URL>
            <Name>Sachkontenplan</Name>
            <DecimalSymbol>,</DecimalSymbol>
            <DigitGroupingSymbol />
            <VariableLength>
                <VariableColumn>
                    <Name>Ktonr</Name>
                    <Description>Kontonummer des Kontos</Description>
                    <Numeric />
                </VariableColumn>
                <VariableColumn>
                    <Name>Text</Name>
                    <Description>Beschriftung</Description>
                    <AlphaNumeric />
                    <MaxLength>40</MaxLength>
                </VariableColumn>
            </VariableLength>
        </Table>
    </Media>
</DataSet>
'''

def indexFile = new XmlSlurper().parseText(xml)

def result = indexFile.'**'
        .findAll { it?.URL == "Sachkontenstamm.csv" }
        .collect { it.VariableLength.'*'.findAll { node -> node.name() == 'VariableColumn' }*.Name*.text() }
        .flatten()

assert result == ['Ktonr', 'Text']
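
To see where the "glued" output in the question comes from, it may help to look at what the original expression actually returns. The sketch below uses a shortened version of the XML; the names doc, tables and glued are just for illustration:

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is imported by default)

def doc = new XmlSlurper().parseText('''<DataSet>
  <Table>
    <URL>Sachkontenstamm.csv</URL>
    <VariableLength>
      <VariableColumn><Name>Ktonr</Name></VariableColumn>
      <VariableColumn><Name>Text</Name></VariableColumn>
    </VariableLength>
  </Table>
</DataSet>''')

// findAll on '**' returns a plain List, here holding the one matching Table node
def tables = doc.'**'.findAll { it.URL == 'Sachkontenstamm.csv' }

// Property access on that List is applied to each element, so the result is
// again a List with ONE entry: a node set wrapping both <Name> elements
def glued = tables.VariableLength.VariableColumn.Name

assert glued.size() == 1
assert glued[0].size() == 2            // two <Name> nodes inside the single entry
assert glued[0].text() == 'KtonrText'  // text() of a node set concatenates its nodes

// Spreading text() over the node set recovers the individual names
assert glued[0]*.text() == ['Ktonr', 'Text']
```

So the names are not lost, they are just wrapped in one node set per table, which is why an explicit collect (or a spread call such as *.text()) is needed to get them as separate list items.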

Hope it helps.

Upvotes: 0
