Rolando

Reputation: 62674

Convert XML to other format?

What is the simplest way to parse the following XML?

<Fruit>
  <FruitId>Bannana</FruitId>
  <FruitColor>Yellow</FruitColor>
  <FruitShape>Moon</FruitShape>
  <Customer>
     <Name>Joe</Name>
     <Numbereaten>5</Numbereaten>
     <Weight>2.6</Weight>
  </Customer>
  <Customer>
     <Name>Mark</Name>
     <Numbereaten>8</Numbereaten>
     <Weight>5.0</Weight>
  </Customer>
</Fruit>
<Fruit>

.....

Assuming I have an XML file with multiple Fruit elements, and I want to extract information so that I can make a CSV containing only specific fields (the fruit id) and all number-eaten and weight pairs (excluding the customer name), how would I accomplish this? Ideally I want to get some data structure or CSV that represents the following:

Bannana, 5, 2.6
Bannana, 8, 5
...
Apple, 6, 5
Apple, 3, 2

I know there are DOMParser and SAXParser for Java, but I am wondering whether other languages or other means of easily getting this information are available, now that we're in 2013. Or maybe there is even a way to capture the data in some dictionary data structure that contains something like:

Bannana: [5,2.6], [8,5]

That way it is organized so that it's easy to iterate over and extract programmatically.

Upvotes: 0

Views: 1769

Answers (3)

Knut Herrmann

Reputation: 30970

This is a typical use case for XSLT.

The XSLT file would look like this for your example:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="text" encoding="ISO-8859-1" />
    <xsl:variable name="newline" select="'&#xA;'"/>
    <xsl:template match="Fruit">
        <xsl:for-each select="Customer">
            <xsl:value-of select="preceding-sibling::FruitId" />
            <xsl:text>,</xsl:text>
            <xsl:value-of select="Numbereaten" />
            <xsl:text>,</xsl:text>
            <xsl:value-of select="Weight" />
            <xsl:value-of select="$newline" />
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

For the transformation, you could use this Java code:

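   import java.io.File;
   import javax.xml.transform.*;
   import javax.xml.transform.stream.*;

   Source xmlSource = new StreamSource(new File("xmlFile"));
   Source xsltSource = new StreamSource(new File("xsltFile"));
   // Pass the stylesheet to the factory; newTransformer() without it only copies the input.
   Transformer transformer = TransformerFactory.newInstance().newTransformer(xsltSource);
   StreamResult csvResult = new StreamResult(new File("file.csv"));
   transformer.transform(xmlSource, csvResult);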

The benefit of the XSLT version is that the Java code is very short. The XSLT file can stay outside your code and is easy to adapt when the XML format changes.

Upvotes: 1

gbataille

Reputation: 43

XSLT indeed. Or, if your examples are very simple and you don't want to "learn" XSLT, I would advise you to:

  • use JAXB to create an object easily from your XML
  • then output your object as you see fit to a file
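As a rough illustration of those two steps (objects first, output second), here is a minimal sketch. Note that it is not JAXB: JAXB needs annotated classes and, from Java 11 on, an extra dependency, so this swaps in the JDK's built-in DOM parser instead. The names FruitCsvSketch and Row and the wrapping <Fruits> root element are assumptions made for the example, and the tag is assumed to be spelled Numbereaten consistently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical names throughout; uses the JDK's DOM parser, not JAXB.
class FruitCsvSketch {

    /** One CSV row: the fruit id plus one customer's numbers. */
    static final class Row {
        final String fruitId;
        final String numberEaten;
        final String weight;

        Row(String fruitId, String numberEaten, String weight) {
            this.fruitId = fruitId;
            this.numberEaten = numberEaten;
            this.weight = weight;
        }

        String toCsvLine() {
            return fruitId + "," + numberEaten + "," + weight;
        }
    }

    /** Step 1: turn the XML into plain objects, skipping the customer name. */
    static List<Row> readRows(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<Row> rows = new ArrayList<>();
        NodeList fruits = doc.getElementsByTagName("Fruit");
        for (int i = 0; i < fruits.getLength(); i++) {
            Element fruit = (Element) fruits.item(i);
            String id = childText(fruit, "FruitId");
            NodeList customers = fruit.getElementsByTagName("Customer");
            for (int j = 0; j < customers.getLength(); j++) {
                Element customer = (Element) customers.item(j);
                rows.add(new Row(id,
                        childText(customer, "Numbereaten"),
                        childText(customer, "Weight")));
            }
        }
        return rows;
    }

    /** Step 2: output the objects, here as CSV text. */
    static String toCsv(List<Row> rows) {
        StringBuilder sb = new StringBuilder();
        for (Row row : rows) {
            sb.append(row.toCsvLine()).append('\n');
        }
        return sb.toString();
    }

    // Text of the first descendant with this tag; assumes the tag is present.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<Fruits><Fruit><FruitId>Bannana</FruitId>"
                + "<Customer><Name>Joe</Name><Numbereaten>5</Numbereaten><Weight>2.6</Weight></Customer>"
                + "<Customer><Name>Mark</Name><Numbereaten>8</Numbereaten><Weight>5.0</Weight></Customer>"
                + "</Fruit></Fruits>";
        System.out.print(toCsv(readRows(xml)));
    }
}
```

With real JAXB the readRows step would be replaced by an Unmarshaller filling annotated classes; the output step would stay much the same.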

Upvotes: 0

Tobes

Reputation: 11

I've recently used the SAXParser, and it was pretty straightforward and simple to implement.

In particular, the XMLReader was very easy to implement, and the XMLStreamReader was only slightly more time-consuming to implement.

The benefit of the Reader is that you can jump from one XML tag to the next and extract the data right there. The StreamReader has a little more setup time, but it's more flexible.

If I were you, I would just read up on the differences between the SAXParser and the DOMParser, decide which fits your situation best, and run with it.
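To give a feel for the SAX route, here is a minimal sketch that collects the data into the dictionary-like structure asked for in the question (fruit id mapped to a list of [number eaten, weight] pairs). It is only one way to do it. The class name FruitSaxSketch and the wrapping <Fruits> root element are assumptions for the example, and the tag is assumed to be spelled Numbereaten consistently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch, not a hardened parser.
class FruitSaxSketch {

    /** Maps each fruit id to its list of {numberEaten, weight} pairs. */
    static Map<String, List<double[]>> parse(String xml) {
        final Map<String, List<double[]>> byFruit = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String fruitId;
            private double eaten;
            private double weight;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                text.setLength(0); // start collecting this element's text
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                switch (qName) {
                    case "FruitId":
                        fruitId = value;
                        break;
                    case "Numbereaten":
                        eaten = Double.parseDouble(value);
                        break;
                    case "Weight":
                        weight = Double.parseDouble(value);
                        break;
                    case "Customer":
                        // One pair per customer; the Name element is ignored.
                        byFruit.computeIfAbsent(fruitId, k -> new ArrayList<>())
                               .add(new double[] {eaten, weight});
                        break;
                    default:
                        break;
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return byFruit;
    }

    public static void main(String[] args) {
        String xml = "<Fruits><Fruit><FruitId>Bannana</FruitId>"
                + "<Customer><Name>Joe</Name><Numbereaten>5</Numbereaten><Weight>2.6</Weight></Customer>"
                + "<Customer><Name>Mark</Name><Numbereaten>8</Numbereaten><Weight>5.0</Weight></Customer>"
                + "</Fruit></Fruits>";
        for (Map.Entry<String, List<double[]>> entry : parse(xml).entrySet()) {
            for (double[] pair : entry.getValue()) {
                System.out.println(entry.getKey() + ": " + Arrays.toString(pair));
            }
        }
    }
}
```

Because SAX streams through the file, only the current fruit id and pair are held while parsing, which matters if the XML file is large. A DOM parser would load the whole document first but lets you navigate it freely.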

Upvotes: 0
