Reputation: 62674
What is the simplest way to parse the following XML?
<Fruit>
<FruitId>Bannana</FruitId>
<FruitColor>Yellow</FruitColor>
<FruitShape>Moon</FruitShape>
<Customer>
<Name>Joe</Name>
<Numbereaten>5</Numbereaten>
<Weight>2.6</Weight>
</Customer>
<Customer>
<Name>Mark</Name>
<Numbereaten>8</Numbereaten>
<Weight>5.0</Weight>
</Customer>
</Fruit>
<Fruit>
.....
Assume I have an XML file with multiple <Fruit> elements and want to build a CSV from only specific fields: the fruit ID plus every (number eaten, weight) pair, excluding the customer name. How would I accomplish this? Ideally I want some data structure or CSV that represents the following:
Bannana, 5, 2.6
Bannana, 8, 5
...
Apple, 6, 5
Apple, 3, 2
I know there are DOMParser and SAXParser for Java, but I am wondering whether other languages or other means of easily getting this information are available, now that we're in 2013. Or maybe I could even capture the data in a dictionary-like data structure that contains something like:
Bannana: [5,2.6], [8,5]
so that it is organized in a way that's easy to iterate over and extract from programmatically.
Upvotes: 0
Views: 1769
Reputation: 30970
This is a typical use case for XSLT.
The XSLT file would look like this for your example:
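<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text" encoding="ISO-8859-1" />
  <!-- Drop whitespace-only text between elements so it does not leak into the CSV -->
  <xsl:strip-space elements="*" />
  <!-- &#10; is a line feed; a literal line break inside an attribute is normalized to a space -->
  <xsl:variable name="newline" select="'&#10;'" />
  <!-- One CSV line per Customer of each Fruit -->
  <xsl:template match="Fruit">
    <xsl:for-each select="Customer">
      <xsl:value-of select="preceding-sibling::FruitId" />
      <xsl:text>,</xsl:text>
      <xsl:value-of select="Numbereaten" />
      <xsl:text>,</xsl:text>
      <xsl:value-of select="Weight" />
      <xsl:value-of select="$newline" />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>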
For the transformation you could use this Java code:
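import java.io.File;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;

// Load the input XML and the stylesheet
Source xmlSource = new StreamSource(new File("xmlFile"));
Source xsltSource = new StreamSource(new File("xsltFile"));
// Pass the stylesheet in; newTransformer() without arguments would only copy the input
Transformer transformer = TransformerFactory.newInstance().newTransformer(xsltSource);
StreamResult csvResult = new StreamResult(new File("file.csv"));
transformer.transform(xmlSource, csvResult);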
The benefit of the XSLT approach is that the Java code stays very short. The XSLT file can live outside your code and be adapted easily when the XML format changes.
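To try the idea without creating any files, here is a minimal in-memory sketch. It rests on assumptions not stated in the question: the `<Fruit>` elements sit under a single root (called `<Fruits>` here, a made-up name), and the input is well-formed with `<Numbereaten>` spelled the same in the opening and closing tag. The class name `FruitCsvXslt` and the method `toCsv` are hypothetical as well, and the stylesheet is a condensed variant of the one above, not a drop-in replacement.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FruitCsvXslt {
    // Condensed stylesheet: one "FruitId,Numbereaten,Weight" line per Customer.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='Fruit'>"
      + "<xsl:for-each select='Customer'>"
      + "<xsl:value-of select='../FruitId'/><xsl:text>,</xsl:text>"
      + "<xsl:value-of select='Numbereaten'/><xsl:text>,</xsl:text>"
      + "<xsl:value-of select='Weight'/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the CSV text.
    static String toCsv(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample data from the question, wrapped in an assumed <Fruits> root.
        String xml =
            "<Fruits><Fruit><FruitId>Bannana</FruitId>"
          + "<FruitColor>Yellow</FruitColor><FruitShape>Moon</FruitShape>"
          + "<Customer><Name>Joe</Name><Numbereaten>5</Numbereaten><Weight>2.6</Weight></Customer>"
          + "<Customer><Name>Mark</Name><Numbereaten>8</Numbereaten><Weight>5.0</Weight></Customer>"
          + "</Fruit></Fruits>";
        System.out.print(toCsv(xml));
    }
}
```

Running `main` should print one line per customer, `Bannana,5,2.6` followed by `Bannana,8,5.0`. The values are copied as text, so `5.0` is not shortened to `5`.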
Upvotes: 1
Reputation: 43
XSLT indeed, or if your examples are very simple and you don't want to "learn" XSLT, I would advise you to
Upvotes: 0
Reputation: 11
I've recently used the SAXParser and it was pretty straightforward and simple to implement.
In particular, the XMLReader was very easy to implement, and the XMLStreamReader was only slightly more time-consuming to implement.
The benefit of the Reader is that you can jump from one XML tag to the next and extract the data right there. The StreamReader has a little more setup time, but it's more flexible.
If I were you, I would read up on the differences between the SAXParser and the DOMParsers, decide which fits your situation best, and run with it.
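As a rough illustration of the SAX style, the sketch below collects the (number eaten, weight) pairs per fruit into a map, similar to the dictionary asked for in the question. It is only one possible shape. It assumes well-formed input with a single root element (`<Fruits>` is a made-up name) and the element spelling `Numbereaten`. The names `FruitSaxDemo`, `FruitHandler` and `parse` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FruitSaxDemo {
    // Collects {numberEaten, weight} pairs per FruitId while the parser streams the input.
    static class FruitHandler extends DefaultHandler {
        final Map<String, List<double[]>> byFruit = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String fruitId;
        private double eaten;
        private double weight;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // text may arrive in several chunks
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            switch (qName) {
                case "FruitId":     fruitId = value; break;
                case "Numbereaten": eaten = Double.parseDouble(value); break;
                case "Weight":      weight = Double.parseDouble(value); break;
                case "Customer":    // a customer is complete: store its pair
                    byFruit.computeIfAbsent(fruitId, k -> new ArrayList<>())
                           .add(new double[] {eaten, weight});
                    break;
                default:            break; // Name, FruitColor, etc. are ignored
            }
        }
    }

    // Parses an XML string and returns fruit id -> list of {numberEaten, weight}.
    static Map<String, List<double[]>> parse(String xml) {
        try {
            FruitHandler handler = new FruitHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.byFruit;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<Fruits><Fruit><FruitId>Bannana</FruitId>"
          + "<Customer><Name>Joe</Name><Numbereaten>5</Numbereaten><Weight>2.6</Weight></Customer>"
          + "<Customer><Name>Mark</Name><Numbereaten>8</Numbereaten><Weight>5.0</Weight></Customer>"
          + "</Fruit></Fruits>";
        for (Map.Entry<String, List<double[]>> entry : parse(xml).entrySet()) {
            for (double[] pair : entry.getValue()) {
                System.out.println(entry.getKey() + "," + (int) pair[0] + "," + pair[1]);
            }
        }
    }
}
```

The handler keeps only the current fruit ID and the latest two numbers in memory, which is the main attraction of SAX for large files. A DOM parser would be simpler to query but loads the whole document first.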
Upvotes: 0