user3436925

XML to CSV conversion

I have a scenario where I need to convert an input XML document to a CSV file. The output should have values for every attribute with their respective XPath. For example, if my input is:

<School>
    <Class>
        <Student name="" class="" rollno="" />
        <Teacher name="" qualification="" Employeeno="" />
    </Class>
</School>

The expected output would be:

School/Class/Student/name, School/Class/Student/class, School/Class/Student/rollno, 
School/Class/Teacher/name, School/Class/Teacher/qualification, School/Class/Teacher/Employeeno  

Answers (1)

michael.hor257k
Reputation: 116959

An example does not always embody a rule. Assuming you want a row for each element that has any attributes, no matter where in the document it is, and a column for each attribute of that element, try the stylesheet below.

Edit: this is an improved version, corrected to work properly with nested elements.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:output method="text" encoding="UTF-8"/>

<!-- Elements: extend the path with this element's name, then write one row
     holding its attributes (if any) before descending into child elements. -->
<xsl:template match="*">
    <xsl:param name="path" />
    <xsl:variable name="newpath" select="concat($path, '/', name())" />
    <xsl:apply-templates select="@*">
        <xsl:with-param name="path" select="$newpath"/>
    </xsl:apply-templates>
    <!-- end the row only if this element produced one -->
    <xsl:if test="@*">
        <xsl:text>&#10;</xsl:text>
    </xsl:if>
    <xsl:apply-templates select="*">
        <xsl:with-param name="path" select="$newpath"/>
    </xsl:apply-templates>
</xsl:template>

<!-- Attributes: write the full path; substring(..., 2) drops the leading '/'. -->
<xsl:template match="@*">
    <xsl:param name="path" />
    <xsl:value-of select="substring(concat($path, '/', name()), 2)"/>
    <!-- separator between columns, but not after the last one -->
    <xsl:if test="position()!=last()">
        <xsl:text>, </xsl:text>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>

When applied to the following test input:

<Root>
    <Parent parent="1" parent2="1b">
        <Son son="11" son2="11b"/>
        <Daughter daughter="12" daughter2="12b">
            <Grandson grandson="121" grandson2="121b"/>
            <Granddaughter granddaughter="122" granddaughter2="122b"/>
        </Daughter>
        <Sibling/>
    </Parent>
</Root>

the result is:

Root/Parent/parent, Root/Parent/parent2
Root/Parent/Son/son, Root/Parent/Son/son2
Root/Parent/Daughter/daughter, Root/Parent/Daughter/daughter2
Root/Parent/Daughter/Grandson/grandson, Root/Parent/Daughter/Grandson/grandson2
Root/Parent/Daughter/Granddaughter/granddaughter, Root/Parent/Daughter/Granddaughter/granddaughter2

Note that the number of columns in each row can vary - this is often unacceptable in a CSV document.
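
If a fixed number of columns is required, one possible workaround is to write one row per attribute instead of one row per element. The sketch below is only an illustration of that idea, not something the question asked for: the two-column layout (path of the owning element, then the attribute name) is an assumption about what the consumer of the CSV can accept.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>

<!-- Assumed layout: exactly two columns per row, one row per attribute. -->
<xsl:template match="/">
    <xsl:for-each select="//@*">
        <!-- column 1: path of the owning element, outermost ancestor first -->
        <xsl:for-each select="ancestor::*">
            <xsl:value-of select="name()"/>
            <xsl:if test="position()!=last()">
                <xsl:text>/</xsl:text>
            </xsl:if>
        </xsl:for-each>
        <xsl:text>,</xsl:text>
        <!-- column 2: name of the attribute itself -->
        <xsl:value-of select="name()"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Applied to the test input above, this should give rows such as Root/Parent,parent and Root/Parent/Son,son, each with exactly two columns. No quoting is needed because XML names cannot contain commas. The order of attributes within a single element is left to the processor.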
