A.K

Reputation: 37

XSLT split single xml into multiple xmls preserving parent elements

I am trying to split a huge XML file (around 300 MB) into smaller files based on a repeated child element.

The example below illustrates the scenario.

input.xml

<?xml version="1.0" encoding="UTF-8"?>
<a>
   <b></b>
   <bb></bb>
   <bbb>
        <c>
            <d id="1">
                <x></x>
                <y></y>
            </d>
            <d id="2">
                <x></x>
                <y></y>
            </d>
            <d id="3">
                <x></x>
                <y></y>
            </d>
        </c>
    </bbb>
    <e></e>
    <f></f>
</a>

As shown above, the input has a repeated child element, d. One output file is expected per d element, with its parent elements and their attributes kept intact.

Expected output out_1_a.xml

<?xml version="1.0" encoding="UTF-8"?>
<a>
   <b></b>
   <bb></bb>
   <bbb>
        <c>
            <d id="1">
                <x></x>
                <y></y>
            </d>
        </c>
    </bbb>
    <e></e>
    <f></f>
</a>

Expected output out_2_a.xml

<?xml version="1.0" encoding="UTF-8"?>
<a>
   <b></b>
   <bb></bb>
   <bbb>
        <c>
            <d id="2">
                <x></x>
                <y></y>
            </d>
        </c>
    </bbb>
    <e></e>
    <f></f>
</a>

Expected output out_3_a.xml

<?xml version="1.0" encoding="UTF-8"?>
<a>
   <b></b>
   <bb></bb>
   <bbb>
        <c>
            <d id="3">
                <x></x>
                <y></y>
            </d>
        </c>
    </bbb>
    <e></e>
    <f></f>
</a>

My xsl - sample.xsl

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="/">
    <xsl:for-each select="a/bbb/c/d">
    <xsl:variable name="i" select="position()" />
    <xsl:result-document method="xml" href="out_{$i}_a.xml">
    <a>
        <b></b>
        <bb></bb>
        <bbb>
            <c>
                <xsl:copy-of select="../@* | ." />
            </c>
        </bbb>
        <e></e>
        <f></f>
    </a>
    </xsl:result-document>
    </xsl:for-each>
</xsl:template> 
</xsl:stylesheet>

This works and I get the output I want. However, I am sure there is a better way to achieve this than hardcoding the parent elements such as a, b, bb and so on. In some cases these parent elements also contain attributes, and those attributes are dynamic, so hardcoding is something I want to avoid. Is there a better way to solve this?

Upvotes: 0

Views: 860

Answers (2)

Rupesh_Kr

Reputation: 3435

You can use this:

<xsl:template match="d">
    <xsl:variable name="name" select="generate-id()"/>
    <xsl:variable name="outputposition" select="count(preceding::d) + 1"/>
    <xsl:result-document method="xml" href="out_{$outputposition}_a.xml" indent="yes">
        <xsl:call-template name="spilit">
            <xsl:with-param name="name" select="$name"/>
            <xsl:with-param name="element" select="root()"/>
        </xsl:call-template>
    </xsl:result-document>
</xsl:template>

<xsl:template name="spilit">
    <xsl:param name="name"/>
    <xsl:param name="element"/>
    <xsl:for-each select="$element[descendant-or-self::d[generate-id() eq $name]]">
        <xsl:choose>
            <xsl:when test="self::d[generate-id() = $name]">
                <xsl:copy>
                    <xsl:copy-of select="@*"></xsl:copy-of>
                    <xsl:copy-of select="node()"></xsl:copy-of>
                </xsl:copy>
            </xsl:when>
            <xsl:otherwise>
                <xsl:copy-of select="preceding-sibling::*"/>
                <xsl:copy>
                    <!-- keep the attributes of each ancestor element (a, bbb, c) -->
                    <xsl:copy-of select="@*"/>
                    <xsl:call-template name="spilit">
                        <xsl:with-param name="name" select="$name"/>
                        <xsl:with-param name="element" select="child::*[descendant-or-self::d[generate-id() eq $name]]"/>
                    </xsl:call-template>
                </xsl:copy>
                <xsl:copy-of select="following-sibling::*"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:for-each>
</xsl:template>

Upvotes: 2

Linga Murthy C S

Reputation: 5432

The XSLT 2.0 solution below should do the job:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">
    <!-- the expected output files start with an XML declaration, so it is not omitted -->
    <xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>

    <xsl:template match="@* | node()" mode="createDoc">
        <xsl:param name="id"/>
        <xsl:copy>
            <!-- apply templates to the attributes and then either to the desired 'd' child (if this element has it) or to all child nodes -->
            <xsl:apply-templates select="@*, if(node()[generate-id() = $id]) then node()[generate-id() = $id] else node()" mode="createDoc">
                <xsl:with-param name="id" select="$id"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <!-- template to create documents per `d` element -->
    <xsl:template match="/">
        <xsl:for-each select="a/bbb/c/d">
            <xsl:result-document href="out_{@id}_a.xml">
                <xsl:apply-templates select="root(.)" mode="createDoc">
                    <!-- pass the id of the desired element to be copied omitting its siblings-->
                    <xsl:with-param name="id" select="generate-id()"/>
                </xsl:apply-templates>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

The second template creates one document per d element by passing the generate-id() of the current d element to the recursive template (the first template).

The first template recursively copies all nodes. It uses an XPath if/then/else expression in the select attribute: when one of the children is the desired d element (matched by its generate-id()), only that child is processed and its siblings are omitted; otherwise all children are processed.
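
The same idea can also be written with a tunnel parameter and the node identity operator `is` instead of comparing generate-id() strings. The following is only a sketch under a few assumptions: an XSLT 2.0 processor, the element path a/bbb/c/d from the sample input, and the names `split` (mode) and `keep` (parameter), which are made up here for illustration. It names the output files by position, as the stylesheet in the question does.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- identity template in the (hypothetical) mode "split": copies every node unchanged -->
    <xsl:template match="@* | node()" mode="split">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()" mode="split"/>
        </xsl:copy>
    </xsl:template>

    <!-- for d elements, copy only the one passed in as $keep and drop the others -->
    <xsl:template match="a/bbb/c/d" mode="split">
        <xsl:param name="keep" tunnel="yes"/>
        <xsl:if test=". is $keep">
            <xsl:copy-of select="."/>
        </xsl:if>
    </xsl:template>

    <!-- one result document per d element -->
    <xsl:template match="/">
        <xsl:for-each select="a/bbb/c/d">
            <xsl:result-document href="out_{position()}_a.xml">
                <!-- re-process the whole input tree, remembering which d to keep -->
                <xsl:apply-templates select="/" mode="split">
                    <xsl:with-param name="keep" select="." tunnel="yes"/>
                </xsl:apply-templates>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

Because the parameter is declared with tunnel="yes", the identity template does not need to declare or forward it, so nothing about the parent elements or their attributes is hardcoded apart from the path to d.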

Upvotes: 1
