MTALY

Reputation: 1782

Split XML file with XSLT 2.0 based on element count

I have a huge XML file with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<productall>
    <product type="electronics" date="1-1-2016">
        <type name="Androidbased">
            <product> InStock </product>
        </type>
    </product>
    <product type="cloths" date="1-12-2008">
        <type name="Jeans">
            <product> InStock </product>
        </type>
    </product>
    <product type="bags" date="1-12-2008">
        <type name="FF">
            <product> InStock </product>
        </type>
    </product>
</productall>

Each product type has thousands of records; for example, electronics has 2000 records and cloths has 8000.

I want to split this XML file into multiple XML files with 1000 records each, regardless of type.

I have used XSLT 2.0 with Java and Saxon 9 to split it, but it doesn't work as it should. Here is what I have done so far:

java -jar sax.jar productall.xml split.xslt

split.xslt

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="productall" select="1000"></xsl:param>
    <xsl:template match="/productall/product[@type]">
        <xsl:for-each-group select="product" group-adjacent="(position()-1) idiv $productall">
            <xsl:result-document href="part.{current-grouping-key()}.xml">
                <productall>
                    <xsl:copy-of select="current-group()"></xsl:copy-of>
                </productall>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

The result is printed to the terminal without any XML markup, and no .xml files are generated. I don't know whether the problem is the command syntax or the contents of the XSLT file.

Upvotes: 0

Views: 908

Answers (2)

Michael Kay

Reputation: 163625

I think you simply need to change match="/productall/product[@type]" to match="/productall".
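
For reference, here is a sketch of the question's stylesheet with only that match pattern changed. With the template matching /productall, the relative path select="product" is evaluated from the root element, so it picks up the top-level product elements. The sketch assumes that a "record" means one top-level product element, which the question does not state explicitly; the comments are additions for illustration.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Number of top-level product elements written to each output file -->
    <xsl:param name="productall" select="1000"/>
    <!-- Match the root element so that select="product" finds its children -->
    <xsl:template match="/productall">
        <!-- (position()-1) idiv 1000 yields 0 for items 1..1000, 1 for 1001..2000, and so on -->
        <xsl:for-each-group select="product" group-adjacent="(position()-1) idiv $productall">
            <xsl:result-document href="part.{current-grouping-key()}.xml">
                <productall>
                    <xsl:copy-of select="current-group()"/>
                </productall>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

Under that assumption, this should produce part.0.xml, part.1.xml and so on, each holding up to 1000 product elements.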

Upvotes: 1

Tim C

Reputation: 70648

The main problem here is that your template matches a product element, so when you reach the xsl:for-each-group the context node is that product element. You then select product elements, yet these are not children of the current element but of its type child elements. So you need to do this:

<xsl:for-each-group select="type/product" group-adjacent="(position()-1) idiv $productall">

However, you say you want multiple XML files with 1000 records each regardless of type, but the current XSLT does the grouping for each main product separately, meaning you will get duplicate file names.

Perhaps you should include the main product type in the file name?

<xsl:result-document href="part.{../../@type}.{current-grouping-key()}.xml">
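
Put together, the per-type variant might look like the sketch below. It rests on two assumptions that are not confirmed by the question: each type value appears on only one main product element (otherwise two result documents would get the same file name), and the empty text() template is an addition of mine, there only to stop the built-in templates from copying stray text to the main output.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="productall" select="1000"/>
    <!-- Assumption: swallow text nodes that the built-in templates would otherwise output -->
    <xsl:template match="text()"/>
    <xsl:template match="/productall/product[@type]">
        <!-- Group the inner product elements of this main product, 1000 per file -->
        <xsl:for-each-group select="type/product" group-adjacent="(position()-1) idiv $productall">
            <!-- ../../@type is the type attribute of the enclosing main product -->
            <xsl:result-document href="part.{../../@type}.{current-grouping-key()}.xml">
                <productall>
                    <xsl:copy-of select="current-group()"/>
                </productall>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

With the sample input this would give names such as part.electronics.0.xml and part.cloths.0.xml.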

Or if you really did want to do it regardless of main product type, you should change your main template to match productall instead.

Try this XSLT

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="productall" select="1000"></xsl:param>
    <xsl:template match="/productall">
        <xsl:for-each-group select="product/type/product" group-adjacent="(position()-1) idiv $productall">
            <xsl:result-document href="part.{current-grouping-key()}.xml">
                <productall>
                    <xsl:copy-of select="current-group()"></xsl:copy-of>
                </productall>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 1
