Rodders

Reputation: 2435

XSLT Split XML file using Skip and Take like approach

Our orders come in as XML. One of our customers is now sending in huge orders for Christmas (25 MB XML files), and this pretty much grinds our system to a halt! I need to find a decent way to split such a file into several files of x size or y number of orders. I can do this with a simple console app, but I am wondering whether it can be done with XSLT.

For this example, I want to select elements two at a time. Is there a skip-and-take style method for this?

Example order file:

<order>
    <customerName>Customer 1</customerName>
    <orderID>001</orderID>
    <orderItem>
        <itemID>001</itemID>
        <quantity>12</quantity>
    </orderItem>
    <orderItem>
        <itemID>002</itemID>
        <quantity>15</quantity>
    </orderItem>
    <orderItem>
        <itemID>003</itemID>
        <quantity>120</quantity>
    </orderItem>
    <orderItem>
        <itemID>004</itemID>
        <quantity>1223</quantity>
    </orderItem>
    <orderItem>
        <itemID>005</itemID>
        <quantity>22</quantity>
    </orderItem>
    <orderItem>
        <itemID>006</itemID>
        <quantity>78</quantity>
    </orderItem>
</order>

I want to split it into XML documents containing 2 orderItems each:

File 1:

<order>
    <customerName>Customer 1</customerName>
    <orderID>001</orderID>
    <orderItem>
        <itemID>001</itemID>
        <quantity>12</quantity>
    </orderItem>
    <orderItem>
        <itemID>002</itemID>
        <quantity>15</quantity>
    </orderItem>
</order>

File 2:

<order>
    <customerName>Customer 1</customerName>
    <orderID>001</orderID>
    <orderItem>
        <itemID>003</itemID>
        <quantity>120</quantity>
    </orderItem>
    <orderItem>
        <itemID>004</itemID>
        <quantity>1223</quantity>
    </orderItem>
</order>

File 3:

<order>
    <customerName>Customer 1</customerName>
    <orderID>001</orderID>
    <orderItem>
        <itemID>005</itemID>
        <quantity>22</quantity>
    </orderItem>
    <orderItem>
        <itemID>006</itemID>
        <quantity>78</quantity>
    </orderItem>
</order>

Upvotes: 1

Views: 213

Answers (2)

Peter

Reputation: 1796

If you apply this XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:strip-space elements="*"/>
<xsl:output indent="yes" method="xml"/>

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

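<!-- every odd-numbered orderItem starts a new group of two -->
<xsl:template match="orderItem[position() mod 2 = 1]">
    <order>
        <!-- copy this orderItem and the one immediately after it, if any -->
        <xsl:copy-of select=".|following-sibling::orderItem[1]"/>
    </order>
</xsl:template>

<!-- even-numbered orderItems were already copied above, so suppress them -->
<xsl:template match="orderItem"/>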

</xsl:stylesheet>

to your source XML you get this output XML:

<?xml version="1.0" encoding="UTF-8"?>
<order>
<customerName>Customer 1</customerName>
<orderID>001</orderID>
<order>
    <orderItem>
        <itemID>001</itemID>
        <quantity>12</quantity>
    </orderItem>
    <orderItem>
        <itemID>002</itemID>
        <quantity>15</quantity>
    </orderItem>
</order>
<order>
    <orderItem>
        <itemID>003</itemID>
        <quantity>120</quantity>
    </orderItem>
    <orderItem>
        <itemID>004</itemID>
        <quantity>1223</quantity>
    </orderItem>
</order>
<order>
    <orderItem>
        <itemID>005</itemID>
        <quantity>22</quantity>
    </orderItem>
    <orderItem>
        <itemID>006</itemID>
        <quantity>78</quantity>
    </orderItem>
</order>
</order>

The question is whether your system supports this kind of split. The one I am working with does, and it adds <?xml version="1.0" encoding="UTF-8"?> automatically, so the split orders become separate XML documents that get processed one by one.
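The output above keeps customerName and orderID only once, outside the groups, whereas the question asks for them in every chunk. A minimal sketch of a variant that repeats them and takes the chunk size as a parameter might look like the following. The parameter name size, the chunk mode, and the wrapper element orders are assumptions made for illustration, not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:strip-space elements="*"/>
<xsl:output indent="yes" method="xml"/>

<!-- hypothetical parameter: number of orderItem elements per chunk -->
<xsl:param name="size" select="2"/>

<!-- hypothetical wrapper element so the result stays a single well-formed document -->
<xsl:template match="/order">
    <orders>
        <!-- "skip": pick the first orderItem of each chunk (items 1, 1 + size, 1 + 2*size, ...) -->
        <xsl:apply-templates select="orderItem[(position() - 1) mod $size = 0]" mode="chunk"/>
    </orders>
</xsl:template>

<xsl:template match="orderItem" mode="chunk">
    <order>
        <!-- repeat the header: every child of the source order that is not an orderItem -->
        <xsl:copy-of select="../*[not(self::orderItem)]"/>
        <!-- "take": this orderItem plus the next size - 1 orderItem siblings -->
        <xsl:copy-of select=".|following-sibling::orderItem[position() &lt; $size]"/>
    </order>
</xsl:template>

</xsl:stylesheet>
```

With the example order file and the default size of 2, each inner order element should correspond to one of the three files requested in the question. Writing them to separate files still has to be done by the surrounding system or by an extension element.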

Upvotes: 0

Dimitre Novatchev

Reputation: 243529

My understanding is that you are limited to XSLT 1.0.

In this case you can use the MVP-XML project and its EXSLT.NET component.

More specifically, you can use the <exsl:document> extension element to generate multiple output documents from the same transformation.
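As a rough sketch of that idea, assuming the processor honours exsl:document as described in the EXSLT common module, a stylesheet along these lines would write each pair of orderItem elements to its own file. The size parameter and the order-part-N.xml file names are hypothetical, and how the output files are actually created depends on how the EXSLT.NET transformation is set up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">

<!-- hypothetical parameter: number of orderItem elements per output file -->
<xsl:param name="size" select="2"/>

<xsl:template match="/order">
    <!-- iterate over the first orderItem of each chunk -->
    <xsl:for-each select="orderItem[(position() - 1) mod $size = 0]">
        <!-- position() here is the chunk number; the file name pattern is an assumption -->
        <exsl:document href="order-part-{position()}.xml" method="xml" indent="yes">
            <order>
                <!-- repeat customerName, orderID and any other non-item children -->
                <xsl:copy-of select="../*[not(self::orderItem)]"/>
                <!-- this orderItem plus the next size - 1 orderItem siblings -->
                <xsl:copy-of select=".|following-sibling::orderItem[position() &lt; $size]"/>
            </order>
        </exsl:document>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The main result document of this transformation is empty; all content goes to the files named in the href attribute.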

Upvotes: 2
