user301016

Reputation: 2237

How to Optimize XSLT for Referencing data in other XMLs

The input XML file contains static columns along with columns that reference data in other files. Each reference gets its own row in the input XML, and all rows that belong to the same record share the same ID or UID.

The output file must combine all references and relations for a given ID or UID into a single record.

I wrote an XSLT for this transformation. It is fast when the row count is small (fewer than 100 to 200 rows), but as the count grows, generating the output XML takes a long time (around 30 minutes for 1000 rows).

I am using

<xsl:for-each select="z:row/@ID[generate-id() = generate-id(key('UniqueID',.))]">

in the XSLT because, for each ID, it has to check the input rows for multiple references (such as Section) and relations (such as Child) and populate them as columns.

Input Raw XML File.

<xml xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:z="#RowsetSchema">
<rs:data>
    <z:row UID="PARENT_001_1221AD_A878" GroupID="" GroupRel="" ID="37" Name="Outer Asset Details" RelProduct="Line1" RelUID="CHILD1_101_9899_9POOU99" RelName="CHILD1" RelType="Child" Size="22"/>
    <z:row UID="PARENT_001_1221AD_A878" GroupID="" GroupRel="" ID="37" Name="Outer Asset Details" RelProduct="Line1" RelUID="CHILD2_201_5646546_9890PBS" RelName="CHILD1" RelType="Child" Size="22"/>
    <z:row UID="PARENT_001_1221AD_A878" GroupID="" GroupRel="" ID="37" Name="Outer Asset Details" RelProduct="Line1" RelUID="SEC_999_99565_998AFSD" RelName="Hydraulic Section" RelType="Section" Size="22"/>
</rs:data>
</xml>

Child.xml

<Child xsi:noNamespaceSchemaLocation="../XSD/Child.xsd" FILE="Child" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Row UID="CHILD1_101_9899_9POOU99">
    <Name>CHILD1</Name>
    <Description>This has details about the Hydraulic sections of the automobile</Description>
</Row>
<Row UID="CHILD2_201_5646546_9890PBS">
    <Name>CHILD2</Name>
    <Description>This has details about the manual sections of the automobile</Description>
</Row>
</Child>

Section.xml

<Section xsi:noNamespaceSchemaLocation="../XSD/Section.xsd" FILE="Section" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Row UID="SEC_999_99565_998AFSD">
    <Name>Hydraulic Section</Name>
    <Description>This has details about the Sections in which the Hydraulic Systems are used.</Description>
</Row>
</Section>

XSLT File

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" exclude-result-prefixes="s dt z rs msxsl" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="UniqueID" match="z:row/@ID" use="."/>
<xsl:template match="/">
    <Parent xsi:noNamespaceSchemaLocation="../XSD/Parent.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" FILE="Parent">
        <xsl:for-each select="xml">
            <xsl:apply-templates select="rs:data"/>
        </xsl:for-each>
    </Parent>
</xsl:template>
<xsl:template match="rs:data">
    <xsl:for-each select="z:row/@ID[generate-id() = generate-id(key('UniqueID',.))]">
        <xsl:variable name="FRId">
            <xsl:value-of select="current()"/>
        </xsl:variable>
        <xsl:variable name="curNSet" select="//z:row[@ID=$FRId]"/>
        <xsl:copy-of select="current()"/>
        <Record>
            <xsl:attribute name="UID"><xsl:value-of select="$curNSet/@UID"/></xsl:attribute>
            <xsl:element name="Size">
                <xsl:value-of select="$curNSet/@Size"/>
            </xsl:element>
            <xsl:element name="Child">
                <xsl:apply-templates select="$curNSet[@RelType='Child']" mode="Relations">
                    <xsl:with-param name="RelType" select="'Child'"/>
                    <xsl:with-param name="DstFileName" select="'../Files/Child.xml'"/>
                </xsl:apply-templates>
            </xsl:element>
            <xsl:element name="Section">
                <xsl:apply-templates select="$curNSet[@RelType='Section']" mode="References">
                    <xsl:with-param name="RelType" select="'Section'"/>
                    <xsl:with-param name="DstFileName" select="'../Files/Section.xml'"/>
                </xsl:apply-templates>
            </xsl:element>
        </Record>
    </xsl:for-each>
</xsl:template>
<xsl:template match="z:row" mode="Relations">
    <xsl:param name="RelType"/>
    <xsl:param name="DstFileName"/>
    <xsl:element name="{$RelType}">
        <xsl:attribute name="DestinationKey"><xsl:value-of select="@RelUID"/></xsl:attribute>
        <xsl:attribute name="RelFilePath"><xsl:value-of select="$DstFileName"/></xsl:attribute>
        <xsl:attribute name="SequenceNumber"><xsl:value-of select="position()"/></xsl:attribute>
        <xsl:value-of select="@RelName"/>
    </xsl:element>
</xsl:template>
<xsl:template match="z:row" mode="References">
    <xsl:param name="DstFileName"/>
    <xsl:attribute name="DestinationKey"><xsl:value-of select="@RelUID"/></xsl:attribute>
    <xsl:attribute name="RelFilePath"><xsl:value-of select="$DstFileName"/></xsl:attribute>
    <xsl:attribute name="SequenceNumber"><xsl:value-of select="position()"/></xsl:attribute>
    <xsl:value-of select="@RelName"/>
</xsl:template>
</xsl:stylesheet>

Output.xml

<Parent xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../XSD/Parent.xsd" FILE="Parent" ID="37">
<Record UID="PARENT_001_1221AD_A878">
    <Size>22</Size>
    <Child>
        <Child DestinationKey="CHILD1_101_9899_9POOU99" RelFilePath="../Files/Child.xml" SequenceNumber="1">CHILD1</Child>
        <Child DestinationKey="CHILD2_201_5646546_9890PBS" RelFilePath="../Files/Child.xml" SequenceNumber="2">CHILD1</Child>
    </Child>
    <Section DestinationKey="SEC_999_99565_998AFSD" RelFilePath="../Files/Section.xml" SequenceNumber="1">Hydraulic Section</Section>
</Record>
</Parent>

Please help me optimize the XSLT so that the output file is generated faster.

Upvotes: 0

Views: 324

Answers (1)

Martin Honnen

Reputation: 167696

Consider using

<xsl:key name="UniqueID" match="z:row" use="@ID"/>

then

<xsl:for-each select="z:row/@ID[generate-id() = generate-id(key('UniqueID',.))]">
        <xsl:variable name="FRId">
            <xsl:value-of select="current()"/>
        </xsl:variable>
        <xsl:variable name="curNSet" select="//z:row[@ID=$FRId]"/>

can be replaced with

<xsl:for-each select="z:row[generate-id() = generate-id(key('UniqueID', @ID))]">
        <xsl:variable name="FRId" select="@ID"/>

        <xsl:variable name="curNSet" select="key('UniqueID', @ID)"/>

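I am not sure you need the variable FRId at all, but defining it with a select attribute instead of a nested value-of certainly consumes fewer resources, because the nested form builds a result tree fragment for every group. The key lookup also avoids rescanning the whole document with //z:row[@ID=$FRId] once per group.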
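As a minimal, untested sketch of the grouping step on its own, assuming the input layout shown in the question (an `xml` root with `rs:data/z:row` children) and a simplified `Parent` wrapper: note that once the for-each iterates over `z:row` elements instead of `@ID` attributes, `current()` is the whole row, so the original `<xsl:copy-of select="current()"/>` would need to become `<xsl:copy-of select="@ID"/>` if copying only the attribute is still wanted. The emitted comment with the row count is purely illustrative.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rs="urn:schemas-microsoft-com:rowset"
    xmlns:z="#RowsetSchema"
    exclude-result-prefixes="rs z">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Index every row by its ID so a group lookup does not rescan the document -->
  <xsl:key name="UniqueID" match="z:row" use="@ID"/>

  <xsl:template match="/">
    <!-- Simplified wrapper; the xsi attributes from the question are omitted here -->
    <Parent FILE="Parent">
      <xsl:apply-templates select="xml/rs:data"/>
    </Parent>
  </xsl:template>

  <xsl:template match="rs:data">
    <!-- Muenchian grouping: keep only the first row of each ID group -->
    <xsl:for-each select="z:row[generate-id() = generate-id(key('UniqueID', @ID)[1])]">
      <!-- All rows sharing this ID, fetched from the key instead of //z:row[...] -->
      <xsl:variable name="curNSet" select="key('UniqueID', @ID)"/>
      <Record UID="{@UID}">
        <Size><xsl:value-of select="@Size"/></Size>
        <!-- Illustrative only: report how many rows belong to this group -->
        <xsl:comment><xsl:value-of select="count($curNSet)"/> rows in group</xsl:comment>
      </Record>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```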

To make

 <xsl:apply-templates select="$curNSet[@RelType='Child']" mode="Relations">

more efficient, define a key

<xsl:key name="rel" match="z:row" use="concat(@ID, '|', @RelType)"/>

then use

 <xsl:apply-templates select="key('rel', concat(@ID, '|', 'Child'))" mode="Relations">

Then use the same approach for the other apply-templates.
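Put together, the two keys might be combined roughly as follows. This is an untested sketch that assumes the input layout from the question, a simplified `Parent` wrapper, and that no ID value contains the `|` separator; the `Relations` and `References` templates follow the ones in the question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rs="urn:schemas-microsoft-com:rowset"
    xmlns:z="#RowsetSchema"
    exclude-result-prefixes="rs z">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Rows by ID (for grouping) and by ID plus relation type (for lookups) -->
  <xsl:key name="UniqueID" match="z:row" use="@ID"/>
  <xsl:key name="rel" match="z:row" use="concat(@ID, '|', @RelType)"/>

  <xsl:template match="/">
    <!-- Simplified wrapper; the xsi attributes from the question are omitted here -->
    <Parent FILE="Parent">
      <xsl:apply-templates select="xml/rs:data"/>
    </Parent>
  </xsl:template>

  <xsl:template match="rs:data">
    <!-- One iteration per distinct ID; the context node is the first row of the group -->
    <xsl:for-each select="z:row[generate-id() = generate-id(key('UniqueID', @ID)[1])]">
      <Record UID="{@UID}">
        <Size><xsl:value-of select="@Size"/></Size>
        <Child>
          <!-- Only the Child rows of this ID, straight from the composite key -->
          <xsl:apply-templates select="key('rel', concat(@ID, '|', 'Child'))" mode="Relations">
            <xsl:with-param name="RelType" select="'Child'"/>
            <xsl:with-param name="DstFileName" select="'../Files/Child.xml'"/>
          </xsl:apply-templates>
        </Child>
        <Section>
          <!-- Only the Section rows of this ID -->
          <xsl:apply-templates select="key('rel', concat(@ID, '|', 'Section'))" mode="References">
            <xsl:with-param name="DstFileName" select="'../Files/Section.xml'"/>
          </xsl:apply-templates>
        </Section>
      </Record>
    </xsl:for-each>
  </xsl:template>

  <!-- Emits one element per relation row, named after the relation type -->
  <xsl:template match="z:row" mode="Relations">
    <xsl:param name="RelType"/>
    <xsl:param name="DstFileName"/>
    <xsl:element name="{$RelType}">
      <xsl:attribute name="DestinationKey"><xsl:value-of select="@RelUID"/></xsl:attribute>
      <xsl:attribute name="RelFilePath"><xsl:value-of select="$DstFileName"/></xsl:attribute>
      <xsl:attribute name="SequenceNumber"><xsl:value-of select="position()"/></xsl:attribute>
      <xsl:value-of select="@RelName"/>
    </xsl:element>
  </xsl:template>

  <!-- Writes the reference as attributes and text of the enclosing element -->
  <xsl:template match="z:row" mode="References">
    <xsl:param name="DstFileName"/>
    <xsl:attribute name="DestinationKey"><xsl:value-of select="@RelUID"/></xsl:attribute>
    <xsl:attribute name="RelFilePath"><xsl:value-of select="$DstFileName"/></xsl:attribute>
    <xsl:attribute name="SequenceNumber"><xsl:value-of select="position()"/></xsl:attribute>
    <xsl:value-of select="@RelName"/>
  </xsl:template>
</xsl:stylesheet>
```

Because `key()` returns its nodes in document order, `position()` inside the two moded templates should number the rows the same way as the filtered `$curNSet[@RelType='...']` selection did.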

All of the above is untested but should give you an idea.

Upvotes: 1
