jktravis

Reputation: 1568

Using Saxon-HE 9 to create multiple files with XSLT from a very large file

I'm using the information from "Splitting XML into multiple files with XSLT" to split an XML file that's 143 MB in size. If I manually copy a handful of records out of the file into a smaller document, the following template works as suggested in that post.

    <xsl:template match="/">
        <xsl:for-each select="Report_Data/Report_Entry">
            <xsl:result-document method="xml" href="record-{position()}.xml">
                <xsl:copy-of select="."/>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

The problem appears when I apply the XSLT to the larger document: no files are created, and only the XML declaration is written to standard output. With the small sample, the files are created and nothing is written to standard output.

    $ java -Xmx512M -jar /usr/local/bin/saxon9he.jar largefile.xml transform.xsl
    <?xml version="1.0" encoding="UTF-8"?>

I'm working in Cygwin and using 32-bit Java v1.7.0_55.

Adding the -t option results in the following output:

    Saxon-HE 9.6.0.5J from Saxonica
    Java version 1.7.0_55
    Stylesheet compilation time: 609.975948ms
    Processing file:/C:/Users/username/Documents/Projects/xml/largefile.xml
    Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
    Building tree for file:/C:/Users/username/Documents/Projects/largefile.xml using class net.sf.saxon.tree.tiny.TinyBuilder
    Tree built in 5.85596s (5855.960358ms)
    Tree size: 6942834 nodes, 55451426 characters, 0 attributes
    <?xml version="1.0" encoding="UTF-8"?>Execution time: 5.913265s (5913.265026ms)
    Memory used: 402449896
    NamePool contents: 40 entries in 37 chains. 8 URIs

Is the file just too large for the HE version of Saxon? Or is there some other setting or reason that explains why I'm getting only the XML declaration as output, rather than a collection of files?

Upvotes: 2

Views: 723

Answers (1)

Martin Honnen

Reputation: 167696

If no files are created and you don't get an out-of-memory error message, then I assume your path Report_Data/Report_Entry does not select anything. The usual reason for that is a default namespace declaration in the input file, e.g. <Report_Data xmlns="http://example.com/"><Report_Entry>...</Report_Entry></Report_Data>. In that case the elements are in a namespace, and unprefixed names in the path only match elements in no namespace. The easiest fix in XSLT 2.0 is to put xpath-default-namespace="http://example.com/" (with the namespace URI your input actually uses) on the xsl:stylesheet or xsl:transform element; then you don't need to change any paths in the stylesheet code you have posted.
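As a minimal sketch of that fix, here is the template from the question wrapped in a complete stylesheet. It assumes the input really does declare a default namespace; http://example.com/ is only the placeholder URI from the example above and has to be replaced with whatever appears in the xmlns attribute on your Report_Data element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- ASSUMPTION: http://example.com/ is a placeholder. Replace it with the
     namespace URI declared on the root element of the real input file. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://example.com/">

    <xsl:template match="/">
        <!-- Unprefixed element names in this path are now looked up in the
             namespace given by xpath-default-namespace -->
        <xsl:for-each select="Report_Data/Report_Entry">
            <!-- One output file per Report_Entry, numbered by position -->
            <xsl:result-document method="xml" href="record-{position()}.xml">
                <xsl:copy-of select="."/>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

If the root element of the large file has no xmlns declaration at all, this change will not help and the cause lies elsewhere, for example in different element names than the path expects.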

Upvotes: 3
