Reputation: 11
Issue summary: After compiling XSLT stylesheets into Templates objects with the Saxon 9.6 HE library and adding them to a HashMap, heap usage grows to about 330 MB, which is almost 70% of the heap (Xmx 512 MB, Xms 32 MB). When more items are added to the cart, usage passes the 512 MB limit and the JVM runs out of memory, generating phd and javacore files.
What we tried: Switching to Saxon 9.9 HE saved around 30 MB, but the heap still sits at about 300 MB overall.
Goal:
1) Reduce the memory footprint.
2) Is there any tuning in the Saxon libraries that would reduce the heap taken by the compiled templates?
3) We don't want to remove these HashMaps from memory, because the templates are needed for fast printing at the end of a cart transaction (as in a point-of-sale system). For that reason we haven't used getUnderlyingController.clearDocumentPool() in Saxon.
Code details:
Saxon initialization in constructor

    package com.device.jpos.posprinter.receipts;

    import java.io.StringReader;
    import java.util.HashMap;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Templates;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerConfigurationException;
    import javax.xml.transform.TransformerFactory;
    import net.sf.saxon.TransformerFactoryImpl;
    import com.device.jpos.posprinter.receipts.saxon.SaxonFunctions;

    public class ReceiptXSLTTemplateManager
    {
        // raw XSLT source text, keyed by receipt type
        private HashMap<String, String> xsltTemplates = null;
        // compiled stylesheets, keyed by receipt type
        private HashMap<String, Templates> xsltTransTemplates = null;
        private TransformerFactory transformerFact = null;
        // assigned in the constructor
        private String xsltProcessor = null;

        public ReceiptXSLTTemplateManager( String xsltProcessor )
        {
            this.xsltProcessor = xsltProcessor;
            // setTransformerFactory() is defined elsewhere in this class (not shown)
            setTransformerFactory();
            xsltTemplates = new HashMap<String, String>();
            xsltTransTemplates = new HashMap<String, Templates>();
            // create an instance of TransformerFactory
            transformerFact = javax.xml.transform.TransformerFactory.newInstance();
            if ( transformerFact instanceof TransformerFactoryImpl )
            {
                // register the extension functions on the Saxon configuration
                TransformerFactoryImpl tFactoryImpl = (TransformerFactoryImpl) transformerFact;
                net.sf.saxon.Configuration saxonConfig = tFactoryImpl.getConfiguration();
                SaxonFunctions.register( saxonConfig );
            }
        }
    }
Compiling the XSLT and adding it to the HashMap

    public boolean setTransformer( String name )
    {
        if ( xsltTemplates.containsKey( name ) )
        {
            StringReader xsltReader = new StringReader( xsltTemplates.get( name ) );
            javax.xml.transform.Source xsltSource = new javax.xml.transform.stream.StreamSource( xsltReader );
            try
            {
                // compile the stylesheet and keep the compiled form for reuse
                Templates transTmpl = transformerFact.newTemplates( xsltSource );
                xsltTransTemplates.put( name, transTmpl );
                return true;
            }
            catch ( TransformerConfigurationException e )
            {
                // the stylesheet failed to compile
                logger.error( String.format( "Error creating XSLT transformer for receipt type = %s.", name ) );
            }
        }
        else
        {
            // no XSLT source is registered under this name
            logger.error( String.format( "Error creating XSLT transformer for receipt type = %s.", name ) );
        }
        return false;
    }
So, even though the XSL files are between 200 KB and 500 KB on disk, each one occupies between 5 and 15 MB in memory once compiled. We have 45 such files, and together they consume almost 70% of the JVM heap. Combined with other operations that use the heap, the result is an OutOfMemoryError from the JVM.
Memory Analyzer output from phd file (image link) :
Memory Analyzer showing hashmap entries and s9api transformation
Hashmap entries drilled down(image link)
The questions we have are the following:
1) Why would a stylesheet that is 200 KB to 500 KB on disk take 5 MB to 15 MB in memory once compiled?
2) What can be optimized in the way the templates are created before they are put into the HashMap with Saxon 9.6 HE? Or should we use another edition of Saxon in a particular way to overcome this memory consumption?
Please advise. Thank you for your valuable time!
Upvotes: 1
Views: 402
Reputation: 163458
Memory occupancy of compiled stylesheets has never been something that we've seriously looked into or seen as a problem -- except possibly when generating bytecode, which we now do "on demand" to prevent the worst excesses. The focus has always been on maximum execution speed, and this means creating some quite complex data structures, e.g. the decision tables to support template rule matching. There's also a fair bit of data retained solely in order to provide good run-time diagnostics.
At some time in the past we did make efforts to ensure that the actual stylesheet tree could be garbage collected once compiled, but I'm aware that there are now references into the tree that prevent this happening. I'm not sure how significant a factor this is.
If you were running Saxon-EE then you could experiment with exporting and re-importing the compiled stylesheet. This would force out the links to data structures used only transiently during compilation, which might save some memory.
Also, Saxon-EE does JIT compilation of template rules, so if there are many template rules that are never invoked because you only use a small part of a large XML vocabulary, then this would give a memory saving.
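A loosely related idea at the level of whole stylesheets, which works with any JAXP processor including Saxon-HE: compile each stylesheet only on first use and hold the result through a SoftReference, so the JVM may discard it under memory pressure and recompile it later. This is only a sketch under assumptions, not something Saxon provides. The class LazyTemplatesCache and its methods are hypothetical names, and the trade-off is that a discarded stylesheet costs a recompilation on its next print.

```java
import java.io.StringReader;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper; not part of Saxon or of the code in the question. */
final class LazyTemplatesCache {
    private final TransformerFactory factory;
    // name -> XSLT source text (cheap to keep: 200-500 KB each in the question)
    private final Map<String, String> sources;
    // name -> softly held compiled stylesheet; the GC may clear these when memory is tight
    private final Map<String, SoftReference<Templates>> compiled = new HashMap<>();
    private int compileCount = 0;

    LazyTemplatesCache(TransformerFactory factory, Map<String, String> sources) {
        this.factory = factory;
        this.sources = sources;
    }

    /** Returns the compiled stylesheet, compiling it on first use or after it was cleared. */
    synchronized Templates get(String name) {
        SoftReference<Templates> ref = compiled.get(name);
        Templates templates = (ref == null) ? null : ref.get();
        if (templates == null) {
            String xslt = sources.get(name);
            if (xslt == null) {
                throw new IllegalArgumentException("No stylesheet registered for " + name);
            }
            try {
                templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
            } catch (TransformerConfigurationException e) {
                throw new IllegalStateException("Cannot compile stylesheet " + name, e);
            }
            compiled.put(name, new SoftReference<>(templates));
            compileCount++;
        }
        return templates;
    }

    /** Number of compilations so far; useful for checking how often recompilation happens. */
    synchronized int getCompileCount() {
        return compileCount;
    }
}
```

Callers would replace xsltTransTemplates.get( name ) with cache.get( name ). Whether soft references are cleared early enough to avoid the OutOfMemoryError depends on the JVM and its settings, so this would need measuring on the target system.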
If your 45 stylesheets have overlapping content, then moving these shared components into separately compiled XSLT 3.0 packages would be useful.
Check that you don't import the same stylesheet module at multiple precedence levels. I've seen that lead to gross inefficiencies in the past.
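One way to run that check mechanically is to walk the xsl:import tree from each top-level stylesheet and count how often each module is reached. The sketch below is a rough illustration that uses only the JDK's DOM parser. ImportAudit and countImports are hypothetical names, and it assumes each href matches a key in the supplied map exactly (no relative URI resolution) and that xsl:include is ignored.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper; assumes hrefs are used verbatim as keys of the module map. */
final class ImportAudit {
    private static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    /** Returns href -> number of times that module is reached via xsl:import from root. */
    static Map<String, Integer> countImports(String root, Map<String, String> modules) {
        Map<String, Integer> counts = new TreeMap<>();
        walk(root, modules, counts, new ArrayDeque<>());
        return counts;
    }

    private static void walk(String name, Map<String, String> modules,
                             Map<String, Integer> counts, Deque<String> path) {
        String text = modules.get(name);
        if (text == null || path.contains(name)) {
            return; // module not supplied, or a cycle: stop descending
        }
        path.push(name);
        for (String href : importedHrefs(text)) {
            counts.merge(href, 1, Integer::sum);
            walk(href, modules, counts, path);
        }
        path.pop();
    }

    /** Collects the href attribute of every xsl:import element in one module. */
    private static List<String> importedHrefs(String xslt) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xslt)));
            NodeList nodes = doc.getElementsByTagNameNS(XSL_NS, "import");
            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                hrefs.add(((Element) nodes.item(i)).getAttribute("href"));
            }
            return hrefs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Unparseable stylesheet module", e);
        }
    }
}
```

Any module whose count is greater than 1 is imported along more than one path and is worth a closer look.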
Meanwhile I've logged an issue at https://saxonica.plan.io/issues/4335 as a reminder to look at this next time we get a chance.
Upvotes: 1