Dattatray Satpute

Reputation: 263

Performance issue with xalan api

The following method takes 33 seconds to execute 10,000 iterations. CachedXPathAPI is org.apache.xpath.CachedXPathAPI, and I am using xalan-2.7.0.jar. Can anyone help me reduce the execution time? If we increase the load to, say, 40,000 iterations, it takes 10 minutes. The whole readXMLData method is called from a for loop.

public static Hashtable<String, NodeList> readXMLData(CachedXPathAPI cashedXPath, org.w3c.dom.Document doc, String nodePath, int nodeInstance) throws Exception
{
    
    Hashtable<String, NodeList> input = null;

    try
    {
        NodeList rowNodes = cashedXPath.selectNodeList(doc, nodePath);
        // NodeList rowNodes = XPathAPI.selectNodeList( doc, nodePath);
        if (rowNodes == null)
            return null;

        if (rowNodes.getLength() <= 0)
            return null;

        Element rowNode = (Element) rowNodes.item(nodeInstance);
        if (rowNode == null)
            return null;

        NodeList rowElements = rowNode.getChildNodes();
        if (rowElements == null)
            return null;

        input = new Hashtable<String, NodeList>();

        for (int elementIndex = 0; elementIndex < rowElements.getLength(); elementIndex++)
        {
            Node rowElement = rowElements.item(elementIndex);

            if (rowElement.getNodeType() == Node.ELEMENT_NODE)
            {
                Element elem = (Element) rowElement;
                String name = elem.getNodeName();

                if (elem.hasChildNodes())
                {
                    NodeList child = elem.getChildNodes();
                    if (child != null)
                    {
                        input.put(name, child);
                    }
                } else if (elem.hasAttributes())
                {
                    input.put(name, (NodeList) rowElement);
                }
            }
        }

        return input;

    } catch (TransformerException ex)
    {
        throw new Exception("readXMLData (TransformerException): " + ex.getMessage());
    } catch (Exception ex)
    {
        throw new Exception("readXMLData (Exception): " + ex.getMessage());
    }
    
}

Upvotes: 0

Views: 449

Answers (1)

Mr R

Reputation: 794

Firstly, I would use HashMap rather than Hashtable, and declare the method's return type as Map<String, NodeList> (**).

The signature of your method suggests you might be doing something like this (or at least processing the same nodePath in a loop over and over):

readXMLData(cashedXPath, doc, nodePath, 1);
readXMLData(cashedXPath, doc, nodePath, 2);
readXMLData(cashedXPath, doc, nodePath, 3);
readXMLData(cashedXPath, doc, nodePath, 4);
readXMLData(cashedXPath, doc, nodePath, 5);

If that is the case, then the first obvious thing is that the selectNodeList call is being run unnecessarily over and over. It only needs to be run once for a set of row nodes with the same nodePath.

NodeList rowNodes = cashedXPath.selectNodeList(doc, nodePath);

Presumably that call has to visit a significant portion of the document, and it evaluates every match of the XPath even if you are only using one of them (so the more matches there are in the document, the more wasteful this is).
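As a rough illustration of evaluating the path once and then indexing into the result, here is a minimal sketch. To keep it runnable without extra jars it uses the JDK's javax.xml.xpath instead of Xalan's CachedXPathAPI, so that part is a swap-in rather than your setup. The class name HoistedXPathSketch, the helper names, and the sample XML are all made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: evaluates the XPath once, then processes each row by index.
final class HoistedXPathSketch {

    // Parse an XML string into a DOM document (helper for the example only).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluate nodePath a single time and return all matching row nodes.
    // Assumption: JDK XPath is used here in place of CachedXPathAPI.selectNodeList.
    static NodeList selectRows(Document doc, String nodePath) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(nodePath, doc, XPathConstants.NODESET);
    }

    // Per-row work that no longer re-runs the XPath: collect child element names.
    static List<String> childElementNames(Element row) {
        List<String> names = new ArrayList<>();
        NodeList children = row.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<rows><row><a>1</a><b>2</b></row><row><a>3</a></row></rows>");
        NodeList rows = selectRows(doc, "/rows/row"); // evaluated once, outside the loop
        for (int i = 0; i < rows.getLength(); i++) {
            System.out.println(i + " -> " + childElementNames((Element) rows.item(i)));
        }
    }
}
```

In your code the equivalent change would be to call cashedXPath.selectNodeList(doc, nodePath) once before the for loop and pass the resulting NodeList (plus the index) into the per-row method.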

Alternatively, if this isn't significant, I would comment out everything else and see how much of your total processing time that call accounts for. If a lot of time is spent in the rest of the code, do the same there and break it down.

Another thing to consider is how much memory is being used. (**) Each time the method processes the rowElements, it keeps some of that data from the DOM in a map. If you are keeping what is returned, then you are keeping references to what are effectively temporary data structures, so the memory usage goes up and up, and this might be causing a lot of garbage collection. One solution might be to increase the memory the app can run in. Another might be to work out which parts of that DOM you really need and keep only the values from it (e.g. not the DOM structure, but perhaps the leaf content, and not any of the DOM objects), so that all the temporary structure related to the XPath result can be released and GC'd.
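A minimal sketch of the "keep the values, not the DOM" idea, assuming the caller only needs the text of each child element. The class LeafValueSketch, the method leafValues, and the sample XML are hypothetical names for illustration; the map holds plain Strings rather than NodeList objects.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: copies leaf text out of a row element into a plain map.
final class LeafValueSketch {

    // Parse an XML string into a DOM document (helper for the example only).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Map each child element name to its text content.
    // Assumption: child element names are unique within a row; a later
    // duplicate would overwrite an earlier one, as in the original put() calls.
    static Map<String, String> leafValues(Element row) {
        Map<String, String> values = new HashMap<>();
        NodeList children = row.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                // getTextContent() returns a String, so no DOM node is stored in the map.
                values.put(child.getNodeName(), child.getTextContent());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<row><name>Ann</name><age>30</age></row>");
        System.out.println(leafValues(doc.getDocumentElement()));
    }
}
```

Whether this helps depends on what the caller does with the returned NodeList values today; if it needs attributes or nested structure, the map's value type would have to be adapted.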

Upvotes: 1
