Reputation: 1634
I've read some articles on the advantages of using a SAX parser for parsing XML files in Java over using DOM. The one that appeals to me the most (as discussed here) is that
SAX is suitable for large XML files, and the SAX parser does not load the XML file as a whole into memory.
But now that I've written a SAX parser to extract the entities from a large XML file of almost 1.4 GB, it throws the following exception:
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; The parser has encountered more than "64,000" entity expansions in this document; this is the limit imposed by the application.
What is the problem with memory if the file is not loaded into memory as a whole?
How can I resolve the issue?
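A minimal sketch of the kind of SAX setup described above (the file name large.xml and the empty handler body are placeholders, not the actual code):

    import java.io.File;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class SaxExample {
        public static void main(String[] args) throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            // The parser pushes events into the handler; the document is never held in memory as a whole.
            parser.parse(new File("large.xml"), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    // react to each element as it streams past
                }
            });
        }
    }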
Upvotes: 2
Views: 1426
Reputation: 2092
You can also think about using StAX.
SAX is event-driven and serial. It can handle large XML, but it uses a lot of CPU.
DOM loads the complete document into memory.
StAX is a more recent API. It streams over the XML and can be seen as a cursor or iterator over the document. It has the advantage that you can skip elements you don't need (attributes, tags, ...). Used properly, it takes far less CPU.
https://docs.oracle.com/javase/tutorial/jaxp/stax/why.html
With SAX, the XML pushes events to you.
With StAX, you pull events from the XML.
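A minimal cursor-style StAX sketch (the file name large.xml and the element name entry are placeholders):

    import java.io.FileInputStream;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class StaxExample {
        public static void main(String[] args) throws Exception {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            try (FileInputStream in = new FileInputStream("large.xml")) {
                XMLStreamReader reader = factory.createXMLStreamReader(in);
                // You pull events from the stream instead of having them pushed to a handler.
                while (reader.hasNext()) {
                    int event = reader.next();
                    // Only react to the start tags you care about; everything else is skipped.
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "entry".equals(reader.getLocalName())) {
                        System.out.println(reader.getElementText());
                    }
                }
                reader.close();
            }
        }
    }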
Upvotes: 0
Reputation: 9776
This is not a memory problem: the parser enforces a safety limit (64,000 by default) on entity expansions. Raise the entity expansion limit with a JVM parameter:
-DentityExpansionLimit=1000000
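For example, when launching the application (my-app.jar is a placeholder for your own jar):

java -DentityExpansionLimit=1000000 -jar my-app.jar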
Upvotes: 3