D.Q.

Reputation: 547

What's the fastest way to deserialize and serialize an XML document?

I'm using Java 6 and processing some fairly large XML documents. I need to parse them, modify some values, and then serialize them back to disk.

I used org.w3c.dom to deserialize the XML documents and modify some attribute values, and then used a JAXP Transformer to serialize the modified DOM document. But I found that it is really slow.

So I'm wondering: is there a more efficient way to serialize the DOM document, or to handle large XML documents?

UPDATES:

I used a timer to record how long each part takes. Below is the serialization step:

// Serialize the updated DOM.
// Assumed from usage: `dom` is the modified org.w3c.dom.Document, `doc` is the output java.io.File.
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();

// Time only the transform call, not the factory setup.
long t0 = System.currentTimeMillis();
DOMSource source = new DOMSource(dom);
StreamResult result = new StreamResult(doc);
transformer.transform(source, result);
long t1 = System.currentTimeMillis();

Reporter.log("Finished serializing " + doc.getAbsolutePath() + " in " + ((t1 - t0) / 1000.0f) + " s.", true);

And the log shows:

....
Finished serializing C:\Usrs\Adminstrator\Documents\Docs\InitialDocument_1.xml in 53 s.

Upvotes: 0

Views: 1926

Answers (4)

Michael Kay

Reputation: 163342

Taking 50 seconds to serialize 90 KB is crazy. DOM is slow, but not that slow; something is going wrong, and I can't tell what.

It's seriously misleading to describe 90 KB as "large", however, and that misdescription may have influenced some of the answers.

How fast do you need it to be? My guess is that standard transformation mechanisms such as XSLT are fast enough.

The other relevant factor is, what exactly are the changes you need to make to the content? A lot depends on the complexity of the logic needed.
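
For a simple change such as overwriting an attribute value, the XSLT route could look roughly like the sketch below: an identity transform with one extra template, run through the same JAXP API used in the question. The attribute name "status", the parameter "newValue", and the class name are hypothetical; the real stylesheet depends on what actually needs to change.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltAttributeUpdate {
    // Identity transform plus one override: every "status" attribute
    // (hypothetical name) is rewritten to the value of the $newValue parameter.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='newValue'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='@status'>"
      + "<xsl:attribute name='status'><xsl:value-of select='$newValue'/></xsl:attribute>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String update(String xml, String newValue) throws TransformerException {
        // Compile the stylesheet, then stream the input through it.
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        t.setParameter("newValue", newValue);
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(update("<docs><doc id='1' status='draft'>text</doc></docs>", "final"));
    }
}
```

For files on disk, a StreamSource and StreamResult built from java.io.File objects can replace the in-memory strings, so no DOM tree has to be built by hand.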

Upvotes: 1

srini.venigalla

Reputation: 5145

Have you tried using the SAX interface?

If you need really fast processing of very large XML documents, you have to avoid the DOM structure. Take a look at non-DOM parsers such as this one:

http://vtd-xml.sourceforge.net/
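
As a rough illustration of the SAX route (plain JDK SAX, not VTD-XML), the sketch below puts an XMLFilterImpl between the parser and an identity Transformer, so attribute values are rewritten as events stream past and no DOM tree is built. The class name, the attribute name passed in, and the sample input are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class SaxAttributeFilter extends XMLFilterImpl {
    private final String attrName;
    private final String newValue;

    SaxAttributeFilter(XMLReader parent, String attrName, String newValue) {
        super(parent);
        this.attrName = attrName;
        this.newValue = newValue;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Copy the attributes, overwrite the target one if present, pass the event on.
        AttributesImpl copy = new AttributesImpl(atts);
        int i = copy.getIndex(attrName); // lookup by qualified name
        if (i >= 0) {
            copy.setValue(i, newValue);
        }
        super.startElement(uri, localName, qName, copy);
    }

    static String rewrite(String xml, String attrName, String newValue) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        SaxAttributeFilter filter = new SaxAttributeFilter(reader, attrName, newValue);
        StringWriter out = new StringWriter();
        // The identity Transformer pulls SAX events through the filter and serializes them.
        TransformerFactory.newInstance().newTransformer().transform(
            new SAXSource(filter, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rewrite("<docs><doc id='1' status='draft'>text</doc></docs>", "status", "final"));
    }
}
```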

Upvotes: 2

Evgeniy Dorofeev

Reputation: 136042

The fastest way is StAX. The simplest way is JAXB.

Upvotes: 1

dkaustubh

Reputation: 454

You should consider using StAX. DOM is not suitable here. You can see a comparison here:

http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.6/tutorial/doc/SJSXP2.html

You can refer to the URL below for sample code:

http://docs.oracle.com/javaee/5/tutorial/doc/bnbfl.html
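
A minimal sketch of what the StAX approach might look like for this use case, assuming the change is "set one attribute to a new value": events are read one at a time, start elements are rebuilt with the updated attribute, and everything is written straight back out. The class name, the attribute name, and the sample input are hypothetical, not taken from the tutorial.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxAttributeRewrite {
    static String rewrite(String xml, String attrName, String newValue) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                // Copy the attributes, swapping in a new value where the local name matches.
                List<Attribute> attrs = new ArrayList<Attribute>();
                Iterator<?> it = start.getAttributes();
                while (it.hasNext()) {
                    Attribute a = (Attribute) it.next();
                    attrs.add(attrName.equals(a.getName().getLocalPart())
                        ? events.createAttribute(a.getName(), newValue)
                        : a);
                }
                event = events.createStartElement(start.getName(), attrs.iterator(), start.getNamespaces());
            }
            // All other events (text, end elements, comments) pass through unchanged.
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rewrite("<docs><doc id='1' status='draft'>text</doc></docs>", "status", "final"));
    }
}
```

For real files, the reader and writer can be created from buffered file streams instead of strings, so memory use stays roughly constant regardless of document size.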

Upvotes: 4
