
Reputation: 18180

Editing a large xml file 'on the fly'

I have an XML file stored in a database BLOB that a user downloads through a Spring/Hibernate web application. After Hibernate retrieves it as a byte[], but before it is written to the output stream, I need to edit part of the XML (a single node with two child nodes and an attribute).

My concern is that some files are large (40 MB or more), so I don't want to hold the whole file in memory, edit it, and then pass it to the user via the output stream. Is there a way to edit it 'on the fly'?

byte[] b = blobRepository.get(blobID).getFile();
// What can I do here?
ServletOutputStream out = response.getOutputStream();
out.write(b);

Upvotes: 3

Views: 1182

Answers (2)

Arvik

Reputation: 2624

You can try the following:

  1. Enable binary data streaming in Hibernate (set hibernate.jdbc.use_streams_for_binary to true).
  2. Read the XML file as a binary stream with ent.getBlob().getBinaryStream().
  3. Process the input stream with an XSLT processor that supports streaming (e.g. Saxon), sending the output directly to the servlet OutputStream: javax.xml.transform.Transformer.transform(SAXSource, new StreamResult(response.getOutputStream()))
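
A minimal sketch of step 3, assuming the node to replace is a hypothetical <owner> element with an id attribute and two children (the element names, the class name XsltBlobRewriter, and the replacement values are illustrative and not from the question). It uses only the standard JAXP API. Whether the document is actually streamed or built into a tree first depends on the XSLT processor on the classpath; the JDK's built-in processor will most likely build the full tree, so this only shows the wiring.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;

public class XsltBlobRewriter {

    // Hypothetical stylesheet: copy everything unchanged (identity template),
    // but replace any <owner> element with a fixed replacement.
    private static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='owner'>"
      + "<owner id='42'><name>new name</name><email>new@example.com</email></owner>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Reads XML from 'in', applies the stylesheet, and writes the result to 'out'.
    public static void rewrite(InputStream in, OutputStream out) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.transform(new SAXSource(new InputSource(in)), new StreamResult(out));
    }
}
```

In the servlet this would be called as XsltBlobRewriter.rewrite(ent.getBlob().getBinaryStream(), response.getOutputStream()), so no byte[] of the whole file is ever created by the application code.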

Upvotes: 2

Edwin Buck

Reputation: 70999

You can use a SAX stream.

Parse the file with a SAX parser. As your handler receives SAX events, it passes the unchanged ones straight through to a second SAX handler that writes the XML output.

When you reach the part to be changed, the intermediary handler swallows the unwanted events and writes out the replacement events instead.

This has the advantage of not holding the entire file in memory as an intermediate representation (say DOM); however, if the transformation is complex, you might have to cache a number of items (sections of the document) in order to have them available for rearranged output. A complex enough transformation (one that could do anything) eventually turns into the overhead of DOM, but if you know you're ignoring a large portion of your document, you can save a lot of memory.
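
As a rough sketch of this approach, the intermediary can be an org.xml.sax.helpers.XMLFilterImpl subclass whose output is serialized by an identity Transformer. The element name owner, its id attribute, the two child elements, and the class name OwnerRewritingFilter are assumptions made up for illustration. Events inside the old element are dropped and a replacement is emitted in their place.

```java
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class OwnerRewritingFilter extends XMLFilterImpl {

    // Greater than 0 while we are inside the element being replaced.
    private int skipDepth = 0;

    public OwnerRewritingFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0) { skipDepth++; return; }   // nested inside the old element: drop
        if ("owner".equals(qName)) {                  // hypothetical element to replace
            skipDepth = 1;
            emitReplacement();
            return;
        }
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (skipDepth > 0) { skipDepth--; return; }   // closing tag of dropped content
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) super.characters(ch, start, length);
    }

    // Writes the new element: one attribute and two child elements.
    private void emitReplacement() throws SAXException {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "id", "id", "CDATA", "42");
        super.startElement("", "owner", "owner", atts);
        emitTextElement("name", "new name");
        emitTextElement("email", "new@example.com");
        super.endElement("", "owner", "owner");
    }

    private void emitTextElement(String name, String text) throws SAXException {
        super.startElement("", name, name, new AttributesImpl());
        super.characters(text.toCharArray(), 0, text.length());
        super.endElement("", name, name);
    }

    // Parses 'in' through the filter and serializes the filtered events to 'out'.
    public static void rewrite(InputStream in, OutputStream out) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // A Transformer created without a stylesheet performs an identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new SAXSource(new OwnerRewritingFilter(reader), new InputSource(in)),
                           new StreamResult(out));
    }
}
```

Only the depth counter is kept as state, so memory use does not grow with document size. The sketch filters element and character events only; comments or processing instructions inside the replaced element would need the same treatment.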

Upvotes: 2
