Reputation: 22011
Is it possible to efficiently append a line into a zip or gzip file?
I'm storing equity market data directly into the file system and I have around 40 different files which are being updated every 5ms.
What's the best way of doing this?
Upvotes: 1
Views: 260
Reputation: 1
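ZipOutputStream cannot write at an offset inside an existing archive, but you can use its write method to rewrite a whole entry with the updated content.
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Name of the entry inside the archive (not a path on your drive)
String entryName = "updated.txt";
// mWriter is assumed to be a Writer (e.g. a StringWriter) holding the full updated text
byte[] file = mWriter.toString().getBytes(StandardCharsets.UTF_8);
// Opening the FileOutputStream truncates /tmp/example.zip, so the archive is rebuilt from scratch
try (ZipOutputStream zout = new ZipOutputStream(new FileOutputStream("/tmp/example.zip"))) {
    zout.putNextEntry(new ZipEntry(entryName));
    // The offset and length index into the byte array, not into the zip file
    zout.write(file, 0, file.length);
    zout.closeEntry();
} catch (IOException e) {
    e.printStackTrace();
}
Convert the updated content to a byte array (String.getBytes as above, or Apache Commons IOUtils, import org.apache.commons.io.IOUtils) and write it as a new entry. The offset and length arguments of write select a range of that byte array, not a position in the archive, so you cannot seek to the line you want to edit; here the whole array is written, from 0 to file.length. The ZipEntry name is the entry's name inside the archive. Because new FileOutputStream truncates the existing zip, any other entries you want to keep have to be written again as well.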
Upvotes: 0
Reputation: 2739
java.nio.file.FileSystems:
http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
The zip file system provider introduced in the Java SE 7 release is an implementation of a custom file system provider. The zip file system provider treats a zip or JAR file as a file system and provides the ability to manipulate the contents of the file. The zip file system provider creates multiple file systems — one file system for each zip or JAR file.
TrueZip
http://truezip.java.net/
TrueZIP is a Java based plug-in framework for virtual file systems (VFS) which provides transparent access to archive files as if they were just plain directories
And remember: use memory to cache, reduce disk operations, and make the writing non-blocking.
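As a rough illustration of the zip file system provider mentioned above, here is a minimal sketch that appends one line to an entry inside a zip. The class name ZipFsAppend, the method appendLine and the entry name used in the usage note are made up for this example; it assumes Java 7 or later and that the provider honours StandardOpenOption.APPEND for existing entries.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of any library
public class ZipFsAppend {

    /** Appends one line to an entry inside a zip, creating the zip and the entry if needed. */
    public static void appendLine(Path zipFile, String entryName, String line) throws IOException {
        // The zip provider is addressed with a "jar:" URI wrapping the file URI
        URI uri = URI.create("jar:" + zipFile.toUri());
        // "create" = "true" lets the provider create the archive if it does not exist yet
        Map<String, String> env = Collections.singletonMap("create", "true");
        try (FileSystem zipFs = FileSystems.newFileSystem(uri, env)) {
            Path entry = zipFs.getPath(entryName);
            Files.write(entry, (line + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } // closing the file system is what actually updates the archive on disk
    }
}
```

Usage would look like ZipFsAppend.appendLine(Paths.get("/tmp/example.zip"), "quotes.txt", "some line"). Note that the archive is only updated when the file system is closed, so opening and closing it for every line is unlikely to keep up with updates every 5ms; batching several lines per open/close is the safer bet.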
Upvotes: 1
Reputation: 8552
One approach is to treat it like compressing data sent over a socket: keep a compressed stream open, keep writing to it, and flush compressed blocks to disk from time to time.
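A minimal sketch of that idea for gzip, assuming Java 7 or later. The class name GzipAppender is made up for this example. It relies on two things: GZIPOutputStream's syncFlush constructor flag, which makes flush() push pending compressed data to the file, and the fact that a gzip file may consist of several concatenated members, so reopening the file in append mode adds a new member instead of rewriting the old data.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any library
public class GzipAppender implements Closeable {
    private final Writer writer;

    public GzipAppender(File file) throws IOException {
        // append = true: a new gzip member is written after any existing data
        FileOutputStream out = new FileOutputStream(file, true);
        // syncFlush = true: flush() also flushes the compressor, not just the stream
        GZIPOutputStream gzip = new GZIPOutputStream(out, true);
        this.writer = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
    }

    /** Buffers one line in the compressor; nothing is guaranteed to hit the disk yet. */
    public void appendLine(String line) throws IOException {
        writer.write(line);
        writer.write('\n');
    }

    /** Pushes everything written so far to the file as compressed blocks. */
    public void flush() throws IOException {
        writer.flush();
    }

    /** Finishes the current gzip member and closes the file. */
    @Override
    public void close() throws IOException {
        writer.close();
    }
}
```

Flushing after every line hurts the compression ratio, so calling flush() on a timer or every few hundred lines is probably the better trade-off. A file written this way can be read back with GZIPInputStream, which continues into the following member when one ends.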
Upvotes: 0
Reputation: 2208
A complete cycle of editing a zip file (I mean read-modify-close) on every update will produce too much overhead. I think it is better to accumulate changes in memory and modify the target file at some reasonable rate (e.g. every 5 seconds or even less often).
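The accumulate-then-write idea could be sketched roughly like this. BufferedLineSink and its methods are made-up names, and the plain-file target is only a stand-in: the same flush() could write to a zip entry or a gzip stream instead. It assumes Java 8 for the lambda.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of any library
public class BufferedLineSink {
    private final Path target;
    private final StringBuilder pending = new StringBuilder();

    public BufferedLineSink(Path target) {
        this.target = target;
    }

    /** Cheap in-memory append; this is what the 5ms updates would call. */
    public synchronized void append(String line) {
        pending.append(line).append('\n');
    }

    /** Writes everything accumulated so far to the target in one disk operation. */
    public synchronized void flush() throws IOException {
        if (pending.length() == 0) {
            return; // nothing new since the last flush
        }
        Files.write(target, pending.toString().getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        pending.setLength(0);
    }

    /** Starts a background thread that calls flush() every periodSeconds seconds. */
    public ScheduledExecutorService startFlushing(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                flush();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

With 40 files you would keep one sink per file and call startFlushing(5) on each, or share one scheduler across all of them. In this simple version append() waits while a flush is writing to disk; swapping the buffer out before writing would avoid that at the cost of a little more code.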
Upvotes: 0