Thomas

Reputation: 1000

Efficient Stream of Zip Files in Java

Problem

In my service, users can download very large (5+ gigabyte) data packages as zip files. The current implementation locates all of the files within the data package, creates a new zip file, populates the zip file with copies of the files, and then streams it to the user.

This isn't scaling well with the larger data packages and I'm trying to find a way to increase the efficiency of this process. I have a proposed solution below, but I don't have experience with serving content and would like professional insight on the best way to go about this.

Attempted Solution

I think the best way is to avoid copying the actual bytes into the zip file up front. Instead, create a zip file of symlinks, and then copy the bytes while streaming the content. I've had issues with actually copying the bytes into the zip during transfer, and I don't know if this is still possible.

End Solution

I implemented the accepted answer from Alexey Ragozin below into SpeedBagIt, a library for efficiently streaming zip files conforming to the BagIt specification.

Upvotes: 3

Views: 6458

Answers (1)

Alexey Ragozin

Reputation: 8379

You can stream a large zip file directly through the HTTP connection.

java.util.zip.ZipOutputStream lets you zip data directly to a stream without intermediate storage.

Below is a snippet (ZipStreamer) that writes a collection of file links into a stream as a zip archive.

Of course, to stream 5 GiB you may need to tune the timeouts and thread pool of your servlet container.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

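public class ZipStreamer {

    public void streamZip(OutputStream os, Iterable<FileLink> entries) throws IOException {
        // try-with-resources closes the zip stream even if writing an entry fails
        try (ZipOutputStream zos = new ZipOutputStream(os)) {
            for (FileLink e : entries) {
                ZipEntry entry = new ZipEntry(e.getName());
                File file = e.getFile();
                entry.setTime(file.lastModified());
                zos.putNextEntry(entry);
                // Links that are not regular files (e.g. directories) become empty entries
                if (file.isFile()) {
                    // Close each input file as soon as its bytes have been copied
                    try (InputStream in = new FileInputStream(file)) {
                        copyBytes(zos, in);
                    }
                }
                zos.closeEntry();
            }
        }
    }

    private static void copyBytes(OutputStream dest, InputStream source) throws IOException {
        // Copy through a fixed-size buffer so memory use does not depend on file size
        byte[] buffer = new byte[8192];
        int read;
        while ((read = source.read(buffer)) != -1) {
            dest.write(buffer, 0, read);
        }
    }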

    public interface FileLink {

        public String getName();

        public File getFile();
    }
}
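
To show how the same idea might be wired to an actual HTTP response, here is a minimal sketch using the JDK's built-in com.sun.net.httpserver package in place of a servlet container (assuming Java 9 or later). The class name ZipDownloadSketch, the /download path, the package.zip file name and the Map of entry names to paths are all hypothetical choices for illustration, not part of the code above.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: zips files straight into an HTTP response body.
public class ZipDownloadSketch {

    // Writes each file into the zip as it is read; no temporary archive is created.
    public static void writeZip(OutputStream out, Map<String, Path> files) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Map.Entry<String, Path> f : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(f.getKey()));
                Files.copy(f.getValue(), zos); // streams the file into the current entry
                zos.closeEntry();
            }
        }
    }

    // Sends the zip as the response; the archive size is not known up front.
    public static void handle(HttpExchange exchange, Map<String, Path> files) throws IOException {
        exchange.getResponseHeaders().set("Content-Type", "application/zip");
        exchange.getResponseHeaders().set("Content-Disposition", "attachment; filename=\"package.zip\"");
        // A response length of 0 makes HttpServer use chunked transfer encoding
        exchange.sendResponseHeaders(200, 0);
        try (OutputStream body = exchange.getResponseBody()) {
            writeZip(body, files);
        }
    }

    // Starts a server that serves the given files at /download (path is an assumption).
    public static HttpServer start(int port, Map<String, Path> files) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/download", exchange -> handle(exchange, files));
        server.start();
        return server;
    }
}
```

Calling ZipDownloadSketch.start(8080, files) and requesting /download should begin delivering the archive as soon as the first entry is written. Because the response is chunked, the client gets no Content-Length and therefore cannot show a total size or percentage while downloading. In a servlet container the same writeZip call could be pointed at the response's output stream instead.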

Upvotes: 5
