sparrow

Reputation: 460

Decompress large binary files

I have a function to decompress large gzip files using the method below. There are times where I run into an OutOfMemoryError because the file is just too large. Is there a way I can optimize my code? I have read about breaking the file into smaller parts that fit into memory and decompressing those, but I don't know how to do that. Any help or suggestion is appreciated.

private static String decompress(String s) {
    String pathOfFile = null;

    try (BufferedReader reader = new BufferedReader(new InputStreamReader(new GZIPInputStream(new FileInputStream(s)), Charset.defaultCharset()))) {
        File file = new File(s);
        FileOutputStream fos = new FileOutputStream(file);

        String line;
        while ((line = reader.readLine()) != null) {
            fos.write(line.getBytes());
            fos.flush();
        }

        pathOfFile = file.getAbsolutePath();
    } catch (IOException e) {
        e.printStackTrace();
    }

    return pathOfFile;
}

The stacktrace:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.Arrays.copyOf(Arrays.java:3689)
        at java.base/java.util.ArrayList.grow(ArrayList.java:237)
        at java.base/java.util.ArrayList.ensureCapacity(ArrayList.java:217)

Upvotes: 0

Views: 499

Answers (1)

Karol Dowbecki

Reputation: 44932

Don't use Reader classes here; you don't need to process the output character by character or line by line. readLine() buffers everything up to the next line terminator, so if the decompressed data contains no line breaks it can pull the whole file into memory at once. Copy the raw bytes instead with the InputStream.transferTo() method (available since Java 9):

try (var in = new GZIPInputStream(new FileInputStream(inFile));
     var out = new FileOutputStream(outFile)) {
    in.transferTo(out);
}
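
transferTo() copies the data through a small fixed-size internal buffer, so heap usage stays constant no matter how large the decompressed output is.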

Also, you probably don't need to call flush() explicitly; doing it after every line is wasteful.
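
For completeness, here is a minimal sketch of how the whole method could look with this approach. It assumes the output goes to a separate path (derived here by stripping a .gz suffix) rather than reusing the input path s the way the original FileOutputStream does, and it leaves error handling to the caller:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class Decompressor {

    // Sketch only: derives the output path by stripping a ".gz" suffix so the
    // input file is not overwritten while it is still being read.
    private static String decompress(String s) throws IOException {
        String outPath = s.endsWith(".gz") ? s.substring(0, s.length() - 3) : s + ".out";

        try (var in = new GZIPInputStream(new FileInputStream(s));
             var out = new FileOutputStream(outPath)) {
            // Streams the decompressed bytes straight to the output file;
            // nothing is accumulated per line, so the heap stays small.
            in.transferTo(out);
        }

        return outPath;
    }
}

Called as decompress("/path/to/data.gz"), it returns the path of the uncompressed copy without ever holding more than a small buffer in memory.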

Upvotes: 2
