Reputation: 296
I have a program that loads data into a DB through named pipes, very cool. It has been running for about 2 years and accepts plain text files or gzip.
But now some zip files have appeared that I need to load, so I want to extend it. I can't get this to work, though: I'm getting an OutOfMemoryError.
(Of course, I'm running it with -Xms512M -Xmx2048M.)
Below is how I get the InputStream:
PipeLoader.java
protected BufferedReader getBufferedReader(File file, String compression) throws Exception {
    BufferedReader bufferedReader = null;
    if (compression.isEmpty()) {
        bufferedReader = new BufferedReader(new FileReader(file), BUFFER);
    } else if (compression.equalsIgnoreCase("gzip")) {
        InputStream fileStream = new FileInputStream(file);
        InputStream gzipStream = new GZIPInputStream(fileStream);
        // Works fine
        Reader reader = new InputStreamReader(gzipStream);
        bufferedReader = new BufferedReader(reader, BUFFER);
    } else if (compression.equalsIgnoreCase("zip")) {
        InputStream fileStream = new FileInputStream(file);
        ZipInputStream zipStream = new ZipInputStream(fileStream);
        zipStream.getNextEntry(); // For testing purposes I'm getting only the first entry
        Reader reader = new InputStreamReader(zipStream); // Works only with small zips
        bufferedReader = new BufferedReader(reader, BUFFER);
    }
    return bufferedReader;
}
I also tried the TrueVFS library:
// The same: works with small zip files, OutOfMemoryError with big zip files
TFile tFile = new TFile(file);
TFileInputStream tfis = new TFileInputStream(new TFile(tFile.getAbsolutePath(), tFile.list()[0]));
Reader reader = new InputStreamReader(tfis);
bufferedReader = new BufferedReader(reader, BUFFER);
And yes, I'm closing everything properly (remember, it works with gz!).
In this case I need to load a zip file containing only 1 plain text file (~4GB zipped, ~35GB unzipped).
I got an OutOfMemoryError on the first file, less than 1 minute after starting.
PS: This is not a duplicate of "Reading a huge Zip file in java - Out of Memory Error". He had the option to read each of the small files inside the zip, but I have only 1 big file.
I ran with -XX:+HeapDumpOnOutOfMemoryError and read the .hprof file with Memory Analyzer, but it doesn't help me much =/
Please, I need help.
Upvotes: 1
Views: 3002
Reputation: 5950
If you look at the stacktrace, you can see that BufferedReader.readLine() ultimately leads to the creation of a very large array, which is causing the OutOfMemoryError.
Since readLine() keeps reading the input until it reaches a line break, this indicates that there are no (or very few) line breaks in the zipped input file.
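If that is the cause, one possible workaround is to avoid readLine() and read fixed-size blocks of characters instead, so memory use stays bounded no matter how long a "line" is. The following is only a minimal sketch: the class ChunkedZipReader, the ChunkHandler callback and the zipOf helper are hypothetical names made up for illustration, and UTF-8 is an assumption (the code in the question relies on the platform default charset).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ChunkedZipReader {

    /** Hypothetical callback: receives each block of decoded characters. */
    public interface ChunkHandler {
        void handle(char[] buffer, int length) throws IOException;
    }

    /**
     * Reads the first entry of a zip stream in blocks of at most chunkSize
     * characters and returns the total number of characters read.
     * chunkSize must be greater than zero.
     */
    public static long readFirstEntryInChunks(InputStream in, int chunkSize,
                                              ChunkHandler handler) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            if (zip.getNextEntry() == null) {
                throw new IOException("zip archive has no entries");
            }
            // Assumption: the entry is UTF-8 text.
            Reader reader = new InputStreamReader(zip, StandardCharsets.UTF_8);
            char[] buffer = new char[chunkSize];
            long total = 0;
            int n;
            // read() never returns more than buffer.length characters,
            // so no array grows with the length of a line.
            while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
                handler.handle(buffer, n);
                total += n;
            }
            return total;
        }
    }

    /** Demo helper: builds a small single-entry zip in memory. */
    public static byte[] zipOf(String entryName, String content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

In a loader like the one in the question, the handler could write each block straight to the named pipe, for example by passing a FileInputStream over the zip file as the input and calling a Writer's write(buffer, 0, length) inside the handler. Whether that is acceptable depends on whether the DB side really needs line-oriented input.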
Upvotes: 2