Reputation: 1585
I am getting an OutOfMemoryError in my mapper class. I am reading a big zip file using ZipFileInputFormat, which unzips it, and with ZipFileRecordReader I convert it into a key (the file name) and a value (the file's content). I then have to split the content on my delimiter and insert it into an HBase table. The zip file is very large and is not splittable. My code works for smaller zip files, but when I run it on the huge file it throws this error. This is where the problem occurs:
// Read the file contents
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] temp = new byte[8192];
while ( true )
{
    int bytesRead = 0;
    try
    {
        bytesRead = zip.read( temp, 0, 8192 );
    }
    catch ( EOFException e )
    {
        if ( ZipFileInputFormat.getLenient() == false )
            throw e;
        return false;
    }
    if ( bytesRead > 0 )
        bos.write( temp, 0, bytesRead );
    else
        break;
}
I tried increasing 8192 to a much larger number, but I get the same error.
This is how I run my MapReduce job:
hadoop jar bulkupload-1.0-jar-with-dependencies.jar -Dmapreduce.map.memory.mb=8192 -Dmapreduce.map.java.opts=-Xmx7372m FinancialLineItem FinancialLineItem sudarshan/output39
In my mapper code I iterate over the content of the file, split it, and insert it into HBase.
NOTE: the file size is very large.
Upvotes: 1
Views: 14609
Reputation: 509
Is your file stored in HDFS? If not, you can put it in HDFS and then run a job that simply loads it and stores the contents to some other location. Then you can run your job against this new location, and the old zipped location can be discarded. The file size you are specifying is, I guess, that of the zipped file, which after unzipping will be much larger.
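A minimal sketch of that pre-processing step, assuming the zip already sits in HDFS and holds a single entry; the class name, paths, and argument handling are placeholders, not part of the original job:

import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical one-off tool: stream-unzips an archive in HDFS to a plain
// file in HDFS without holding the uncompressed data in memory.
public class UnzipToHdfs
{
    public static void main( String[] args ) throws Exception
    {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get( conf );

        Path zipPath = new Path( args[0] ); // the zipped input
        Path outPath = new Path( args[1] ); // the unzipped copy

        try ( ZipInputStream zip = new ZipInputStream( fs.open( zipPath ) );
              FSDataOutputStream out = fs.create( outPath ) )
        {
            ZipEntry entry = zip.getNextEntry(); // assumes a single entry
            if ( entry == null )
                throw new IllegalStateException( "empty zip: " + zipPath );
            byte[] temp = new byte[8192];
            int bytesRead;
            while ( ( bytesRead = zip.read( temp, 0, 8192 ) ) > 0 )
                out.write( temp, 0, bytesRead ); // fixed-size buffer only
        }
    }
}

Afterwards you can point the MapReduce job at the unzipped copy, e.g. with a standard TextInputFormat.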
Upvotes: 0
Reputation: 26882
Well, you seem to be reading a large file into memory. You would expect that to cause an OOME. You need to stop holding the whole file in memory at once.
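For instance, a sketch of the streaming alternative, assuming the entry's content is line-oriented; handleRecord is a hypothetical stand-in for whatever per-record work the mapper does (splitting on the delimiter, writing to HBase):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamingZipReader
{
    // Consume the decompressed stream record by record instead of buffering
    // the whole entry; memory use stays bounded by one record.
    public static void process( InputStream zipEntryStream ) throws IOException
    {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader( zipEntryStream, StandardCharsets.UTF_8 ) );
        String line;
        while ( ( line = reader.readLine() ) != null )
            handleRecord( line );
    }

    private static void handleRecord( String record )
    {
        // placeholder: split on your delimiter and write to HBase here
    }
}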
Upvotes: 1
Reputation: 11
It simply means that the JVM ran out of memory. When this occurs, you basically have two choices:
--> Allow the JVM to use more memory via the -Xmx VM argument; for instance, -Xmx1024m allows the JVM to use 1 GB (1024 MB) of memory (a quick way to verify the limit is sketched below).
--> Improve/fix the application so that it uses less memory.
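If you raise -Xmx, it can be worth confirming which heap ceiling the task JVM actually received. A tiny check; in a real job you would log this from the mapper's setup() method instead of main:

public class HeapCheck
{
    public static void main( String[] args )
    {
        // Log the heap ceiling this JVM actually received.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / ( 1024 * 1024 );
        System.err.println( "Task JVM max heap: " + maxHeapMb + " MB" );
    }
}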
Upvotes: 1
Reputation: 18825
Judging from the error, I believe it's not about the size of the zip file itself, but about the fact that the uncompressed data is stored in memory: everything is written into a ByteArrayOutputStream, which has to maintain (and keep growing) a byte array of the contents, and at some point it runs out of memory.
I'm not familiar with the purpose of the code, but I'd guess the best solution would be to write the data to a temporary file, perhaps memory-map it, and then do the operations on it.
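A sketch of that approach, assuming a stream positioned at the zip entry; SpillToTempFile and spillAndMap are hypothetical names, and note that FileChannel.map() is limited to 2 GB per mapping, so a truly huge entry would need several mappings or plain streaming over the temp file:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class SpillToTempFile
{
    // Spill the decompressed entry to local disk, then memory-map it so later
    // passes over the data need no giant on-heap byte array.
    public static MappedByteBuffer spillAndMap( InputStream zip ) throws IOException
    {
        File tmp = File.createTempFile( "zip-entry-", ".bin" );
        tmp.deleteOnExit();

        try ( FileOutputStream out = new FileOutputStream( tmp ) )
        {
            byte[] temp = new byte[8192];
            int bytesRead;
            while ( ( bytesRead = zip.read( temp, 0, 8192 ) ) > 0 )
                out.write( temp, 0, bytesRead ); // constant heap usage
        }

        // The mapping stays valid after the channel is closed; the OS pages
        // the file in on demand instead of the JVM heap holding it all.
        try ( RandomAccessFile raf = new RandomAccessFile( tmp, "r" );
              FileChannel channel = raf.getChannel() )
        {
            return channel.map( FileChannel.MapMode.READ_ONLY, 0, channel.size() );
        }
    }
}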
Upvotes: 0