Pawan

Reputation: 311

Loading a large xlsx file from an InputStream throws OutOfMemoryError (Apache POI)

I want to read a large Excel file (.xlsx/.xls). When I upload a 20 MB file, the Java heap suddenly grows by 2 GB and the application runs into an OutOfMemoryError.

private Sheet getSheetForFileType(String filType, InputStream fileData) throws IOException {
    Workbook workbook;
    Sheet sheet;
    if (filType.equalsIgnoreCase("xls")) {
        workbook = new HSSFWorkbook(fileData); //OutOfMemoryError
        sheet = workbook.getSheetAt(0);
    } else {
        workbook = new XSSFWorkbook(fileData); //OutOfMemoryError
        sheet = workbook.getSheetAt(0);
    }
    return sheet;
}

As mentioned in the Apache POI overview, I tried XSSF with SAX (the Event API). The modified code is below:

private Sheet getSheetForFileType(String filType, InputStream fileData) throws IOException {

    if (filType.equalsIgnoreCase("xls")) {
        ....
    } else {
        OPCPackage opcPackage = OPCPackage.open(fileData);  //OutOfMemoryError
        XSSFReader xssfReader = new XSSFReader(opcPackage);
        SharedStringsTable sharedStringsTable = xssfReader.getSharedStringsTable();
        XMLReader parser = getSheetParser(sharedStringsTable);
        ....
        ....
    }
    return sheet;
}

Yet I am still unable to load the file and read it.
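For reference, the event model behind the XSSF and SAX approach can be sketched with the JDK's own SAX parser. This is only an illustration under assumptions: the class and method names (SheetSaxSketch, CellValueHandler, readCellValues) are hypothetical, the input is a simplified stand-in for a sheet XML part, and POI's XSSFReader is not used here.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: streams over sheet-like XML and collects cell values
// without building an in-memory model of the whole sheet.
class SheetSaxSketch {

    /** Collects the text of every <v> (cell value) element as it is encountered. */
    static class CellValueHandler extends DefaultHandler {
        final List<String> values = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inValue;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("v".equals(qName)) {
                inValue = true;
                text.setLength(0); // start a fresh buffer for this cell
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // characters() may be called several times per element, so append
            if (inValue) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("v".equals(qName)) {
                values.add(text.toString());
                inValue = false;
            }
        }
    }

    // In a real sheet, cells marked t="s" hold an index into the shared strings
    // table rather than the text itself; this sketch returns the raw <v> content.
    static List<String> readCellValues(InputStream sheetXml) throws Exception {
        CellValueHandler handler = new CellValueHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        return handler.values;
    }
}
```

The handler only ever holds the current cell's text plus whatever it chooses to collect, which is why the event approach is suggested for large sheets. A real reader would process each value as it arrives instead of accumulating a list.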

I read the file data from an InputStream. The purpose is ONLY to read data; there are no write operations on it.

I went through other posts, and what I understand is that reading from a File uses less memory, while an InputStream requires more memory because it has to buffer the whole file.
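One way to act on the File versus InputStream point is to spool the upload to a temporary file first and then open that file. The following is a minimal sketch using only the JDK; the class and method names (UploadSpooler, spoolToTempFile) are hypothetical. The resulting File could then be handed to a file-based entry point such as POI's OPCPackage.open(File), which is not called here.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: moves the uploaded bytes to disk so they do not have
// to be buffered on the heap in full.
class UploadSpooler {

    // Files.copy transfers the stream in small chunks, so heap usage stays
    // roughly constant regardless of the upload size.
    static File spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".xlsx");
        // createTempFile already created an empty file, so allow replacing it
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp.toFile();
    }
}
```

The caller is responsible for deleting the temporary file once the workbook has been read and closed.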


Update 1: Added a picture of a sample Excel sheet.

[Image: sample Excel sheet]

Upvotes: 1

Views: 2233

Answers (1)

kels

Reputation: 146

Try the streaming SXSSFWorkbook class instead of XSSFWorkbook (which keeps the entire Excel workbook in memory), as below:

SXSSFWorkbook workbook = new SXSSFWorkbook(100);

where 100 is the row access window size, i.e. the number of rows kept in memory at a time (100 is also the default). Older rows are flushed to a temporary file on disk.

Upvotes: 0
