Reputation: 676
I am working on a large CSV (~200 mb of text file) which I would like to convert into excel sheet but the workbook becomes so memory consuming that in the middle of the process, Java throws "GC Overhead limit exceeded"! I have checked the code if I am generating dummy references but I think none exists.
In my opinion those library calls from Apachi - POI might generate some references that keeps garbage collector so busy.
My question is if I could just write the workbook into a file chunk by chunk like text file something like appending to a text file without bringing it into memory. Is there any solution for that or am I missing something here?
GC throws the exception in the following code:
private void updateExcelWorkbook(String input, String fileName, Workbook workbook) {
try {
Sheet sheet = workbook.createSheet(fileName);
// Create a new font and alter it.
Font font = workbook.createFont();
font.setFontHeightInPoints((short) 11);
font.setBold(true);
// Fonts are set into a style so create a new one to use.
CellStyle style = workbook.createCellStyle();
style.setFont(font);
Row row;
Cell cell;
String[] columns;
String[] lines = input.split("\n");
int colIndex;
int rowIndex = 1;
for (String line : lines) {
row = sheet.createRow(rowIndex++);
columns = line.split("\t");
colIndex = 0;
for (String column: columns) {
cell = row.createCell(colIndex++);
if (rowIndex == 1)
cell.setCellStyle(style);
cell.setCellValue(column);
}
}
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
Upvotes: 2
Views: 2898
Reputation: 159175
Seems you are using the POI usermodel, which has a very high memory footprint, because it keeps the entire worksheet in memory, similar to how DOM keeps an entire XML document in memory.
You need to use a streaming API. Using POI, you can create .xlsx
files using the SXSSF Buffered Streaming API, as mentioned here: https://poi.apache.org/spreadsheet/index.html#SXSSF+(Since+POI+3.8+beta3)
The page linked above has this image, showing the Spreadsheet API Feature Summary of POI:
(source: apache.org)
Upvotes: 3