Reputation: 341
I'm using Apache POI to create and save records into Workbook. I have almost 5000+ new records to be written and saved into the workbook. But at the time of writing the fileOutputStream into the workbook, the execution basically halts and slowed down.
What I mean to say is, at the time of executing this line:
workbook.write(fileOutputStream);
it almost stops to process 5000+ records. I validated that it's taking nearly 1 hour (!) to write in the workbook.
How can I improve the performance and overcome this drawback?? Please suggest...
** Note: The rest of the codes are normal Apache POI related codes and they are executing fine, no issue, hence I didnot mention all of them. Only I got stuck at the above line.
I found one discussion here: FileOutputStream (Apachhe POI) taking too long time to save
but, it did not help me. I need to save the whole file.
Upvotes: 5
Views: 12270
Reputation: 66
If you are using merged cells, this answer might be helpful.
I once had 3000+ records and it took 10 minutes to generate the output xlsx.
After using a Java profiler, I found that
org.apache.poi.xssf.usermodel.XSSFSheet#getMergedRegion
took most of the time.
Based on my data set, I found this method grows in O(n^2) (n is the count of records), which explains why it works for small records set(less than 1K) but takes a lot of time for large records set.
I checked the template and output, it had a lot of merged cells generated by jx:each
:
Excel headers
| A | B | C |
| headers |
`jx:each` cells
| a | b | <- merged
| a | b |
...
| footers |
So I unmerged the cells in jx:each
template, and it takes less than 1 second now.
Upvotes: 1
Reputation: 341
One more solution I understand, like, while iterating over the Row and creating cells, DO NOT keep declaring CellStyle
and sheet.autoSizeColumn(colNumber)
inside the loop, rather declare these 2 only once at the outside of the loop and set the values and style only inside the loop, i.e, cell.setCellStyle
and cell.setCellValue
.
Declaring the above 2 everytime while iterating, basically degrades the performance of the POI radically.
Upvotes: 8
Reputation: 61852
Let's have a concrete example we can talk about:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import java.util.GregorianCalendar;
class CreateExcel100000Rows {
public static void main(String[] args) throws Exception {
System.out.println("whole program starts " + java.time.LocalDateTime.now());
try (
//Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
) {
int rows = 100000;
if (workbook instanceof HSSFWorkbook) rows = 65536;
Object[][] data = new Object[rows][4];
data[0] = new Object[] {"Value", "Date", "Formatted value", "Formula"};
for (int i = 1; i < rows; i++) {
data[i] = new Object[] {1.23456789*i, new GregorianCalendar(2000, 0, i), 1.23456789*i, "ROUND(A" + (i+1) + ",2)"};
}
DataFormat dataFormat = workbook.createDataFormat();
CellStyle dateStyle = workbook.createCellStyle();
dateStyle.setDataFormat(dataFormat.getFormat("DDDD, MMMM, DD, YYYY"));
CellStyle numberStyle = workbook.createCellStyle();
numberStyle.setDataFormat(dataFormat.getFormat("#,##0.00 \" Coins\""));
Sheet sheet = workbook.createSheet();
sheet.setColumnWidth(0, 12*256);
sheet.setColumnWidth(1, 35*256);
sheet.setColumnWidth(2, 17*256);
sheet.setColumnWidth(3, 10*256);
for (int r = 0; r < data.length; r++) {
Row row = sheet.createRow(r);
for (int c = 0; c < data[0].length; c++) {
Cell cell = row.createCell(c);
if (r == 0) cell.setCellValue((String)data[r][c]);
if (r > 0 && c == 0) {
cell.setCellValue((Double)data[r][c]);
} else if (r > 0 && c == 1) {
cell.setCellValue((GregorianCalendar)data[r][c]);
cell.setCellStyle(dateStyle);
} else if (r > 0 && c == 2) {
cell.setCellValue((Double)data[r][c]);
cell.setCellStyle(numberStyle);
} else if (r > 0 && c == 3) {
cell.setCellFormula((String)data[r][c]);
}
}
}
System.out.println("write starts " + java.time.LocalDateTime.now());
workbook.write(fileout);
System.out.println("write ends " + java.time.LocalDateTime.now());
if (workbook instanceof SXSSFWorkbook) ((SXSSFWorkbook)workbook).dispose();
}
System.out.println("whole program ends " + java.time.LocalDateTime.now());
}
}
This code creates a HSSFWorkbook
having the first sheet filled from row 1 to row 65,536 having different kind of cell values in columns A:D
.
Using java -Xms256M -Xmx512M
, that is heap space from 256 to 512 MByte, this takes 2 seconds in whole. HSSFWorkbook.write
takes less than a second.
If you do
...
try (
Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
) {
...
This code creates a XSSFWorkbook
having the first sheet filled from row 1 to row 100,000 having different kind of cell values in columns A:D
.
Using java -Xms256M -Xmx512M
, that is heap space from 256 to 512 MByte, this takes 7 seconds in whole. XSSFWorkbook.write
takes 2 seconds. This can be improved by giving more available heap space.
If you do
...
try (
//Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
) {
...
This code creates a SXSSFWorkbook
having the first sheet filled from row 1 to row 100,000 having different kind of cell values in columns A:D
.
Using java -Xms256M -Xmx512M
, that is heap space from 256 to 512 MByte, this takes 2 seconds in whole. SXSSFWorkbook.write
takes less than a second.
Note: Using SXSSFWorkbook
, ((SXSSFWorkbook)workbook).dispose()
is necessary to get rid of the used temporary files.
Upvotes: 4