Lizzie
Lizzie

Reputation: 351

Large Excel Sheet Generation Optimization

I am trying to generate a .xls file containing almost 13000 lines and 3 columns using the POI library. But it is taking almost 8-10 minutes to generate that complete file. Can anyone suggest how can I reduce the execution time?

public static void generateReconReport(Connection con,String neName,String reportTable) throws SQLException, IOException{
    Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY);

    String sql = "SELECT * FROM "+reportTable;
    ResultSet rsNERecon = stmt.executeQuery(sql);

    System.out.println("Resulset obtained, Generating Report");

    Date date = new Date();
    SimpleDateFormat sid = new SimpleDateFormat("MMddyyyy");
    String curDate = sid.format(date);

    String fileName = neName.toUpperCase()+"_" + curDate + ".xlsx";

    File ob_file = new File(fileName);
    if(!ob_file.exists())
        ob_file.createNewFile();

    HSSFWorkbook hsfWb = new HSSFWorkbook();

    HSSFSheet hsfSheet =  hsfWb.createSheet(neName.toUpperCase()+" Recon Report");

    HSSFRow hsfRow = hsfSheet.createRow(0);

    ResultSetMetaData metaRs = rsNERecon.getMetaData();
    int colCount = metaRs.getColumnCount();

    for (int j = 1; j <= colCount; j++) {
        String colName = metaRs.getColumnName(j);
        hsfRow.createCell(j).setCellValue(colName);
    }
    FileOutputStream fileOut =  new FileOutputStream(fileName);
    int rowNum = 1;
    while (rsNERecon.next()) {
        hsfRow = hsfSheet.createRow(rowNum);
        for (int j = 1; j <= colCount; j++) 
            hsfRow.createCell(j).setCellValue(rsNERecon.getString(j));
        rowNum++;
    }

    for (int j = 1; j <= colCount; j++) 
        hsfSheet.autoSizeColumn(j);

    rsNERecon.close();
    stmt.close();

    hsfWb.write(fileOut);
    fileOut.close();
    System.out.println("Report generated for "+neName.toUpperCase());
}

Upvotes: 3

Views: 1809

Answers (2)

Jorge_B
Jorge_B

Reputation: 9872

In my experience I have found POI very useful for reading Excel files, but quite time and memory consuming when it comes to generate them. I can offer you 3 options:

1) By the classes you use (HSSF*), I think you may be using an older version of POI. I would suggest you to switch to the latest version and see if there comes any improvement. With little code changes you can even get to a version of the code that generates both .xls and .xlsx formats, so you can profile both options (assuming you can generate any other format than .xls) -- EDITED: I have had recently the chance to test SXSSF performance and it has a great performance, both in terms of memory and speed

2) As someone else suggested, generating a .csv would be much faster, but you would lose any formatting (and that assuming you can generate a format other than .xls)

3) If you can use a format different from .xls, I suggest you have a look at the Excel XML Spreadsheet format. We have been using it for years, the downfall being that we had to work our own implementation of the format. However, if you are willing to invest a day or two implementing it, your implementation should be much more efficient than POI in time and memory use.

http://msdn.microsoft.com/en-us/library/office/aa140066%28v=office.10%29.aspx

Upvotes: 1

Bathsheba
Bathsheba

Reputation: 234635

Generating a character separated value file (csv) using either a tab or a comma as the delimiter will be much faster. Save the file with a .csv extension.

Excel is very quick at reading these files.

Upvotes: 2

Related Questions