Jack Davidson
Jack Davidson

Reputation: 31

Java - OutOfMemoryError when writing large Excel file with Apache POI

I am getting a java outofmemory error. I added the necessary java arguments, but I still keep getting this error. I shared the libraries and functions that I use. The function It gives this error when converting a large csv file (about 15mb) into an xlsx file. It working properly on small files without any errors. How can i fix this error? Thanks.

I added these java args on Intellij Idea

I got error

I use this libraries

Main

public class Main {

    public static void main(String[] args) {

        convert_CSV_to_XLSX(S.CSV_PATH,S.XLSX_PATH,"Sheet");

    }

}

Convert CSV to XLSX

public void convert_CSV_to_XLSX(String inputFilePath, String outputFilePath, String sheetName) {
        try {
            ArrayList<ArrayList<Object>> csvObjectsAll = readCSV(inputFilePath);
            writeXLSX_horizontally(outputFilePath, csvObjectsAll, sheetName);
        } catch (Exception e) {
            e.printStackTrace();
        }
}

ReadCSV

public ArrayList<ArrayList<Object>> readCSV(String inputFilePath) {
        ArrayList<ArrayList<Object>> gal = new ArrayList<>();
        try {
            String csvStr = new String(Files.readAllBytes(Paths.get(inputFilePath)), StandardCharsets.UTF_8);
            for (String str : csvStr.split("\n")) {
                ArrayList<Object> csvLinesSplit = new ArrayList<>();
                String ss = str.replaceAll("\"", "");
                if (ss.charAt(ss.length() - 1) == ',') {
                    ss += "$";
                }
                for (String s : ss.split(",")) {
                    if (s.equals("") || s.equals("$")) {
                        csvLinesSplit.add("");
                    } else {
                        csvLinesSplit.add(s);
                    }
                }
                gal.add(csvLinesSplit);
            }
        } catch (Exception e) {

        }
        return gal;
}

Write XLSX

public void writeXLSX_horizontally(String outputFileName, ArrayList<ArrayList<Object>> gdl, String sheetName) {

        XSSFWorkbook workbook = new XSSFWorkbook();
        XSSFSheet sheet = workbook.createSheet(sheetName);

        int rowNum = 0;
        for (ArrayList<Object> objectArrList : gdl) {
            Row row = sheet.createRow(rowNum++);
            int cellNum = 0;
            for (Object obj : objectArrList) {
                Cell cell = row.createCell(cellNum++);
                boolean is_double = false, is_integer = false;
                try {
                    cell.setCellValue(Double.parseDouble(obj.toString()));
                    is_double = true;
                } catch (Exception e) {
                }
                if (!is_double) {
                    try {
                        cell.setCellValue(Integer.parseInt(obj.toString()));
                        is_integer = true;
                    } catch (Exception e) {

                    }
                }
                if (!is_double && !is_integer) {
                    if (obj == null) {
                        cell.setCellValue(new String());
                    } else {
                        cell.setCellValue(obj.toString());
                    }
                }
            }
        }
        try {
            FileOutputStream file = new FileOutputStream(outputFileName);
            workbook.write(file);
            file.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
}

Upvotes: 1

Views: 2563

Answers (3)

Jack Davidson
Jack Davidson

Reputation: 31

I removed shared memory on java virtual machine: -Xms1024M -Xmx12288M

Thanks to the @Faraz and @Lesiak perm solution write large xlsx file here:

Read CSV

public ArrayList<ArrayList<Object>> readCSV(String inputFilePath) {
        ArrayList<ArrayList<Object>> gal = new ArrayList<>();
        try {
            BufferedReader csvReader = new BufferedReader(new FileReader(inputFilePath));
            String row;
            int rowSize = 0;
            ArrayList<String> columnList = new ArrayList<>();
            while ((row = csvReader.readLine()) != null) {
                ArrayList<Object> rowCells = new ArrayList<>();
                if (rowSize == 0) {
                    if (row.charAt(row.length() - 1) == ',')
                        throw new Exception("CSV Format Error");
                    for (String columnName : row.split(",")) {
                        columnList.add(columnName);
                    }
                }
                int cellSize = 0;
                for (String cell : row.split(",")) {
                    if (cell.equals("")) {
                        rowCells.add(null);
                    } else {
                        rowCells.add(cell);
                    }
                    cellSize++;
                }
                if (cellSize != columnList.size()) {
                    for (int i = 0; i < columnList.size() - cellSize; i++) {
                        rowCells.add(null);
                    }
                }
                gal.add(rowCells);
                rowSize++;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return gal;
}

Write XLSX

public void writeXLSX_horizontally(String outputFileName, ArrayList<ArrayList<Object>> gdl, String sheetName) {
        SXSSFWorkbook workbook = new SXSSFWorkbook();
        SXSSFSheet sheet = workbook.createSheet(sheetName);
        int rowNum = 0;
        for (ArrayList<Object> objectArrList : gdl) {
            Row row = sheet.createRow(rowNum++);
            int cellNum = 0;
            for (Object obj : objectArrList) {
                Cell cell = row.createCell(cellNum++);
                boolean is_double = false, is_integer = false;
                try {
                    cell.setCellValue(Double.parseDouble(obj.toString()));
                    is_double = true;
                } catch (Exception e) { }
                if (!is_double)
                    try {
                        cell.setCellValue(Integer.parseInt(obj.toString()));
                        is_integer = true;
                    } catch (Exception e) { }
                if (!is_double && !is_integer)
                    if (obj == null)
                        cell.setCellValue(new String());
                    else
                        cell.setCellValue(obj.toString());
            }
        }
        try {
            FileOutputStream file = new FileOutputStream(outputFileName);
            workbook.write(file);
            file.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
}

Read large xlsx file solution here: How to read XLSX file of size >40MB

Other important library for read large xlsx file: https://github.com/monitorjbl/excel-streaming-reader

constraints: xlsx file line count must be between 0..1048575

Upvotes: 0

Lesiak
Lesiak

Reputation: 25936

As discussed in comments, the problem was due to incorrect IntelliJ run configuration.

VM arguments need to be passed to a separate field in IntelliJ, not as "Program arguments".

Still, the program can be improved:

  • process input file line by line
  • use SXSSFWorkbook to write output

Streaming version of XSSFWorkbook implementing the "BigGridDemo" strategy. This allows to write very large files without running out of memory as only a configurable portion of the rows are kept in memory at any one time.

  • use "" instead of new String()
  • not memory related: get generics right (you have strings in parsed CSV, not arbitrary objects)

Note that streaming both input and output is the best option. Having said that, 15MB input is tiny by todays standards, so I believe raising heap memory a bit is not a bad short-term solution

Upvotes: 1

Faraz
Faraz

Reputation: 6265

This line:

 String csvStr = new String(Files.readAllBytes(Paths.get(inputFilePath)), StandardCharsets.UTF_8);

Issue:

You are loading the whole file into the memory by using Files.readAllBytes. And the allocated memory for the jvm processor on which this program is running is not enough.

Possible Solution:

You may want to start reading the file using streams/buffers like BufferedReader. Or you can lookup other Readers that allow you to read the file in bits so the whole memory is not consumed all at once.

Further Modifications:

You will have to modify your program at the time of writing also where after you read bits of data, you process and and write to a file, and when the time comes to write to a file again, you append.

Upvotes: 2

Related Questions