Reputation: 31
I am getting a java outofmemory error. I added the necessary java arguments, but I still keep getting this error. I shared the libraries and functions that I use. The function It gives this error when converting a large csv file (about 15mb) into an xlsx file. It working properly on small files without any errors. How can i fix this error? Thanks.
I added these java args on Intellij Idea
Main
public class Main {
public static void main(String[] args) {
convert_CSV_to_XLSX(S.CSV_PATH,S.XLSX_PATH,"Sheet");
}
}
Convert CSV to XLSX
public void convert_CSV_to_XLSX(String inputFilePath, String outputFilePath, String sheetName) {
try {
ArrayList<ArrayList<Object>> csvObjectsAll = readCSV(inputFilePath);
writeXLSX_horizontally(outputFilePath, csvObjectsAll, sheetName);
} catch (Exception e) {
e.printStackTrace();
}
}
ReadCSV
public ArrayList<ArrayList<Object>> readCSV(String inputFilePath) {
ArrayList<ArrayList<Object>> gal = new ArrayList<>();
try {
String csvStr = new String(Files.readAllBytes(Paths.get(inputFilePath)), StandardCharsets.UTF_8);
for (String str : csvStr.split("\n")) {
ArrayList<Object> csvLinesSplit = new ArrayList<>();
String ss = str.replaceAll("\"", "");
if (ss.charAt(ss.length() - 1) == ',') {
ss += "$";
}
for (String s : ss.split(",")) {
if (s.equals("") || s.equals("$")) {
csvLinesSplit.add("");
} else {
csvLinesSplit.add(s);
}
}
gal.add(csvLinesSplit);
}
} catch (Exception e) {
}
return gal;
}
Write XLSX
public void writeXLSX_horizontally(String outputFileName, ArrayList<ArrayList<Object>> gdl, String sheetName) {
XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = workbook.createSheet(sheetName);
int rowNum = 0;
for (ArrayList<Object> objectArrList : gdl) {
Row row = sheet.createRow(rowNum++);
int cellNum = 0;
for (Object obj : objectArrList) {
Cell cell = row.createCell(cellNum++);
boolean is_double = false, is_integer = false;
try {
cell.setCellValue(Double.parseDouble(obj.toString()));
is_double = true;
} catch (Exception e) {
}
if (!is_double) {
try {
cell.setCellValue(Integer.parseInt(obj.toString()));
is_integer = true;
} catch (Exception e) {
}
}
if (!is_double && !is_integer) {
if (obj == null) {
cell.setCellValue(new String());
} else {
cell.setCellValue(obj.toString());
}
}
}
}
try {
FileOutputStream file = new FileOutputStream(outputFileName);
workbook.write(file);
file.close();
} catch (Exception e) {
e.printStackTrace();
}
}
Upvotes: 1
Views: 2563
Reputation: 31
I removed shared memory on java virtual machine: -Xms1024M -Xmx12288M
Thanks to the @Faraz and @Lesiak perm solution write large xlsx file here:
Read CSV
public ArrayList<ArrayList<Object>> readCSV(String inputFilePath) {
ArrayList<ArrayList<Object>> gal = new ArrayList<>();
try {
BufferedReader csvReader = new BufferedReader(new FileReader(inputFilePath));
String row;
int rowSize = 0;
ArrayList<String> columnList = new ArrayList<>();
while ((row = csvReader.readLine()) != null) {
ArrayList<Object> rowCells = new ArrayList<>();
if (rowSize == 0) {
if (row.charAt(row.length() - 1) == ',')
throw new Exception("CSV Format Error");
for (String columnName : row.split(",")) {
columnList.add(columnName);
}
}
int cellSize = 0;
for (String cell : row.split(",")) {
if (cell.equals("")) {
rowCells.add(null);
} else {
rowCells.add(cell);
}
cellSize++;
}
if (cellSize != columnList.size()) {
for (int i = 0; i < columnList.size() - cellSize; i++) {
rowCells.add(null);
}
}
gal.add(rowCells);
rowSize++;
}
} catch (Exception e) {
e.printStackTrace();
}
return gal;
}
Write XLSX
public void writeXLSX_horizontally(String outputFileName, ArrayList<ArrayList<Object>> gdl, String sheetName) {
SXSSFWorkbook workbook = new SXSSFWorkbook();
SXSSFSheet sheet = workbook.createSheet(sheetName);
int rowNum = 0;
for (ArrayList<Object> objectArrList : gdl) {
Row row = sheet.createRow(rowNum++);
int cellNum = 0;
for (Object obj : objectArrList) {
Cell cell = row.createCell(cellNum++);
boolean is_double = false, is_integer = false;
try {
cell.setCellValue(Double.parseDouble(obj.toString()));
is_double = true;
} catch (Exception e) { }
if (!is_double)
try {
cell.setCellValue(Integer.parseInt(obj.toString()));
is_integer = true;
} catch (Exception e) { }
if (!is_double && !is_integer)
if (obj == null)
cell.setCellValue(new String());
else
cell.setCellValue(obj.toString());
}
}
try {
FileOutputStream file = new FileOutputStream(outputFileName);
workbook.write(file);
file.close();
} catch (Exception e) {
e.printStackTrace();
}
}
Read large xlsx file solution here: How to read XLSX file of size >40MB
Other important library for read large xlsx file: https://github.com/monitorjbl/excel-streaming-reader
constraints: xlsx file line count must be between 0..1048575
Upvotes: 0
Reputation: 25936
As discussed in comments, the problem was due to incorrect IntelliJ run configuration.
VM arguments need to be passed to a separate field in IntelliJ, not as "Program arguments".
Still, the program can be improved:
Streaming version of XSSFWorkbook implementing the "BigGridDemo" strategy. This allows to write very large files without running out of memory as only a configurable portion of the rows are kept in memory at any one time.
Note that streaming both input and output is the best option. Having said that, 15MB input is tiny by todays standards, so I believe raising heap memory a bit is not a bad short-term solution
Upvotes: 1
Reputation: 6265
This line:
String csvStr = new String(Files.readAllBytes(Paths.get(inputFilePath)), StandardCharsets.UTF_8);
Issue:
You are loading the whole file into the memory by using Files.readAllBytes
. And the allocated memory for the jvm processor on which this program is running is not enough.
Possible Solution:
You may want to start reading the file using streams/buffers like BufferedReader. Or you can lookup other Readers that allow you to read the file in bits so the whole memory is not consumed all at once.
Further Modifications:
You will have to modify your program at the time of writing also where after you read bits of data, you process and and write to a file, and when the time comes to write to a file again, you append.
Upvotes: 2