Reputation: 47
This code reads small Excel files, but it fails with an OutOfMemoryError on large ones. How can I modify it to read large files?
import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
/**
*
* @author Administrator
*/
public class ReadExcelNdArray {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        File myFile = new File("D://Raghulpr/Transaction Data.xlsx");
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream(myFile)) {
            // Finds the workbook instance for XLSX file
            XSSFWorkbook myWorkBook = new XSSFWorkbook(fis);
            // Return first sheet from the XLSX workbook
            XSSFSheet mySheet = myWorkBook.getSheetAt(0);
            // Get iterator to all the rows in current sheet
            Iterator<Row> rowIterator = mySheet.iterator();
            // Traversing over each row of XLSX file
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                // For each row, iterate through each column
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case Cell.CELL_TYPE_BOOLEAN:
                            System.out.print(cell.getBooleanCellValue() + "\t");
                            break;
                        default:
                            // Other cell types (blank, formula, error) are skipped
                    }
                }
                System.out.println("");
            }
        }
        // Report the elapsed time only after the whole file has been read
        System.out.println("Time taken: " + (System.currentTimeMillis() - start) + " ms");
    }
}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.io.ByteArrayOutputStream.<init>(ByteArrayOutputStream.java:77)
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.<init>(ZipInputStreamZipEntrySource.java:121)
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:55)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:88)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:272)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:37)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:254)
at readexcelndarray.ReadExcelNdArray.main(ReadExcelNdArray.java:36)
Upvotes: 0
Views: 18917
Reputation: 157
I don't know if you still need an answer to this, but I was looking for the same thing and struggled to read a large file. After a lot of searching I found one solution: the Excel Streaming Reader library.
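import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.ss.usermodel.Workbook;
import com.monitorjbl.xlsx.StreamingReader;

InputStream is = new FileInputStream(new File("G:\\Book1.xlsx"));
Workbook workbook = StreamingReader.builder()
        .rowCacheSize(100)   // number of rows to keep in memory at a time
        .bufferSize(4096)    // buffer size (in bytes) used when reading the InputStream
        .open(is);           // returns a Workbook that streams rows from the file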
You can now use workbook to process your file further.
I was able to process an .xlsx file with more than 400,000 (4 lakh) records this way.
Upvotes: 4
Reputation: 4506
I've had the same problem. If you switch to the much lower-level SAX (event-based) parsing instead, you will save a lot of memory: http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api
I think I brought memory usage down from about 4.5 GB(!) for an 11 MB file with a lot of formulas to something far more manageable. I don't remember the exact figure, but it was lower by at least a factor of 10, low enough that it no longer mattered.
It is harder to implement, but worth the time if you need to reduce the memory footprint.
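To illustrate why event-based parsing uses so little memory, here is a minimal sketch that uses only the JDK's SAX parser on a hand-written fragment of worksheet XML. It is not the POI code from the linked how-to: the class name SheetSaxSketch and its readRows helper are made up for this example, and it prints the raw content of each <v> element, whereas in a real .xlsx file string cells hold an index into the shared strings table that the POI event API resolves for you.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class SheetSaxSketch {

    /** Builds one tab-separated line per <row> and hands it to the consumer. */
    static class RowHandler extends DefaultHandler {
        private final Consumer<String> rowConsumer;
        private final StringBuilder currentRow = new StringBuilder();
        private final StringBuilder text = new StringBuilder();
        private int cellsInRow;
        private boolean inValue;

        RowHandler(Consumer<String> rowConsumer) {
            this.rowConsumer = rowConsumer;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (qName.equals("row")) {
                // A new row starts: forget the previous one.
                currentRow.setLength(0);
                cellsInRow = 0;
            } else if (qName.equals("v")) {
                inValue = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so append.
            if (inValue) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("v")) {
                inValue = false;
                if (cellsInRow > 0) {
                    currentRow.append('\t');
                }
                currentRow.append(text);
                cellsInRow++;
            } else if (qName.equals("row")) {
                // Only the current row is ever held in memory.
                rowConsumer.accept(currentRow.toString());
            }
        }
    }

    /** Streams the sheet XML and calls rowConsumer once per row. */
    static void readRows(InputStream sheetXml, Consumer<String> rowConsumer) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(sheetXml, new RowHandler(rowConsumer));
        } catch (Exception e) {
            throw new RuntimeException("Failed to parse sheet XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample data, shaped like the sheet XML inside an .xlsx file.
        String xml = "<worksheet><sheetData>"
                + "<row r=\"1\"><c r=\"A1\"><v>1.5</v></c><c r=\"B1\"><v>2</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\"><v>42</v></c></row>"
                + "</sheetData></worksheet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        readRows(in, System.out::println);
    }
}
```

Because each row is passed to the consumer and then discarded, memory use stays roughly constant regardless of how many rows the sheet has.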
Upvotes: 0
Reputation: 49
You need to increase the JVM heap size in order to read large files. I suggest using a 64-bit machine, since a 64-bit JVM can use a much larger heap.
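As a quick way to confirm that a heap setting actually took effect, a small sketch like the following prints the maximum heap the JVM will use. The class name HeapCheck is made up for this example, and the 2 GB value in the comment is only an illustration, not a recommendation for this particular file.

```java
// Hypothetical helper class, for illustration only.
public class HeapCheck {

    /** Returns the maximum heap size the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with, for example:  java -Xmx2g HeapCheck
        // (2g is an arbitrary example value)
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it with and without the -Xmx option shows whether the option is being picked up by the JVM that runs your program.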
Upvotes: 0
Reputation: 6611
First, you need to close all input/output stream objects, such as FileInputStream, in your code. Second, you can also increase your JVM heap space, as described in this link: Increase heap size in Java
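For the first point, a try-with-resources block is probably the simplest way to make sure a stream is always closed. The sketch below uses only the JDK; the class CloseStreamsSketch and its countBytes method are made up for this example. Note that closing streams releases file handles and buffers, but it does not reduce the memory the workbook itself needs once loaded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
public class CloseStreamsSketch {

    /** Counts the bytes in a file; the stream is closed when the try block exits. */
    static long countBytes(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[4096];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a temporary file so the example is self-contained.
        Path tmp = Files.createTempFile("sketch", ".bin");
        Files.write(tmp, new byte[10_000]);
        System.out.println("Bytes read: " + countBytes(tmp));
        Files.delete(tmp);
    }
}
```

The same pattern works for FileInputStream or any other class that implements AutoCloseable.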
Upvotes: 2
Reputation: 3872
There is also the jxl API for reading and writing Excel files. It is quite flexible, but its limitation is that you can read and write at most 65,536 rows (row indices 0 through 65535), which is the limit of the older .xls format.
Since your file has more rows than that, I would suggest you prefer Apache POI. For practical purposes that API imposes no such limit, other than the .xlsx format's own maximum of 1,048,576 rows per sheet.
Upvotes: 0