TOMAS DEL CASTILLO

Reputation: 177

Is there any way to read both .xls and .xlsx files using Apache POI?

I need to create a method that can read both .xls and .xlsx files. According to my research, HSSF is used to read .xls and XSSF to read .xlsx. Is there a part of Apache POI I can use to read both kinds of file? I also came across ss.usermodel but could not find enough example code that handles both .xls and .xlsx.

Upvotes: 16

Views: 32524

Answers (7)

tanle

Reputation: 61

You can use

Workbook wb = WorkbookFactory.create(inputStream);

Upvotes: 3

Prashant Gautam

Reputation: 609

You can read both formats using the poi-ooxml and poi-ooxml-schemas jars provided by Apache.

Use the code below:

// WorkbookFactory inspects the stream and returns an HSSF or XSSF workbook
InputStream excelFileToRead = new FileInputStream(fileName);
Workbook wb = WorkbookFactory.create(excelFileToRead);
Sheet sheet = wb.getSheet(sheetName);

The above code will read both .xls and .xlsx files.

Upvotes: 13

Amit

Reputation: 545

Thanks to tom's answer. Just to add: use the following code to obtain the input stream, otherwise you may hit Exception in thread "main" java.io.IOException: mark/reset not supported

     InputStream inputStream = new FileInputStream(new File("C:\\myFile.xls"));

     // Wrap the stream only when it cannot rewind on its own
     if (!inputStream.markSupported()) {
         inputStream = new PushbackInputStream(inputStream, 8);
     }
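As a small stdlib-only sketch of the same idea, the check can be pulled into a helper. Note that this swaps in BufferedInputStream (which does support mark/reset) instead of the PushbackInputStream used above; the class and method names are hypothetical, and whether your POI version prefers one wrapper over the other is something to confirm against its Javadoc.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

// Hypothetical helper, not part of Apache POI.
final class MarkableStreams {
    private MarkableStreams() {}

    // Returns the stream unchanged if it can already mark/reset,
    // otherwise wraps it in a BufferedInputStream, which can.
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }
}
```

After wrapping, keep using the returned stream rather than the original one, since any bytes already buffered are only visible through the wrapper.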

Upvotes: 3

tom

Reputation: 2714

Yes, there's a new set of interfaces provided by POI that work with both types.

Use the WorkbookFactory.create() method to get a Workbook: http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/WorkbookFactory.html

You can check for Excel files without relying on file extensions (which are unreliable: many CSV files have an .xls extension, for example, but cannot be parsed by POI) using the following:

// Simple way to check for both types of Excel files.
// Both checks peek at the header bytes, so pass a stream that can be rewound.
public boolean isExcel(InputStream i) throws IOException {
    return (POIFSFileSystem.hasPOIFSHeader(i) || POIXMLDocument.hasOOXMLHeader(i));
}
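To make it clearer what a header check of this kind looks at, here is a minimal stdlib-only sketch that inspects the leading "magic" bytes itself. It is not POI's implementation: the class, enum, and method names are hypothetical, and it only assumes the well-known OLE2 compound-file signature (used by .xls) and the ZIP local-header signature (used by .xlsx). A ZIP match does not prove the file is a workbook, only that it is a ZIP container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI.
final class ExcelSniffer {
    enum Format { OLE2, OOXML, UNKNOWN }

    // First 8 bytes of an OLE2 compound file (the .xls container).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // First 4 bytes of a ZIP local file header (the .xlsx container).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    private ExcelSniffer() {}

    // Classifies a buffer holding the first bytes of a file.
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Format.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Format.OOXML; // any ZIP, not only workbooks
        return Format.UNKNOWN;
    }

    // Peeks at up to 8 bytes and rewinds, so the stream can still be parsed afterwards.
    static Format detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] header = new byte[8];
        int n = 0;
        while (n < header.length) {
            int r = in.read(header, n, header.length - n);
            if (r < 0) break; // stream ended before 8 bytes
            n += r;
        }
        in.reset();
        return detect(Arrays.copyOf(header, n));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A CSV file renamed to .xls would come back as UNKNOWN here, which is the case the extension check misses.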

Upvotes: 20

Anantha Sharma

Reputation: 10098

It appears you are looking for a way to abstract the read process: you are saying it shouldn't matter whether the file is XLS or XLSX, and you want your code to work without modification.

I'd recommend looking at Apache Tika. It's an awesome library that abstracts file reading and content parsing; it uses POI and many other libraries and provides a nice abstraction over all of them.

Reading a PDF/XLS/XLSX is similar to reading a text file; all the work is done behind the scenes.

Read this for more: http://www.searchworkings.org/blog/-/blogs/introduction-to-apache-tika

Upvotes: 0

Balaji Krishnan

Reputation: 1017

One option would be to check the file name with lastIndexOf for "." and see whether the extension is .xls or .xlsx, then use an if condition to switch accordingly. It has been a long time since I worked with POI, but I think the classes are HSSF for .xls and XSSF for .xlsx. Refer to the http://poi.apache.org/ site, last line under the topic "Why should I use Apache POI?".
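A minimal sketch of that extension check, using only the standard library, might look like the following. The class, enum, and method names are hypothetical; the comments mark where the HSSF or XSSF branch would go. As noted elsewhere in this thread, extensions can be wrong, so treat the result as a hint rather than proof of the file's format.

```java
import java.util.Locale;

// Hypothetical helper, not part of Apache POI.
final class ExcelExtension {
    enum Kind { XLS, XLSX, UNKNOWN }

    private ExcelExtension() {}

    // Classifies a file name by the text after its last dot, ignoring case.
    static Kind fromFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Kind.UNKNOWN; // no extension present
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "xls":
                return Kind.XLS;   // caller would use HSSFWorkbook here
            case "xlsx":
                return Kind.XLSX;  // caller would use XSSFWorkbook here
            default:
                return Kind.UNKNOWN;
        }
    }
}
```

For example, `ExcelExtension.fromFileName("report.XLSX")` yields `Kind.XLSX`, while a name without a dot yields `Kind.UNKNOWN`.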

Upvotes: 1

Sumit Gupta

Reputation: 447

I haven't had much experience with Apache POI, but as far as I know, if you refer to a workbook through the Workbook interface, you can read and write both .xls and .xlsx.

All you have to do when creating the object is write:

for .xls:

Workbook wb = new HSSFWorkbook();

for .xlsx:

Workbook wb = new XSSFWorkbook();

You can pass a parameter for the file type and create the Workbook object accordingly using an if statement.

Upvotes: 19
