Reputation: 1425
I'm writing a Java application that needs to read values from a web-generated .xls file.
Unfortunately, that .xls file is not a real .xls file; it's a bunch of HTML tags, and the program that generates it just changes the extension to .xls.
To read the cell values in the auto-generated file, I intended to use the Apache POI library, but it seems the library only reads genuine .xls files. Running the code gives the following error:
java.io.IOException: Invalid header signature; read 0x6D74683C0A0D0A0D, expected 0xE11AB1A1E011CFD0
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:322)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:303)
at excel.ReadAccountName.main(ReadAccountName.java:17)
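The two numbers in the exception message can be decoded to confirm what the file really contains. POI reads the first eight bytes as a little-endian long, so reversing that shows the bytes on disk. The sketch below uses only the JDK; the class and method names (`SignatureCheck`, `isOle2`, `toFileBytes`) are made up for illustration and are not part of POI.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SignatureCheck {
    // Magic bytes that open every OLE2 compound file (the container used by binary .xls).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the first eight bytes match the OLE2 signature. */
    static boolean isOle2(byte[] head) {
        return head.length >= 8 && Arrays.equals(Arrays.copyOf(head, 8), OLE2_MAGIC);
    }

    /** Turns the little-endian long from the error message back into on-disk byte order. */
    static byte[] toFileBytes(long signature) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (signature >>> (8 * i)); // lowest byte comes first on disk
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] read = toFileBytes(0x6D74683C0A0D0A0DL);     // the "read" value from the exception
        byte[] expected = toFileBytes(0xE11AB1A1E011CFD0L); // the "expected" value
        System.out.println("read is OLE2:     " + isOle2(read));
        System.out.println("expected is OLE2: " + isOle2(expected));
        // The bytes that were read are two CRLF pairs followed by the start of an HTML tag.
        System.out.println("read as text:     " + new String(read, StandardCharsets.US_ASCII).trim());
    }
}
```

The value that was read decodes to two CRLF line breaks followed by `<htm`, which is consistent with the file being HTML rather than a binary workbook. The same `isOle2` check could be run on the first eight bytes of any input file before handing it to `HSSFWorkbook`.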
I'm considering one of the following two solutions:
1. Is it possible to convert the auto-generated .xls file to a legitimate .xls file from within the Java code alone?
2. Is there some other way to read from the auto-generated .xls file?
If there are any other possible solutions, please suggest them.
Upvotes: 1
Views: 1199
Reputation: 3807
If it's pure HTML, you can use Jsoup or another HTML parser to extract the data from the source file, and then build an .xls file using POI.
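A minimal sketch of that approach follows. It assumes the data sits in the first `<table>` of the document with no nested tables, and that reasonably recent Jsoup and Apache POI jars are on the classpath. The class name `HtmlXlsConverter`, its method names, and the command-line arguments are hypothetical, chosen only for this example.

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class HtmlXlsConverter {

    /** Collects the text of every th/td cell in the first table, row by row. */
    static List<List<String>> readRows(Document doc) {
        List<List<String>> rows = new ArrayList<>();
        Element table = doc.select("table").first();
        if (table == null) {
            return rows; // no table found: return an empty result
        }
        for (Element tr : table.select("tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : tr.select("th, td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    /** Writes the rows as string cells into a binary .xls workbook. */
    static void writeXls(List<List<String>> rows, File out) throws IOException {
        try (Workbook wb = new HSSFWorkbook();
             OutputStream os = new FileOutputStream(out)) {
            Sheet sheet = wb.createSheet("Sheet1");
            for (int r = 0; r < rows.size(); r++) {
                Row row = sheet.createRow(r);
                List<String> cells = rows.get(r);
                for (int c = 0; c < cells.size(); c++) {
                    row.createCell(c).setCellValue(cells.get(c));
                }
            }
            wb.write(os);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: the HTML file with the .xls extension; args[1]: the real .xls to create
        Document doc = Jsoup.parse(new File(args[0]), null); // null lets Jsoup detect the charset
        writeXls(readRows(doc), new File(args[1]));
    }
}
```

If the goal is only to read the cell values, the list returned by `readRows` may already be enough, and the POI step can be skipped. All values are written as strings here; numeric cells would need to be parsed before calling `setCellValue`.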
Upvotes: 1