Reputation: 1450
Hi experts am trying to parse an excel file. its structure is very complex. The possible way i know are.
But the issue is of its complexity like some columns,cells or rows blank etc.
What are the best possible ways to do this ?
thanks in advance.
Upvotes: 2
Views: 1729
Reputation: 13077
It sounds like you have a good understanding of the task at hand. You'll have to write business logic to untangle the complexities of the spreadsheet format and extract the data you're looking for.
It seems to me that VTSO/Interop is the best platform strategy for 2 reasons:
Upvotes: 0
Reputation: 41298
I can recommend the ExcelDataReader (licensed under LGPL I think). It loads both .xls and .xlsx files, and lets you get the spreadsheet as a DataSet, with each worksheet being an individual DataTable. As far as I know from the scenarios I have used it in, it honours blank rows, empty cells, etc. Try it and see if you think it will handle your "very complex" structure. [I do notice one negative review on the site - but the rest are pretty positive. I've experienced an issue reading .xlsx if a worksheet is renamed]
I've also used the OLEDB approach in the past, but be warned that this has real problems in the way it tries to infer datatypes in the first few rows. If the datatype changes for a column, then this may well infer it wrongly. To make matters worse, when it does get it wrong, it will often return null as the value, making it difficult (or impossible) to tell a true null value from a datatype that changed after the first six or seven rows.
Upvotes: 1
Reputation: 139
I've used OLEDB myself to read uploaded Excel files, and its presents no real problems (except for nulls in fields, instead of blanks, which can be checked with IsDBNull). Also, third party open source tools like NPOI and Excel2007ReadWrite (http://www.codeproject.com/KB/office/OpenXML.aspx) can be useful.
I have thoroughly evaluated both of these third party tools, and both are pretty stable and easy to integrate. I would recommend NPOI for Excel 2003 files, and Excel2007ReadWrite for Excel 2007 files.
Upvotes: 0
Reputation: 63126
Personally i prefer to either use the OLEDB way, which is a bit clunky at best at times, or you can use a third party library that has put in the time/effort/energy to get access to the data.
SyncFusion has a pretty nice library for this.
Upvotes: 1
Reputation: 34563
I have my users first save the Excel spreadsheet as a CSV file. Then they upload the CSV file to my app. That makes it much simpler to parse.
Upvotes: 0