Reputation: 6768
I need to read 50MB .XLS excel but im getting this error:
System.IO.FileFormatException {"File contains corrupted data."}
I choose OpenXMLReader class for reading, because it seems to fit on this scenario. But it crash on the first line of code.
SpreadsheetDocument.Open(filename, false)
The Excel file can be downloaded here: http://www.filedropper.com/saxfile its simple 97-2003 excel test file with 10 rows
static void Main(string[] args)
{
ReadExcelFileSAX(@"C:\Users\User\Desktop\SAXFile.xls");
}
static void ReadExcelFileSAX(string filename)
{
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open(filename, false))
{
WorkbookPart workbookPart = myDoc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
string text;
while (reader.Read())
{
if (reader.ElementType == typeof(CellValue))
{
text = reader.GetText();
Console.WriteLine(text);
}
}
}
}
What am i missing ?
Upvotes: 1
Views: 6956
Reputation: 11
Use ExcelDataReader for xls and xlsx.
public static DataTable XLSFileStreamReader(string filePath)
{
try
{
FileStream stream = new FileStream(filePath, FileMode.Open);
//IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
IExcelDataReader excelReader = null;
excelReader = ExcelReaderFactory.CreateReader(stream);
DataSet result = excelReader.AsDataSet();
excelReader.Close();
return result.Tables[0];
}
catch (Exception e)
{
throw;
}
}
Upvotes: 0
Reputation: 6768
This code before using OpenXML will help (it save xls as xlsx), but its little bit workaround
static void ConvertXlsToXlsx(string xlsFilePath, string xlsxFilePath)
{
Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
excelApp.Visible = false;
Microsoft.Office.Interop.Excel.Workbook eWorkbook = excelApp.Workbooks.Open(xlsFilePath, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
eWorkbook.SaveAs(xlsxFilePath, Microsoft.Office.Interop.Excel.XlFileFormat.xlOpenXMLWorkbook, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
eWorkbook.Close(false, Type.Missing, Type.Missing);
}
Better solution is maybe to use different library ExcelDataReader https://exceldatareader.codeplex.com/ which i tested and work (example print .CSV to Console)
static void XLSFileStreamReader(string filePath)
{
FileStream stream = new FileStream(filePath, FileMode.Open);
// Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader2003 = ExcelReaderFactory.CreateBinaryReader(stream);
// DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader2003.AsDataSet();
// Data Reader methods
foreach (DataTable table in result.Tables)
{
for (int i = 0; i < table.Rows.Count; i++)
{
for (int j = 0; j < table.Columns.Count; j++)
Console.Write("\"" + table.Rows[i].ItemArray[j] + "\";");
Console.WriteLine();
}
}
// Free resources (IExcelDataReader is IDisposable)
excelReader2003.Close();
}
i found it here https://gist.github.com/Munawwar/924413
PS: FileStream maybe should be in USING block.
Upvotes: 0
Reputation: 28863
The file you're trying to open is a pre-OpenXML binary format Excel file.
I'm sure the reader raises an error as soon as it detects it is not a valid ZIP archive.
Upvotes: 4