Reputation: 11469
I'm parsing a large file using the SAX approach offered on: Parsing and Reading Large Excel Files with the Open XML SDK
This is my modified version (only getting the row number for simplicity)
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open("BigFile.xlsx", true))
{
WorkbookPart workbookPart = myDoc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
String rowNum;
while (reader.Read())
{
if (reader.ElementType == typeof(Row))
{
if (reader.HasAttributes)
rowNum = reader.Attributes.First(a => a.LocalName == "r").Value
}
}
}
The problem is that this loops through every item/cell/column/whatnot and only acts when the element type is Row.
Is there a SAX way to loop only through the rows and not every item in the worksheet?
Thanks,
Upvotes: 3
Views: 6714
Reputation: 11469
The key is to use the Skip()
and ReadNextSibling()
methods of the reader...
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open("BigFile.xlsx", true))
{
WorkbookPart workbookPart = myDoc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
String rowNum;
while (reader.Read())
{
if (reader.ElementType == typeof(Row))
{
do
{
if (reader.HasAttributes)
rowNum = reader.Attributes.First(a => a.LocalName == "r").Value;
} while (reader.ReadNextSibling()); // Skip to the next row
break; // We just looped through all the rows so no need to continue reading the worksheet
}
if (reader.ElementType != typeof(Worksheet)) // Dont' want to skip the contents of the worksheet
reader.Skip(); // Skip contents of any node before finding the first row.
}
}
Upvotes: 3