Manuel
Manuel

Reputation: 11469

Is there a SAX way to loop through OpenXML rows?

I'm parsing a large file using the SAX approach offered on: Parsing and Reading Large Excel Files with the Open XML SDK

This is my modified version (only getting the row number for simplicity)

using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open("BigFile.xlsx", true))
{
    WorkbookPart workbookPart = myDoc.WorkbookPart;
    WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

    OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
    String rowNum;
    while (reader.Read())
    {
        if (reader.ElementType == typeof(Row))
        {
            if (reader.HasAttributes)
                rowNum = reader.Attributes.First(a => a.LocalName == "r").Value
        }
    }
}

The problem is that this loops through every item/cell/column/whatnot and only acts when the element type is Row.

Is there a SAX way to loop only through the rows and not every item in the worksheet?

Thanks,

Upvotes: 3

Views: 6714

Answers (1)

Manuel
Manuel

Reputation: 11469

The key is to use the Skip() and ReadNextSibling() methods of the reader...

 using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open("BigFile.xlsx", true))
 {
     WorkbookPart workbookPart = myDoc.WorkbookPart;
     WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
     OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
     String rowNum;
     while (reader.Read())
     {
         if (reader.ElementType == typeof(Row))
         {
             do
             {
                 if (reader.HasAttributes)
                     rowNum = reader.Attributes.First(a => a.LocalName == "r").Value;

             } while (reader.ReadNextSibling()); // Skip to the next row
             break; // We just looped through all the rows so no need to continue reading the worksheet
         }

         if (reader.ElementType != typeof(Worksheet)) // Dont' want to skip the contents of the worksheet
             reader.Skip(); // Skip contents of any node before finding the first row.
     }
}

Upvotes: 3

Related Questions