Mark

Reputation: 150

OpenXML, SAX, and Simply Reading an Xlsx file

I have been struggling to find a way to read a large xlsx file with OpenXML. I have tried the Microsoft samples without luck. I simply need to read an Excel file into a DataTable in C#. I am not concerned with value types in the DataTable; everything can be stored as strings.

The samples I have found so far don't retain the structure of the spreadsheet and only return the values of the cells.

Any ideas?

Upvotes: 2

Views: 4748

Answers (2)

Dmitry Pavlov

Reputation: 28290

You can get a DataTable this way:

using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(fileName, false))
{
    DataTable data = ToDataTable(spreadsheet, "Employees");
}

This method reads the data of the named worksheet into a DataTable:

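// Requires: using System.Data; using System.Linq;
//           using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet;
public DataTable ToDataTable(SpreadsheetDocument spreadsheet, string worksheetName)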
{
    var workbookPart = spreadsheet.WorkbookPart;

    var sheet = workbookPart
        .Workbook
        .Descendants<Sheet>()
        .FirstOrDefault(s => s.Name == worksheetName);

    var worksheetPart = sheet == null
        ? null
        : workbookPart.GetPartById(sheet.Id) as WorksheetPart;

    var dataTable = new DataTable();

    if (worksheetPart != null)
    {
        var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

        foreach (Row row in sheetData.Descendants<Row>())
        {
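            var values = row
                .Descendants<Cell>()
                .Select(cell =>
                {
                    // CellValue is null for cells that carry no value, so guard against it.
                    var value = cell.CellValue == null ? string.Empty : cell.CellValue.InnerText;
                    if (cell.CellValue != null && cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                    {
                        // Shared strings are stored as an index into the shared string table.
                        value = workbookPart
                            .SharedStringTablePart
                            .SharedStringTable
                            .ChildElements[int.Parse(value)]
                            .InnerText;
                    }
                    return (object)value;
                })
                .ToArray();

            // The table starts with no columns; add string columns as needed,
            // otherwise Rows.Add throws when it receives more values than columns.
            while (dataTable.Columns.Count < values.Length)
            {
                dataTable.Columns.Add();
            }

            dataTable.Rows.Add(values);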
        }
    }

    return dataTable;
}
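
Since the question mentions SAX and large files, here is a rough sketch of the same idea using the SDK's streaming OpenXmlReader, so only one row is loaded at a time. It also places each value by its cell reference (for example "C5"), which keeps empty cells from shifting later values to the left. The class name SaxSheetReader and the helpers ColumnIndex and ReadSheet are made up for illustration. The sketch assumes the shared string table fits in memory, ignores inline strings, and does not pad rows that are missing entirely from the sheet.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SaxSheetReader // hypothetical name
{
    // Converts the letter part of a cell reference to a zero-based column index:
    // "A1" gives 0, "Z3" gives 25, "AA12" gives 26.
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char c in cellReference.TakeWhile(ch => char.IsLetter(ch)))
        {
            index = index * 26 + (char.ToUpperInvariant(c) - 'A' + 1);
        }
        return index - 1;
    }

    public static DataTable ReadSheet(WorkbookPart workbookPart, WorksheetPart worksheetPart)
    {
        var table = new DataTable();

        // Assumption: the shared strings are small enough to load up front.
        string[] shared = workbookPart.SharedStringTablePart == null
            ? new string[0]
            : workbookPart.SharedStringTablePart.SharedStringTable
                .Elements<SharedStringItem>()
                .Select(item => item.InnerText)
                .ToArray();

        using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
        {
            while (reader.Read())
            {
                if (reader.ElementType != typeof(Row) || !reader.IsStartElement)
                {
                    continue;
                }

                // Only the current row is materialized as a DOM element.
                var row = (Row)reader.LoadCurrentElement();
                var values = new List<object>();

                foreach (Cell cell in row.Elements<Cell>())
                {
                    if (cell.CellReference == null || cell.CellValue == null)
                    {
                        continue; // nothing to place for this cell
                    }

                    string text = cell.CellValue.InnerText;
                    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                    {
                        text = shared[int.Parse(text)];
                    }

                    // Pad with empty strings so the value lands in its own column.
                    int column = ColumnIndex(cell.CellReference.Value);
                    while (values.Count <= column)
                    {
                        values.Add(string.Empty);
                    }
                    values[column] = text;
                }

                // Grow the table to the widest row seen so far.
                while (table.Columns.Count < values.Count)
                {
                    table.Columns.Add();
                }
                table.Rows.Add(values.ToArray());
            }
        }

        return table;
    }
}
```

You would call it with the same workbookPart and worksheetPart that the method above looks up, e.g. SaxSheetReader.ReadSheet(workbookPart, worksheetPart).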

Upvotes: 0

Bryan Roberts

Reputation: 3479

The Open XML SDK can be a little hard to understand. I have found the CodePlex project at http://simpleooxml.codeplex.com/ useful: it adds a thin layer over the SDK that makes it easier to parse Excel files and work with styles.

You can then use something like the following with its worksheet reader to loop through the sheet and grab the values you want:

System.IO.MemoryStream ms = Utility.StreamToMemory(xslxTemplate);
using (SpreadsheetDocument document = SpreadsheetDocument.Open(ms, true))
{
    IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
    if (sheets.Count() == 0)
    {
        // The workbook contains no worksheets.
        return null;
    }
    string relationshipId = sheets.First().Id.Value;
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
    string myval = WorksheetReader.GetCell("A", 0, worksheetPart).CellValue.InnerText;
    // Put in a loop to go through contents of document
}

Upvotes: 1
