mattpm
mattpm

Reputation: 1390

Read XLSX file in C#

I have an Excel Workbook I am trying to read using DocumentFormat.OpenXml.Spreadsheet; using DocumentFormat.OpenXml.Packaging;

but I'm unsure on how to know when the value is a date. Shared strings and numbers are fine but the type associated with the cell (Cell.DataType) is always a shared string.

I've had a look at alternative offerings but they all seem to cover strings and numbers but not dates.

Any ideas.?

Upvotes: 0

Views: 1667

Answers (2)

cvraman
cvraman

Reputation: 1697

Open XML stores dates as the number of days from 1 Jan 1900. What you would need to do is get the string from the cell and then parse that string into a DateTime object.

The snippets below illustrate this. Notice the use of DateTime.FromOADate overload

public static string GetCellValue(WorkbookPart wbPart, Cell cell)
{
    string value = string.Empty;
    if (cell != null && cell.CellValue != null)
    {
        value = cell.InnerText;
        if (cell.DataType != null)
        {
            switch (cell.DataType.Value)
            {
                case CellValues.SharedString:
                    var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                    if (stringTable != null)
                    {
                        value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
                    }
                    break;

                case CellValues.Boolean:
                    if (value == "0")
                        value = "FALSE";
                    else
                        value = "TRUE";

                    break;
            }
        }
    }
    return value;
}


string value = GetCellValue(wbPart,cell);
var date = DateTime.FromOADate(double.Parse(value));

Upvotes: 2

mattpm
mattpm

Reputation: 1390

I believe I have an answer of sorts. Within the Styles file, there is a dxfs node with one or more dxf nodes which give the formatting. I haven't worked out how to get from the cell within the worksheet to the dxf node. But once this relationship is determined, the formatting of the cell should be then available.

Having said that, it appears that at best, you are unlikely to know for sure whether a cell is a date, but with some tests within code, you could probably give it a reasonable best guess.

I've opted to capture schema information in a mapping document instead so I know for sure.

Upvotes: 0

Related Questions