Adam Shakhabov
Adam Shakhabov

Reputation: 1346

Parsing cell value of Excel spreadsheet

I parse cell located in A2 address. This returns 3 value instead of the expected Category 1.

test.xlsx

enter image description here

using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using X = DocumentFormat.OpenXml.Spreadsheet;

namespace DotNetSandbox.SO
{
    public class IncorrectCellValue
    {
        public static void ParseCellValue()
        {
            using SpreadsheetDocument doc = SpreadsheetDocument.Open(@"c:\temp\test.xlsx", false);
            X.Sheet sheet = doc.WorkbookPart.Workbook.Descendants<X.Sheet>().First();
            WorksheetPart wsPart = (WorksheetPart)doc.WorkbookPart.GetPartById(sheet.Id);
            X.Cell cell = wsPart.Worksheet.Descendants<X.Cell>().First(c => c.CellReference == "A2");

            string cellValue = cell.CellValue.Text;

            Console.WriteLine(cellValue);
            Console.ReadKey();
        }
    }
}

OUTPUT:

3


I do something wrong or maybe is it a library bug?

Upvotes: 1

Views: 826

Answers (1)

hsn-mnj
hsn-mnj

Reputation: 1388

Use this method

 public static string GetCellValue(string fileName,
        string addressName, string sheetName = "")
    {
        string value = null;

        // Open the spreadsheet document for read-only access.
        using (SpreadsheetDocument document =
            SpreadsheetDocument.Open(fileName, false))
        {
            // Retrieve a reference to the workbook part.
            WorkbookPart wbPart = document.WorkbookPart;

            // Find the sheet with the supplied name, and then use that 
            // Sheet object to retrieve a reference to the first worksheet.
            var theSheets = wbPart.Workbook.Descendants<Sheet>();
            Sheet theSheet = string.IsNullOrEmpty(sheetName) ? theSheets.FirstOrDefault() : theSheets.FirstOrDefault(x => x.Name == sheetName);

            // Throw an exception if there is no sheet.
            if (theSheet == null)
            {
                throw new ArgumentException("sheetName");
            }

            // Retrieve a reference to the worksheet part.
            WorksheetPart wsPart =
                (WorksheetPart)(wbPart.GetPartById(theSheet.Id));

            // Use its Worksheet property to get a reference to the cell 
            // whose address matches the address you supplied.
            Cell theCell = wsPart.Worksheet.Descendants<Cell>().
              Where(c => c.CellReference == addressName).FirstOrDefault();

            // If the cell does not exist, return an empty string.
            if (theCell.InnerText.Length > 0)
            {
                value = theCell.InnerText;

                // If the cell represents an integer number, you are done. 
                // For dates, this code returns the serialized value that 
                // represents the date. The code handles strings and 
                // Booleans individually. For shared strings, the code 
                // looks up the corresponding value in the shared string 
                // table. For Booleans, the code converts the value into 
                // the words TRUE or FALSE.
                if (theCell.DataType != null)
                {
                    switch (theCell.DataType.Value)
                    {
                        case CellValues.SharedString:

                            // For shared strings, look up the value in the
                            // shared strings table.
                            var stringTable =
                                wbPart.GetPartsOfType<SharedStringTablePart>()
                                .FirstOrDefault();

                            // If the shared string table is missing, something 
                            // is wrong. Return the index that is in
                            // the cell. Otherwise, look up the correct text in 
                            // the table.
                            if (stringTable != null)
                            {
                                value =
                                    stringTable.SharedStringTable
                                    .ElementAt(int.Parse(value)).InnerText;
                            }
                            break;

                        case CellValues.Boolean:
                            switch (value)
                            {
                                case "0":
                                    value = "FALSE";
                                    break;
                                default:
                                    value = "TRUE";
                                    break;
                            }
                            break;
                    }
                }
            }
        }
        return value;
    }

You stuck here:

If the cell represents an integer number, you are done. For dates, this code returns the serialized value that represents the date. The code handles strings and Booleans individually. For shared strings, the code looks up the corresponding value in the shared string table. For Booleans, the code converts the value into the words TRUE or FALSE.

I was able to get Category 1 by running this code:

 var cellValue = GetCellValue(@"c:\test.xlsx", "A2");

Microsoft Doc

Notice that I changed the original method to get the first sheet if you do not pass the sheet name to the method.

What is Shared String: To optimize the use of strings in a spreadsheet, SpreadsheetML stores a single instance of the string in a table called the shared string table. The cells then reference the string by index instead of storing the value inline in the cell value. Excel always creates a shared string table when it saves a file.

Upvotes: 1

Related Questions