Tech Learner
Tech Learner

Reputation: 1317

How to skip empty cells while reading data from Excel using OpenXML?

I am trying to read data from Excel and store it into a DataTable using OpenXML. I want data in my DataTable as it is in Excel sheet but when there is a empty cell in Excel, it was not looking as expected.

Because code row.Descendants<Cell>().ElementAt(i) skips empty cells while reading data and in DataTable Rows and Columns are stored incorrectly. I resolved this issue using below code but when my excel has more than 26 columns, it is not working as expected and again data are stored in DataTable incorrectly. (i.e., While reading data from AA, AB, AC columns)

Can someone help me to rewrite this code to handle this issue when there is more than 26 columns.

    private static int CellReferenceToIndex(Cell cell)
    {
        int index = 0;
        string reference = cell.CellReference.ToString().ToUpper();
        foreach (char ch in reference)
        {
            if (Char.IsLetter(ch))
            {
                int value = (int)ch - (int)'A';
                index = (index == 0) ? value : ((index + 1) * 26) + value;
            }
            else
            {
                return index;
            }
        }
        return index;
    }

Upvotes: 0

Views: 1530

Answers (1)

Auditive
Auditive

Reputation: 2049

You can use example below (taken from here and improved by few validations):

public static int GetColumnIndex(this Cell cell)
{
    string columnName = string.Empty;

    if (cell != null)
    {
        string cellReference = cell.CellReference?.ToString();

        if (!string.IsNullOrEmpty(cellReference))
            // Using `Regex` to "pull out" only letters from cell reference
            // (leave only "AB" column name from "AB123" cell reference)
            columnName = Regex.Match(cellReference, @"[A-Z]{1,3}").Value;
    }
 
    // Column name validations (not null, not empty and contains only UPPERCASED letters)
    // *uppercasing may be done manually with columnName.ToUpper()
    if (string.IsNullOrEmpty(columnName))
        throw new ArgumentException("Column name was not defined.", nameof(columnName));
    else if (!Regex.IsMatch(columnName, @"^[A-Z]{1,3}$"))
        throw new ArgumentException("Column name is not valid.", nameof(columnName));

    int index = 0;
    int pow = 1;

    // A - 1 iteration, AA - 2 iterations, AAA - 3 iterations.
    // On each iteration pow value multiplies by 26
    // Letter number (in alphabet) + 1 multiplied by pow value
    for (int i = columnName.Length - 1; i >= 0; i--)
    {
        index += (columnName[i] - 'A' + 1) * pow;
        pow *= 26;
    }

    // Index couldn't be greater than 16384
    if (index >= 16384)
        throw new IndexOutOfRangeException("Index of provided column name (" + index + ") exceeds max range (16384).");

    return index;
}

All exception throws you can replace with return -1 and some kind of Log("...") if you have logging. Otherwise you may not be sure what's problem happened and why was returned -1.

Usage is obvious:

var cells = row.Descendants<Cell>();

foreach (Cell cell in cells)
{
    int columnIndex = cell.GetColumnIndex();
    // Do what you want with that
}

EDIT.

I'm not sure what you're trying to achieve. And what you mean here:

Because code row.Descendants<Cell>().ElementAt(i) skips empty cells...

I didn't see that. Look at example below:

enter image description here enter image description here

Random ElementAt in range between 0 and Descendants<Cell>().Count() works too and shows both empty and non-empty cells: enter image description here

Upvotes: 0

Related Questions