Nolesh
Nolesh

Reputation: 7028

C# regular expression - get row data from Excel formula with R1C1 references

I am trying to get references of rows in Excel using regex. For instance, I have a formula:

=SUM(R[-3]C[-1];R[-3]C;R[-3]C[2];R[-1]C[2]:R[1]C[3];RC[-1])

and I need to get numbers only -3, -3, -3, -1, 1, 0

Currently, I am using regex pattern:

=?(R\[[-0-9]*\]|RC)

It gives me:

result

But it's necessary to get numbers only. Moreover, instead of RC I have to obtain 0.

Thanks in advance!

Upvotes: 1

Views: 848

Answers (2)

Robin Mackenzie
Robin Mackenzie

Reputation: 19289

You are very close - if you add another capturing group in your regex then you can pull out the x from R[x]. So your regex will be:

=?(R\[([-0-9]*)\]|RC)

Note it's your regex with extra brackets around [-0-9]*.

Sample code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {

            string formula = "=SUM(R[-3]C[-1];R[-3]C;R[-3]C[2];R[-1]C[2]:R[1]C[3];RC[-1])";
            string pattern = @"=?(R\[([-0-9]*)\]|RC)";
            System.Text.RegularExpressions.Regex parser = new System.Text.RegularExpressions.Regex(pattern);
            System.Text.RegularExpressions.MatchCollection matches;

            // parse formula
            matches = parser.Matches(formula);

            // iterate results
            foreach (System.Text.RegularExpressions.Match match in matches)
            {
                if (match.Groups[0].Value == "RC")
                {
                    Console.WriteLine("0");
                } else {
                    Console.WriteLine(match.Groups[2].Value);
                }
            }
            Console.ReadKey();

        }
    }
}

Upvotes: 0

Slai
Slai

Reputation: 22876

I can't test it, but you can get all of the rows referenced in a formula with something like this:

Range formulaCell = worksheet.Range("A1");
Range referencedRange = formulaCell.DirectPrecedents;

foreach(Range area in referencedRange.Areas)
    foreach(Range cell in area.Cells)
        Debug.Print((cell.Row - formulaCell.Row).ToString);    // -3, -3, -3, -1, 0, 1, 0

Upvotes: 1

Related Questions