DDuffy
DDuffy

Reputation: 413

Need to extract column names from string in c#

I am trying to extract all column names from an SQL calculation string. The data is held in a cell in a datatable and is determined by the square brackets surrounding the column. I can extract each instance of [*], but have just noticed that i have an issue. Some of the columns have table names and some have schema and table names. E.g. [column] [table].column] or [schema].[table].[column] How do i amend this code to pull everything between the first "[" and the last "]" that is not followed by a period?

The following code populates datatable1, then reads through to find any "column_calculation_formula" cell that contains a column name ([*]) and adds it to a duplicate of the row into a new datatable ready for upload. I tried appending the new row to the datatable, but this caused issues.

DataTable WBC_DTable = PopulateDataTable();
DataTable WBC_DTable2 = new DTSetup.WBDataTablesSetup().WBDSColumnsSetup();
Regex regex = new Regex(@"\[(.*?)\]");

long Rows = WBC_DTable.Rows.Count;
foreach (DataRow row in WBC_DTable.Rows)
{
    string InputString = row["column_calculation_formula"].ToString();
    MatchCollection matches = regex.Matches(InputString);
    List<string> Output = new List<string>();
    foreach (Match m in matches)
    {
        Output.Add(m.Value);
        TextBoxAppend("Before: " + m.Value);
    }
    List<string> distinct = Output.Distinct().ToList();
    foreach (string value in distinct)
    {
        TextBoxAppend("After: " + value);

        if (distinct.Count() > 0)
        {
            DataRow NewRow = WBC_DTable2.NewRow();
            NewRow.ItemArray = row.ItemArray;
            NewRow["column_datatype"] = "SubColumn";
            NewRow["column_name"] = value;
            WBC_DTable2.Rows.Add(NewRow);
        }
    }
    DataRow NewRow2 = WBC_DTable2.NewRow();
    NewRow2.ItemArray = row.ItemArray;
    WBC_DTable2.Rows.Add(NewRow2);
}

Fairly confident that this should just be an alteration of the Regex call, but i dont know enough about Regex to be able to do this.

Any help would be greatly appreciated.

Upvotes: 2

Views: 1136

Answers (1)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626920

You may use

var res = Regex.Matches(s, @"\[.*?](?!\.)")
    .Cast<Match>().
    Select(x => x.Value)
    .ToList()

See the online pattern demo.

Details

  • \[ - a literal [
    • .*? - any 0+ chars other than a newline, as few as possible
  • ](?!\.) - a ] char not immediately followed with a ..

Upvotes: 3

Related Questions