Reputation: 413
I am trying to extract all column names from an SQL calculation string. The data is held in a cell in a datatable and is determined by the square brackets surrounding the column. I can extract each instance of [*], but have just noticed that i have an issue. Some of the columns have table names and some have schema and table names. E.g. [column] [table].column] or [schema].[table].[column] How do i amend this code to pull everything between the first "[" and the last "]" that is not followed by a period?
The following code populates datatable1, then reads through to find any "column_calculation_formula" cell that contains a column name ([*]) and adds it to a duplicate of the row into a new datatable ready for upload. I tried appending the new row to the datatable, but this caused issues.
DataTable WBC_DTable = PopulateDataTable();
DataTable WBC_DTable2 = new DTSetup.WBDataTablesSetup().WBDSColumnsSetup();
Regex regex = new Regex(@"\[(.*?)\]");
long Rows = WBC_DTable.Rows.Count;
foreach (DataRow row in WBC_DTable.Rows)
{
string InputString = row["column_calculation_formula"].ToString();
MatchCollection matches = regex.Matches(InputString);
List<string> Output = new List<string>();
foreach (Match m in matches)
{
Output.Add(m.Value);
TextBoxAppend("Before: " + m.Value);
}
List<string> distinct = Output.Distinct().ToList();
foreach (string value in distinct)
{
TextBoxAppend("After: " + value);
if (distinct.Count() > 0)
{
DataRow NewRow = WBC_DTable2.NewRow();
NewRow.ItemArray = row.ItemArray;
NewRow["column_datatype"] = "SubColumn";
NewRow["column_name"] = value;
WBC_DTable2.Rows.Add(NewRow);
}
}
DataRow NewRow2 = WBC_DTable2.NewRow();
NewRow2.ItemArray = row.ItemArray;
WBC_DTable2.Rows.Add(NewRow2);
}
Fairly confident that this should just be an alteration of the Regex call, but i dont know enough about Regex to be able to do this.
Any help would be greatly appreciated.
Upvotes: 2
Views: 1136
Reputation: 626920
You may use
var res = Regex.Matches(s, @"\[.*?](?!\.)")
.Cast<Match>().
Select(x => x.Value)
.ToList()
Details
\[
- a literal [
.*?
- any 0+ chars other than a newline, as few as possible](?!\.)
- a ]
char not immediately followed with a .
.Upvotes: 3