Omer Bokhari

Reputation: 59578

LINQ - Group DataTable by multiple columns determined at runtime

Using .NET 3.5, I have the need to group a DataTable by multiple columns, where the column names are contained in an IEnumerable.

// column source
IEnumerable<string> columns;
DataTable table;

IEnumerable<IGrouping<object, DataRow>> groupings = table
    .AsEnumerable()
    .GroupBy(row => ???);

Typically ??? would be an anonymous type as described here, but I need to use columns as the column source. Is this possible?

Upvotes: 3

Views: 1208

Answers (1)

Kirk Broadhurst

Reputation: 28718

The simplest way to do this is to create a function that selects the required columns and combines their values into a single string key for comparison. I'd do something like this:

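// Builds a composite key from the values of the given columns.
// .NET 3.5 only has String.Join(string, string[]), hence ToString() and ToArray().
Func<DataRow, IEnumerable<string>, string> f = (row, cols) =>
    String.Join("|", cols.Select(col => row[col].ToString()).ToArray());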

This is a function taking a DataRow and an IEnumerable<string>. It projects the IEnumerable<string> (the column names) into the corresponding column values (the cols.Select(...) call), and then joins those values with a | character. I chose this character because it's unlikely to appear in your fields, but you might want to swap in another delimiter.

Then simply:

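// The key is a string, and C# 3 / .NET 3.5 has no generic variance,
// so the result must be typed as IGrouping<string, DataRow>.
IEnumerable<IGrouping<string, DataRow>> groupings = table
    .AsEnumerable()
    .GroupBy(row => f(row, columns));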
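
One caveat with a joined string key: if a field value can itself contain the delimiter, two different rows can end up with the same key. A minimal sketch of that collision follows; the DelimiterDemo class and its Key helper are hypothetical names made up for illustration, not part of the code above.

```csharp
using System;
using System.Linq;

public static class DelimiterDemo
{
    // Hypothetical helper with the same shape as the key function:
    // convert each value to a string and join with "|".
    public static string Key(params object[] values)
    {
        return String.Join("|", values.Select(v => v.ToString()).ToArray());
    }

    public static void Main()
    {
        // Two different value pairs...
        string a = Key("x|y", "z");
        string b = Key("x", "y|z");

        // ...that collapse to the same key, "x|y|z".
        Console.WriteLine(a == b); // True
    }
}
```

If that is a realistic risk for your data, pick a delimiter that cannot occur in the values, or escape it before joining.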

Ideally we would select into a better type, not a string tied together with an arbitrary delimiter. But I expect that selecting into an object will cause problems due to the comparison of reference types: by default, two objects aren't equal even if they have identical properties.
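
One possible way around the reference-equality problem is the GroupBy overload that takes an IEqualityComparer<TKey>. The sketch below groups on an object[] of column values and compares the arrays element by element. ValueArrayComparer, Demo, and the sample table are hypothetical, made up for illustration, and I haven't tried this against the asker's data. It assumes a reference to System.Data.DataSetExtensions for AsEnumerable().

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical comparer: two object[] keys are equal when they have the
// same length and every pair of elements is equal by value.
public sealed class ValueArrayComparer : IEqualityComparer<object[]>
{
    public bool Equals(object[] x, object[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
        {
            if (!object.Equals(x[i], y[i])) return false;
        }
        return true;
    }

    public int GetHashCode(object[] key)
    {
        unchecked
        {
            // Combine element hashes so that equal arrays hash alike.
            int hash = 17;
            foreach (object value in key)
            {
                hash = hash * 31 + (value == null ? 0 : value.GetHashCode());
            }
            return hash;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Made-up sample data.
        var table = new DataTable();
        table.Columns.Add("City", typeof(string));
        table.Columns.Add("Year", typeof(int));
        table.Columns.Add("Amount", typeof(int));
        table.Rows.Add("Oslo", 2010, 5);
        table.Rows.Add("Oslo", 2010, 7);
        table.Rows.Add("Rome", 2010, 1);

        // Column names known only at runtime.
        IEnumerable<string> columns = new[] { "City", "Year" };

        IEnumerable<IGrouping<object[], DataRow>> groupings = table
            .AsEnumerable()
            .GroupBy(row => columns.Select(col => row[col]).ToArray(),
                     new ValueArrayComparer());

        foreach (var g in groupings)
        {
            Console.WriteLine("{0}, {1}: {2} row(s)", g.Key[0], g.Key[1], g.Count());
        }
    }
}
```

This keeps the original column values (and their types) in the key instead of flattening them to text, at the cost of an extra class.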

Upvotes: 2
