Reputation: 7278
I am using the following code to remove duplicate rows in a DataTable based on the value of one field (keyField):
IEnumerable<DataRow> uniqueContacts = dt.AsEnumerable()
    .GroupBy(x => x[keyField].ToString())
    .Select(g => g.First());
DataTable dtOut = uniqueContacts.CopyToDataTable();
How can I extend this code so that the LINQ query removes duplicates based on the values of a list of fields, e.g. removes all rows that have the same 'firstname' and 'lastname'?
Upvotes: 1
Views: 1460
Reputation: 460028
You can use an anonymous type:
IEnumerable<DataRow> uniqueContacts = dt.AsEnumerable()
    .GroupBy(row => new {
        FirstName = row.Field<string>("FirstName"),
        LastName = row.Field<string>("LastName")
    })
    .Select(g => g.First());
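This works because anonymous types compare by value across all of their properties, so GroupBy puts rows with the same first and last name into one group. Here is a minimal sketch on a made-up table to check that; the sample data and the names DedupDemo, BuildSampleTable and ByFirstAndLastName are assumptions for illustration only, not part of your code:

```csharp
using System;
using System.Data;
using System.Linq;

public static class DedupDemo
{
    // Hypothetical sample data, not taken from the question.
    public static DataTable BuildSampleTable()
    {
        var dt = new DataTable();
        dt.Columns.Add("FirstName", typeof(string));
        dt.Columns.Add("LastName", typeof(string));
        dt.Columns.Add("Email", typeof(string));
        dt.Rows.Add("Ann", "Lee", "ann@example.com");
        dt.Rows.Add("Ann", "Lee", "ann.lee@example.com"); // same name as the row above
        dt.Rows.Add("Ann", "Ray", "ann.ray@example.com");
        return dt;
    }

    // Keeps the first row of every (FirstName, LastName) group.
    public static DataTable ByFirstAndLastName(DataTable dt)
    {
        return dt.AsEnumerable()
            .GroupBy(row => new
            {
                FirstName = row.Field<string>("FirstName"),
                LastName = row.Field<string>("LastName")
            })
            .Select(g => g.First())
            .CopyToDataTable();
    }
}
```

Calling DedupDemo.ByFirstAndLastName(DedupDemo.BuildSampleTable()) should leave two rows, one for "Ann Lee" and one for "Ann Ray". Note that CopyToDataTable throws if the source sequence is empty, so guard for that if your table can have no rows.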
Since you want a solution that works with a List<string> that is unknown at compile time, you could use this class:
// Requires: using System; using System.Collections.Generic; using System.Linq;
public class MultiFieldComparer : IEquatable<IEnumerable<object>>, IEqualityComparer<IEnumerable<object>>
{
    // The field values of one row; they act as the composite grouping key.
    private IEnumerable<object> objects;

    public MultiFieldComparer(IEnumerable<object> objects)
    {
        this.objects = objects;
    }

    // Two keys are equal if they contain equal values in the same order.
    public bool Equals(IEnumerable<object> x, IEnumerable<object> y)
    {
        return x.SequenceEqual(y);
    }

    // Combines the hash codes of all values; unchecked lets the arithmetic overflow silently.
    public int GetHashCode(IEnumerable<object> objects)
    {
        unchecked
        {
            int hash = 17;
            foreach (object obj in objects)
                hash = hash * 23 + (obj == null ? 0 : obj.GetHashCode());
            return hash;
        }
    }

    // GroupBy uses these two overrides when instances of this class are the keys.
    public override int GetHashCode()
    {
        return GetHashCode(this.objects);
    }

    public override bool Equals(object obj)
    {
        MultiFieldComparer other = obj as MultiFieldComparer;
        if (other == null) return false;
        return this.Equals(this.objects, other.objects);
    }

    public bool Equals(IEnumerable<object> other)
    {
        return this.Equals(this.objects, other);
    }
}
and this extension method using this class:
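public static class DataRowExtensions // extension methods need a static class; the name is arbitrary
{
    public static IEnumerable<DataRow> RemoveDuplicates(this IEnumerable<DataRow> rows, IEnumerable<string> fields)
    {
        return rows
            // ToArray reads each row's key values once instead of on every comparison.
            .GroupBy(row => new MultiFieldComparer(fields.Select(f => row[f]).ToArray()))
            .Select(g => g.First());
    }
}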
Then it's as simple as:
List<string> columns = new List<string> { "FirstName", "LastName" };
var uniqueContacts = dt.AsEnumerable().RemoveDuplicates(columns).CopyToDataTable();
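If you would rather not wrap the values in a key object, a possible alternative is to compare the rows directly with an IEqualityComparer<DataRow> and pass it to Distinct. This is only a sketch of that idea, not the approach shown above; FieldSetRowComparer, FieldSetRowComparerDemo and the sample data are hypothetical names and values made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical comparer: two rows are equal if they match on every listed field.
public sealed class FieldSetRowComparer : IEqualityComparer<DataRow>
{
    private readonly string[] fields;

    public FieldSetRowComparer(IEnumerable<string> fields)
    {
        this.fields = fields.ToArray();
    }

    public bool Equals(DataRow x, DataRow y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // object.Equals compares the boxed column values (DBNull.Value equals DBNull.Value).
        return fields.All(f => object.Equals(x[f], y[f]));
    }

    public int GetHashCode(DataRow row)
    {
        unchecked
        {
            int hash = 17;
            foreach (string f in fields)
            {
                object value = row[f];
                hash = hash * 23 + (value == null ? 0 : value.GetHashCode());
            }
            return hash;
        }
    }
}

public static class FieldSetRowComparerDemo
{
    public static DataTable Run()
    {
        // Made-up sample data for illustration.
        var dt = new DataTable();
        dt.Columns.Add("FirstName", typeof(string));
        dt.Columns.Add("LastName", typeof(string));
        dt.Rows.Add("Ann", "Lee");
        dt.Rows.Add("Ann", "Lee");
        dt.Rows.Add("Ann", "Ray");

        var columns = new List<string> { "FirstName", "LastName" };
        return dt.AsEnumerable()
            .Distinct(new FieldSetRowComparer(columns))
            .CopyToDataTable();
    }
}
```

With the sample data above, Run() should return a table with two rows. Like the GroupBy version, the comparison of string values is case-sensitive.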
Upvotes: 2