Tassadaque

Reputation: 8199

Remove duplicates based on a column value (LINQ)

I have a many-to-many relationship between employee and group. The following LINQ statement

int[] GroupIDs = { 6, 7 };

var result = from g in umGroups
             join empGroup in umEmployeeGroups on g.GroupID equals empGroup.GroupID
             where GroupIDs.Contains(g.GroupID)
             select new { GrpId = g.GroupID, EmployeeID = empGroup.EmployeeID };

returns the GroupID and the EmployeeID. The result is:

GrpId  | EmployeeID
6      |   18
6      |   20  
7      |   19
7      |   20

I need to remove the rows in which the EmployeeID repeats, e.g. either one of the rows with EmployeeID = 20.
Thanks

Upvotes: 13

Views: 22049

Answers (1)

Jon Skeet

Reputation: 1500165

Okay, if you don't care which row is removed for a repeated employee, you could try something like:

var result = query.GroupBy(x => x.EmployeeID)
                  .Select(group => group.First());

You haven't specified whether this is in LINQ to SQL, LINQ to Objects or something else... I don't know what the SQL translation of this would be. If you're dealing with a relatively small amount of data, you could always force this last bit to be in-process:

var result = query.AsEnumerable()
                  .GroupBy(x => x.EmployeeID)
                  .Select(group => group.First());

At that point you could actually use MoreLINQ, which has a handy DistinctBy method:

var result = query.AsEnumerable()
                  .DistinctBy(x => x.EmployeeID);
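
For reference, here is a minimal, self-contained sketch of the GroupBy/First approach running in-process (LINQ to Objects) on the sample rows from the question. The class name DedupDemo, the method name OnePerEmployee and the use of value tuples in place of the anonymous type are assumptions made purely for illustration; they are not part of the original query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DedupDemo
{
    // Hypothetical helper: keeps the first row seen for each EmployeeID.
    public static List<(int GrpId, int EmployeeID)> OnePerEmployee(
        IEnumerable<(int GrpId, int EmployeeID)> rows)
    {
        return rows.GroupBy(r => r.EmployeeID)   // one group per employee
                   .Select(g => g.First())       // keep the first row of each group
                   .ToList();
    }

    public static void Main()
    {
        // Sample data copied from the question's result table.
        var rows = new[]
        {
            (GrpId: 6, EmployeeID: 18),
            (GrpId: 6, EmployeeID: 20),
            (GrpId: 7, EmployeeID: 19),
            (GrpId: 7, EmployeeID: 20),
        };

        foreach (var row in OnePerEmployee(rows))
            Console.WriteLine($"{row.GrpId} | {row.EmployeeID}");
    }
}
```

In LINQ to Objects, groups come out in the order their keys first appear and each group keeps source order, so this prints 6 | 18, 6 | 20 and 7 | 19; the (7, 20) row is the one dropped. That ordering guarantee applies only in-process, so a database provider may pick a different row.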

Upvotes: 42
