Reputation: 109

How to use LINQ to find a duplicate object

I have an Employee class that has the following:

public class Employee
{
  //EmployeeNumber cannot be the same as the Id
  public int EmployeeNumber {get; set; }
  public string EmployeeName {get; set }
}

Ultimately I am going to be updating the database with new employees. I have a list of the new employees, and I have a list of the current employees that exist in the database. Employee name can be the same, but EmployeeNumber has to be unique. I want to ultimately have a list of duplicate employees that has been created from comparing the list I will be adding to the database, with the list of employees that represents what is inside the database.

What is the best way to get a list of the duplicate employees using LINQ?

Upvotes: 2

Answers (4)

user2896170

Reputation:

You can implement IEqualityComparer and use the LinQ method "Except"

public class MyComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        return x.EmployeeNumber.Equals(y.EmployeeNumber);
    }

    public int GetHashCode(Employee x)
    {
        return x.EmployeeNumber.GetHashCode()
    }
}

Upvotes: 1

Michael Blackburn

Reputation: 3229

I'm not sure if it's the most efficient (That award goes to @Fahad's comment).

Assuming you really mean "How do I get a list of records that appear in two collections," I like to use the Join or GroupJoin methods, as you can select a new collection, or an anonymous type that contains the record from both collections.

The Syntax is Join (this collection1, collection2, FuncCollection1Key, Funccollection2Key, FuncOutputSelection).

So if your "new" collection is IEnumerable NewEmployees and your existing collection is IEnumerable<Employee> DbEmployees your collection of duplicate employees is derived by:

var DupeEmployees = NewEmployees.Join(DbEmployees, n=>n.EmployeeNumber, d=>d.EmployeeNumber, (nEmp,dbEmp)=>nEmp);

The two "middle" lambda expressions must be functions that result in the same type of value (that implements IEquatable), but there's no other restriction. You have have two collections of different types, and you can output anything you like.

Now, the best way to do this is Farhad's suggestion, using Join in this case is a bit like shooting a bee with an Elephant gun, but understanding Join will return you many benefits down the road.

Upvotes: 1

laskdjf

Reputation: 1183

The correct way to do it would be to declare the EmployeeNumber as the table key, then there is not need to check for duplicates.

public class Employee
{
  [Key]
  public int EmployeeNumber {get; set; }
  public string EmployeeName {get; set }
}

Also in your database you would declare the EmployeeNumber as the primary key. Assuming you are using SQL Server, you can add Identity(1,1) to make it auto-increment.

Here is a sample of how your table definition might look:

CREATE TABLE Persons
(
  EmployeeNumber int IDENTITY(1,1) PRIMARY KEY,
  EmployeeName varchar(255) NOT NULL,
)

Upvotes: 2

Bhushan Firake

Reputation: 9448

You can just check whether EmployeeNumber of current employee is available in the list of new employees.

 List<Employee> currentEmployees = ...
 List<Employee> newEmployees = ...

 List<Employee> duplicateEmployees = currentEmployees.Where(currentEmployee => (newEmployees.Select(f => f.EmployeeNumber)).Contains(currentEmployee.EmployeeNumber)).ToList();

Upvotes: 0

How to use LINQ to find a duplicate object

Answers (4)

Related Questions