Atish Kumar Dipongkor

Reputation: 10422

How does Distinct() work on a list of objects?

var people = new List<Person>
{
    new Person
    {
        Id = 1,
        Name = "Atish"
    },
    new Person
    {
        Id = 2,
        Name = "Dipongkor"
    },
    new Person
    {
        Id = 1,
        Name = "Atish"
    }
};

Console.WriteLine(people.Distinct().Count());

Why is the output 3?

Why is it not 2?

Upvotes: 2

Views: 1595

Answers (2)

Douglas

Reputation: 54897

The default equality comparer for reference types uses reference equality, which returns true only if two object references point to the same instance (i.e. the object created by a single new expression). Value types behave differently: they are compared by value, and two values are equal if all their data fields are equal (as is the case for the two Id = 1 entries in your list). More info: Equality Comparisons (C# Programming Guide).
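
To make the difference concrete, here is a minimal sketch. PersonClass, PersonStruct and EqualityDemo are hypothetical names invented for this illustration; they carry the same two properties as the Person in the question, once as a class and once as a struct.

```csharp
using System;

// Hypothetical types for illustration: identical data, one class and one struct.
public class PersonClass
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public struct PersonStruct
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class EqualityDemo
{
    // Two separate class instances with the same data: Object.Equals
    // falls back to reference equality, so they are not equal.
    public static bool ClassInstancesEqual()
    {
        var a = new PersonClass { Id = 1, Name = "Atish" };
        var b = new PersonClass { Id = 1, Name = "Atish" };
        return a.Equals(b);
    }

    // Two struct values with the same data: ValueType.Equals
    // compares the fields, so they are equal.
    public static bool StructValuesEqual()
    {
        var a = new PersonStruct { Id = 1, Name = "Atish" };
        var b = new PersonStruct { Id = 1, Name = "Atish" };
        return a.Equals(b);
    }

    public static void Main()
    {
        Console.WriteLine(ClassInstancesEqual()); // False
        Console.WriteLine(StructValuesEqual());   // True
    }
}
```

Distinct() sees the same thing: without custom equality, each new Person in the question is a different reference, hence the count of 3.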

If you want to alter this behaviour, you need to implement the generic IEquatable<T> interface on your type so that it compares the instances' properties for equality, and override GetHashCode() to match, because Distinct hashes elements before comparing them. The Distinct operator will then pick up this implementation automatically and yield the expected results.

Edit: Here's a sample implementation of IEquatable<Person> for your class:

public class Person : IEquatable<Person>
{
    public int Id { get; set; }
    public string Name { get; set; }

    public bool Equals(Person other)
    {
        if (other == null)
            return false;

        return Object.ReferenceEquals(this, other) ||
            this.Id == other.Id &&
            this.Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as Person);
    }

    public override int GetHashCode()
    {
        int hash = this.Id.GetHashCode();
        if (this.Name != null)
            hash ^= this.Name.GetHashCode();
        return hash;
    }
}
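
As a quick check, the following self-contained sketch repeats an equivalent IEquatable<Person> implementation and runs the list from the question through Distinct(). DistinctDemo and CountDistinct are hypothetical names added only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person : IEquatable<Person>
{
    public int Id { get; set; }
    public string Name { get; set; }

    // Value-based comparison on Id and Name.
    public bool Equals(Person other)
    {
        if (ReferenceEquals(other, null))
            return false;
        return Id == other.Id && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    // Must agree with Equals: equal persons produce equal hash codes.
    public override int GetHashCode()
    {
        int hash = Id.GetHashCode();
        if (Name != null)
            hash ^= Name.GetHashCode();
        return hash;
    }
}

public static class DistinctDemo
{
    // Builds the list from the question and counts its distinct elements.
    public static int CountDistinct()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Atish" },
            new Person { Id = 2, Name = "Dipongkor" },
            new Person { Id = 1, Name = "Atish" }
        };
        return people.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // 2
    }
}
```

Distinct() uses EqualityComparer<Person>.Default, which picks up the IEquatable<Person> implementation, so the two Id = 1 entries collapse into one.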

From the guidelines on overriding the == and != operators:

By default, the operator == tests for reference equality by determining whether two references indicate the same object. Therefore, reference types do not have to implement operator == in order to gain this functionality. When a type is immutable, that is, the data that is contained in the instance cannot be changed, overloading operator == to compare value equality instead of reference equality can be useful because, as immutable objects, they can be considered the same as long as they have the same value. It is not a good idea to override operator == in non-immutable types.
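
One practical hazard of value equality on mutable types shows up in hashed collections. The sketch below is an illustration under assumptions, not part of the quoted guideline: MutablePerson and MutationDemo are hypothetical names, and the type uses value-based Equals and GetHashCode rather than an overloaded ==, but the underlying problem is the same.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable type whose equality and hash code depend on its data.
public class MutablePerson
{
    public int Id { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutablePerson;
        return other != null && Id == other.Id && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Id ^ (Name != null ? Name.GetHashCode() : 0);
    }
}

public static class MutationDemo
{
    // Adds an object to a HashSet, mutates it, then looks it up again.
    public static bool FoundAfterMutation()
    {
        var person = new MutablePerson { Id = 1, Name = "Atish" };
        var set = new HashSet<MutablePerson> { person };

        // Changing Id changes the hash code, but the set still
        // remembers the hash that was computed when the item was added.
        person.Id = 99;

        return set.Contains(person);
    }

    public static void Main()
    {
        Console.WriteLine(FoundAfterMutation()); // False
    }
}
```

The object is still physically in the set, yet it can no longer be found, which is why value equality is safest on types whose compared members never change.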

Upvotes: 6

It'sNotALie.

Reputation: 22804

It's because you haven't overridden equality in your class. Right now, when you use Distinct(), it checks for reference equality. To change this, you need to override Equals() and GetHashCode(), since Distinct() relies on both. Overloading operator== (together with operator!=) is optional and only keeps the operators consistent with Equals().

This is how I'd do it:

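public static bool operator ==(Person one, Person two)
{
    // Handle nulls first so the property comparison below cannot throw.
    if (ReferenceEquals(one, two))
        return true;
    if (ReferenceEquals(one, null) || ReferenceEquals(two, null))
        return false;
    return one.Id == two.Id && one.Name == two.Name;
}
// C# requires != to be overloaded whenever == is.
public static bool operator !=(Person one, Person two)
{
    return !(one == two);
}
public override bool Equals(object obj)
{
    return obj is Person && ((Person)obj) == this;
}
public bool Equals(Person other)
{
    return other == this;
}
public override int GetHashCode()
{
    unchecked
    {
        // Combine the fields additively so that Id == 0 does not zero out the hash.
        int hash = 17;
        hash = hash * 31 + Id;
        hash = hash * 31 + (Name != null ? Name.GetHashCode() : 0);
        return hash;
    }
}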

You could also implement the IEquatable<T> interface. The Equals(Person) method above already satisfies it, so all you need to do is add : IEquatable<Person> to your class declaration (class Person : IEquatable<Person>).

Upvotes: 3
