Luis Lavieri

Reputation: 4129

LINQ Distinct with custom IEqualityComparer

So, I have a class like this:

public class History
{
    int ProcessedImageId;
    string UserId;
    DateTime TimeStamp;
    ...   
}

From a LINQ query, I am getting every "History" within a range of time.

Now, I am also executing a second query to retrieve the images that were processed on multiple dates.

This works fine. This is an example of what I get.

[screenshot of the query results]

Now, what I would like is to get this same query, but without repeating the ID of the image. So, if an image was processed multiple times, I'll just get the first time it was processed.

So, this is what I am trying:

// query holds the second query

var result = query.AsEnumerable().Distinct(new HistoryComparer());

And, my HistoryComparer looks like this:

public bool Equals(History x, History y)
{
    return x.ProcessedImageId == y.ProcessedImageId && x.TimeStamp != y.TimeStamp;
}

public int GetHashCode(History obj)
{
    return obj.TimeStamp.GetHashCode() ^ obj.ProcessedImageId.GetHashCode();
}

As you can see, I don't care about the date. That's why I am returning true if the dates are different. But, this is not working. I am still getting the same results. What can I do?

Thank you

Upvotes: 1

Views: 3872

Answers (1)

Sergey Kalinichenko

Reputation: 726609

In order for the equality comparer to work correctly, in addition to comparing items for equality per se, it must produce the same hash code for things that it considers identical. Your code has two implementation problems preventing it from getting you the expected results.

The first problem with your implementation is that when the dates are the same, you declare the histories different:

public bool Equals(History x, History y) {
    return x.ProcessedImageId == y.ProcessedImageId && x.TimeStamp != y.TimeStamp;
    //                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
}

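Removing the marked part restores the intended equality logic: two histories are equal when their image IDs match.

However, this is not enough, because the hash code must be fixed as well. GetHashCode must stop using the timestamp in its calculation; otherwise two histories with the same ID but different timestamps will generally produce different hash codes, and Distinct will treat them as distinct: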

public int GetHashCode(History obj) {
    return obj.ProcessedImageId.GetHashCode();
}

At this point the equality comparison boils down to comparing IDs.
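
Putting both fixes together, a minimal sketch of the whole comparer might look like the following. The public fields on History, the null checks, and the HistoryQueries.FirstPerImage helper are assumptions added for illustration; they are not in the question. The sort by TimeStamp is there because the question asks for the first time each image was processed. LINQ to Objects yields the first occurrence of each key in practice, but the documentation describes the result of Distinct as unordered, so treat that as an assumption too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fields are public here so the comparer can read them (assumption);
// the question elides the rest of the class.
public class History
{
    public int ProcessedImageId;
    public string UserId;
    public DateTime TimeStamp;
}

// Treats two History entries as equal when they refer to the same image.
public sealed class HistoryComparer : IEqualityComparer<History>
{
    public bool Equals(History x, History y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ProcessedImageId == y.ProcessedImageId;
    }

    // Built from the same field that Equals compares,
    // so entries that compare equal always share a hash code.
    public int GetHashCode(History obj)
    {
        return obj.ProcessedImageId.GetHashCode();
    }
}

public static class HistoryQueries
{
    // Hypothetical helper: sort by time first so that the entry kept
    // for each image is expected to be its earliest one.
    public static List<History> FirstPerImage(IEnumerable<History> query)
    {
        return query
            .OrderBy(h => h.TimeStamp)
            .Distinct(new HistoryComparer())
            .ToList();
    }
}
```

If relying on the order in which Distinct yields items is a concern, grouping by ProcessedImageId and taking the earliest entry of each group is another way to get the same result.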

Upvotes: 5
