Javier Hertfelder

Reputation: 2432

Distinct in Entity framework

I have a List of objects, some of which share the same Id, and I would like to remove the duplicated elements.

I tried with something like this:

List<Post> posts = postsFromDatabase.Distinct().ToList();

But it doesn't work!

So I wrote this method in order to avoid the duplicates:

public List<Post> PostWithOutDuplicates(List<Post> posts)
{
    List<Post> postWithOutInclude = new List<Post>();
    var noDupes = posts.Select(x => x.Id).Distinct();
    if (noDupes.Count() < posts.Count)
    {
        foreach (int idPost in noDupes)
        {
            postWithOutInclude.Add(posts.Where(x => x.Id == idPost).First());
        }
        return postWithOutInclude;
    }
    else
    {
        return posts;
    }
}

Any ideas on how to improve the performance?

Thanks in advance.

Upvotes: 8

Views: 23400

Answers (3)

Jonathan

Reputation: 12015

I think writing your own custom comparer is a good approach.

Here is an article on MSDN that explains the topic very well: http://support.microsoft.com/kb/320727

The reason Distinct() is not working is that it has no idea how to determine whether two of your objects are equal, so it falls back to comparing references to decide whether they are the same "object". It's working as it is supposed to: the instances returned by the query are all different objects, even when they carry the same Id.

By writing your own comparer (it's easy) you can tell Distinct() how to compare the objects to determine whether they are equal.

Edit: If not using Distinct isn't a problem and the situation isn't frequent, the first approach in Piotr Justyna's answer is simple and effective.
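To make the reference comparison concrete, here is a minimal sketch. It assumes an in-memory sequence (LINQ to Objects) and a hypothetical Post class with only an Id property that does not override Equals or GetHashCode; the real entity is not shown in the question, and the DistinctDemo and CountDistinct names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal entity; the real Post class is not shown in the question.
public class Post
{
    public int Id { get; set; }
}

public static class DistinctDemo
{
    // Counts what Distinct() keeps when no comparer is supplied.
    public static int CountDistinct(IEnumerable<Post> posts)
    {
        return posts.Distinct().Count();
    }

    public static void Main()
    {
        var a = new Post { Id = 1 };
        var b = new Post { Id = 1 }; // same Id, different instance

        // Two separate instances: both survive, because the references differ.
        Console.WriteLine(CountDistinct(new[] { a, b })); // 2

        // The same instance twice: collapsed to a single element.
        Console.WriteLine(CountDistinct(new[] { a, a })); // 1
    }
}
```

So Distinct() only removes an element here when it is literally the same instance, which is usually not the case for rows materialized separately from a query.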

Upvotes: 5

Piotr Justyna

Reputation: 4996

This is nice and easy:

List<Post> posts = postsFromDatabase
    .GroupBy(x => x.Id)
    .Select(x => x.FirstOrDefault())
    .ToList();

But if you want to write it the proper way, I'd advise you to write it like this:

public class PostComparer : IEqualityComparer<Post>
{
    #region IEqualityComparer<Post> Members

    public bool Equals(Post x, Post y)
    {
        return x.Id.Equals(y.Id);
    }

    public int GetHashCode(Post obj)
    {
        return obj.Id.GetHashCode();
    }

    #endregion
}

This gives you more freedom when it comes to additional comparisons. Having written this class, you can use it like this:

List<Post> posts = postsFromDatabase.Distinct(new PostComparer()).ToList();
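For anyone who wants to try both approaches side by side, here is a rough, self-contained sketch. It assumes postsFromDatabase is already an in-memory list and uses a hypothetical Post class with only an Id; the DedupDemo, ByGroup and ByComparer names are made up for this example. On an in-memory list both variants are hash-based and make a single pass, which should address the performance concern with the nested Where loop in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; only Id matters here.
public class Post
{
    public int Id { get; set; }
}

// Same comparer as above, repeated so the sketch compiles on its own.
public class PostComparer : IEqualityComparer<Post>
{
    public bool Equals(Post x, Post y) { return x.Id.Equals(y.Id); }
    public int GetHashCode(Post obj) { return obj.Id.GetHashCode(); }
}

public static class DedupDemo
{
    // Keeps the first post of each Id group.
    public static List<Post> ByGroup(IEnumerable<Post> posts)
    {
        return posts.GroupBy(p => p.Id).Select(g => g.First()).ToList();
    }

    // Lets Distinct() compare posts by Id through the custom comparer.
    public static List<Post> ByComparer(IEnumerable<Post> posts)
    {
        return posts.Distinct(new PostComparer()).ToList();
    }

    public static void Main()
    {
        // Sample data standing in for the query result: Ids 1 and 2 repeat.
        var postsFromDatabase = new List<Post>
        {
            new Post { Id = 1 }, new Post { Id = 2 }, new Post { Id = 1 },
            new Post { Id = 3 }, new Post { Id = 2 }
        };

        Console.WriteLine(ByGroup(postsFromDatabase).Count);    // 3
        Console.WriteLine(ByComparer(postsFromDatabase).Count); // 3
    }
}
```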

Upvotes: 32

Brian

Reputation: 2239

Instead of .First(), try .FirstOrDefault().

Upvotes: 0
