Robert

Reputation: 1726

Recursive linq results returning duplicates

This question builds on one I asked last week: Recursive linq to get infinite children. The answer given in that post produced what I needed: a distinct list of Locations and their children based on a parent. We needed to use our own model for Locations, so we created one, and since then I've been getting duplicate results. Our model is very basic:

class LocationModel
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }
}

If you compare it to the entity created by EF, I just cut out all the fields we don't need or use (see the link above). I then modified my LINQ statements to use this new model instead:

DBEntities db = new DBEntities();

public IEnumerable<LocationModel> GetAllChildLocations(int parentId)
{
    var locations = (from l in db.Locations
                        where l.ParentLocationID == parentId ||
                        l.LocationID == parentId
                        select new LocationModel()
                        {
                            LocationID = l.LocationID,
                            ParentLocationID = l.ParentLocationID,
                            LocationName = l.LocationName
                        }).ToList();

    var children = locations.AsEnumerable()
                            .Union(db.Locations.AsEnumerable()
                            .Where(x => x.ParentLocationID == parentId)
                            .SelectMany(y => GetAllChildLocations(y.LocationID)))
                            .ToList();
    return children.OrderBy(l => l.LocationName);
}

When I run it, either in Visual Studio or in LINQPad, I now get duplicates. Here's the original code, which does not produce duplicates:

public IEnumerable<Location> GetAllChildLocations(int parentId)
{
    var locations = (from l in db.Locations
                        where l.ParentLocationID == parentId ||
                        l.LocationID == parentId
                        select l).ToList();

    var child = locations.AsEnumerable()
                        .Union(db.Locations.AsEnumerable()
                        .Where(x => x.ParentLocationID == parentId)
                        .SelectMany(y => GetAllChildLocations(y.LocationID)))
                        .ToList();
    return child;
}

Why is it producing duplicates when I use my own model vs. the generated one from EF? Does it have to do with the auto-generated fields that the EF model has and mine doesn't?

Upvotes: 0

Views: 95

Answers (1)

Ivan Stoev

Reputation: 205859

Why is it producing duplicates when I use my own model vs. the generated one from EF?

Because you are using the Enumerable.Union method, which relies on the default equality comparer; for a class that does not override Equals and GetHashCode, that means reference equality. The EF DbContext change tracker keeps track of the entity instances it has already loaded and hands back the same instance for the same PK (even if you retrieve it via separate database queries), so reference equality works for entities. The same cannot be said for the new LocationModel instances created by the select clause of your query: each one is a distinct object.

One way to resolve it is to implement GetHashCode and Equals in your LocationModel class. In general, though, I don't like this implementation of recursive children retrieval or its use of Union. There must be a better way, but that is outside the scope of this question (it belongs to the linked one).
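The Equals/GetHashCode approach mentioned above could look roughly like the following. This is a minimal sketch, assuming LocationID uniquely identifies a location (as a primary key would); the UnionDemo class is hypothetical and only there to show the effect on Union.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: value equality keyed on LocationID only.
// Assumption: LocationID is unique per location, like a primary key.
public class LocationModel : IEquatable<LocationModel>
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }

    // Two models are considered equal when they share the same ID.
    public bool Equals(LocationModel other) =>
        other != null && LocationID == other.LocationID;

    public override bool Equals(object obj) => Equals(obj as LocationModel);

    // Must agree with Equals: equal objects return equal hash codes.
    public override int GetHashCode() => LocationID.GetHashCode();
}

// Hypothetical helper used only to illustrate the behavior.
public static class UnionDemo
{
    public static int CountAfterUnion()
    {
        // Two separate instances describing the same location,
        // as two separate projections from a query would produce.
        var first = new[] { new LocationModel { LocationID = 1, LocationName = "Root" } };
        var second = new[] { new LocationModel { LocationID = 1, LocationName = "Root" } };

        // Union now treats them as the same element.
        return first.Union(second).Count();
    }
}
```

Without the overrides, the same Union would keep both instances. Because the hash code depends on a settable property, LocationID should not be changed while an instance sits in a hash-based collection.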

The root of the evil for me is the following condition:

where l.ParentLocationID == parentId || l.LocationID == parentId

which selects both the item and its children. That produces duplicates in the result set, which the Union method is then supposed to eliminate. A good implementation would not generate duplicates in the first place.
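To illustrate the point about not generating duplicates, here is one possible shape for the recursion. It is a sketch under assumptions, not the approach from the linked question: the LocationTree class and its method names are hypothetical, the locations are assumed to be already loaded into memory (in place of db.Locations), and the parent/child data is assumed to contain no cycles. The parent is emitted once, and each recursive step emits only children, so nothing needs to be de-duplicated afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the model in the question; no equality overrides needed here.
public class LocationModel
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }
}

// Hypothetical helper; the names are illustrative only.
public static class LocationTree
{
    // Returns the location with the given id followed by all of its descendants,
    // each exactly once. Assumes the data contains no cycles.
    public static IEnumerable<LocationModel> GetLocationWithDescendants(
        IEnumerable<LocationModel> all, int id)
    {
        var list = all.ToList();
        // Group once by parent so each level is a cheap lookup.
        var byParent = list.ToLookup(l => l.ParentLocationID);
        return list.Where(l => l.LocationID == id)
                   .Concat(Walk(byParent, id));
    }

    // Depth-first walk that yields children only, never the parent itself.
    private static IEnumerable<LocationModel> Walk(
        ILookup<int, LocationModel> byParent, int parentId)
    {
        foreach (var child in byParent[parentId])
        {
            yield return child;
            foreach (var descendant in Walk(byParent, child.LocationID))
                yield return descendant;
        }
    }
}
```

Since the item itself is selected only at the top level and each recursive call selects only children, no Union or custom equality is required. Sorting by LocationName, as in the question, can be applied to the final result.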

Upvotes: 3
