kh25

Reputation: 1298

Linq multiple where queries

I have an issue building a fairly hefty LINQ query. Basically, I need to execute a subquery in a loop to filter down the number of matches returned from the database. Example code is in the loop below:

    foreach (Guid parent in parentAttributes)
    {
        var subQuery = from sc in db.tSearchIndexes
                       join a in db.tAttributes on sc.AttributeGUID equals a.GUID
                       join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
                       where a.RelatedGUID == parent && userId == pc.CPSGUID
                       select sc.CPSGUID;

        query = query.Where(x => subQuery.Contains(x.Id));
    }

When I subsequently call ToList() on the query variable, it appears that only one of the subqueries has been applied, and I'm left with a bucketful of data I don't require. However, this approach works:

    IList<Guid> temp = query.Select(x => x.Id).ToList();

    foreach (Guid parent in parentAttributes)
    {
        var subQuery = from sc in db.tSearchIndexes
                       join a in db.tAttributes on sc.AttributeGUID equals a.GUID
                       join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
                       where a.RelatedGUID == parent && userId == pc.CPSGUID
                       select sc.CPSGUID;

        temp = temp.Intersect(subQuery).ToList();
    }

    query = query.Where(x => temp.Contains(x.Id));

Unfortunately, this approach is nasty because it results in multiple queries to the remote database, whereas the initial approach, if I could get it working, would result in only a single hit. Any ideas?

Upvotes: 10

Views: 11314

Answers (2)

driis

Reputation: 164281

I think you are hitting a special case of capturing the loop variable in the lambda expression used to filter, also known as an "access to modified closure" error.

Try this:

    foreach (Guid parentLoop in parentAttributes)
    {
        var parent = parentLoop;
        var subQuery = from sc in db.tSearchIndexes
                       join a in db.tAttributes on sc.AttributeGUID equals a.GUID
                       join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
                       where a.RelatedGUID == parent && userId == pc.CPSGUID
                       select sc.CPSGUID;

        query = query.Where(x => subQuery.Contains(x.Id));
    }

The problem is capturing the parent variable in the closure (that the LINQ syntax is converted to), which causes all the subqueries to be run with the same parent id.

What happens is that the compiler generates a class to hold the delegate and the local variables the delegate accesses. The compiler reuses the same instance of that class for each iteration of the loop; therefore, once the query executes, all of the Where filters execute with the same parent Guid, namely the last one assigned.

Declaring the parent inside the loop scope causes the compiler to essentially make a copy of the variable, with the correct value, to be captured.
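To see the effect in isolation, here is a minimal sketch that uses LINQ to Objects instead of the database. The class name ClosureDemo, the divisor filters, and the number range are all made up for illustration; only Where, ToList, and deferred execution carry over from the code above. The first method declares the captured variable outside the loop on purpose, which is how a foreach variable behaved in compilers before C# 5.0 (from C# 5.0 onward the foreach variable is scoped per iteration, so the original loop may behave differently there).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Hypothetical stand-in for the chained filters:
    // keep only the numbers divisible by every divisor.
    static readonly int[] Divisors = { 2, 3, 5 };

    // One variable shared by all lambdas: every deferred filter
    // sees the value it holds when the query finally runs (5).
    public static List<int> SharedVariable()
    {
        IEnumerable<int> query = Enumerable.Range(1, 30);
        int divisor = 0; // declared once, outside the loop
        for (int i = 0; i < Divisors.Length; i++)
        {
            divisor = Divisors[i];
            query = query.Where(x => x % divisor == 0);
        }
        return query.ToList(); // the filters execute only here, after the loop
    }

    // A fresh variable per iteration: each lambda captures its own copy.
    public static List<int> FreshVariable()
    {
        IEnumerable<int> query = Enumerable.Range(1, 30);
        for (int i = 0; i < Divisors.Length; i++)
        {
            int divisor = Divisors[i]; // new variable on every iteration
            query = query.Where(x => x % divisor == 0);
        }
        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable())); // 5,10,15,20,25,30
        Console.WriteLine(string.Join(",", FreshVariable()));  // 30
    }
}
```

In the shared case all three filters test divisibility by 5, so the result is far larger than intended, much like the "bucketful of data" in the question. In the fresh case the filters test 2, 3, and 5 respectively, and only 30 survives.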

This can be a bit hard to grasp at first, so if this is the first time it has hit you, I'd recommend these two articles for background and a thorough explanation:

Upvotes: 8

k06a

Reputation: 18745

Maybe this way?

var subQuery = from sc in db.tSearchIndexes
               join a in db.tAttributes on sc.AttributeGUID equals a.GUID
               join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
               where parentAttributes.Contains(a.RelatedGUID) && userId == pc.CPSGUID                             
               select sc.CPSGUID;

Upvotes: 0
