Mike Christensen
Mike Christensen

Reputation: 91598

LINQ query to select rows matching an array of pairs

Right now, I have a class called TrainingPlan that looks like this:

public class TrainingPlan
{
   public int WorkgroupId { get; set; }
   public int AreaId { get; set; }
}

I'm given an array of these instances, and need to load the matching training plans from the database. The WorkgroupId and AreaId basically form a compound key. What I'm doing now is looping through each TrainingPlan like so:

foreach (TrainingPlan plan in plans)
   LoadPlan(pid, plan.AreaId, plan.WorkgroupId);

Then, LoadPlan has a LINQ query to load the individual plan:

var q = from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
        where tp.PROJECTID == pid && tp.AREAID == areaid &&
              tp.WORKGROUPID == workgroupid
              select tp;

return q.FirstOrDefault();

The Problem:

This works, however it's very slow for a large array of plans. I believe this could be much faster if I could perform a single LINQ query to load in every TPM_TRAININGPLAN at once.

My Question:

Given an array of TrainingPlan objects, how can I load every matching WorkgroupId/AreaId combination at once? This query should translate into similar SQL syntax:

SELECT * FROM TPM_TRAININGPLANS
WHERE (AREAID, WORKGROUPID) IN ((1, 2), (3, 4), (5, 6), (7, 8));

Upvotes: 4

Views: 2106

Answers (2)

Chris
Chris

Reputation: 707

It seems to me that using Intersect() may get this done the way that you want. But, I don't have an environment set up to test this myself.

var q = (from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
        where pid == tp.PROJECTID
        select tp)
        .Intersect
        (from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
        where plans.Any(p => p.AreaID == tp.AREAID)
        select tp)
        .Intersect
        (from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
        where plans.Any(p => p.WorkgroupId == tp.WORKGROUPID)
        select tp);

My only concern might be that Intersect could cause it to load more records in memory than you would want, but I'm unable to test to confirm if that's the case.

Upvotes: 0

P.Brian.Mackey
P.Brian.Mackey

Reputation: 44275

I've used Contains to run a bulk filter similar to where-in. I setup a rough approximation of your scenario. The single select queries actually ran quicker than Contains did. I recommend running a similar test on your end with the DB tied in to see how your results wind up. Ideally see how it scales too. I'm running .NET 4.0 in visual studio 2012. I jammed in ToList() calls to push past potential lazy loading problems.

public class TrainingPlan
{
    public int WorkgroupId { get; set; }
    public int AreaId { get; set; }

    public TrainingPlan(int workGroupId, int areaId)
    {
        WorkgroupId = workGroupId;
        AreaId = areaId;
    }    
}

public class TrainingPlanComparer : IEqualityComparer<TrainingPlan>
{
    public bool Equals(TrainingPlan x, TrainingPlan y)
    {
        //Check whether the compared objects reference the same data. 
        if (x.WorkgroupId == y.WorkgroupId && x.AreaId == y.AreaId) 
            return true;

        return false;                        
    }

    public int GetHashCode(TrainingPlan trainingPlan)
    {            
        if (ReferenceEquals(trainingPlan, null)) 
            return 0;

        int wgHash = trainingPlan.WorkgroupId.GetHashCode();
        int aHash = trainingPlan.AreaId.GetHashCode();

        return wgHash ^ aHash;
    }
}


internal class Class1
{
    private static void Main()
    {
        var plans = new List<TrainingPlan>
            {
                new TrainingPlan(1, 2),
                new TrainingPlan(1, 3),
                new TrainingPlan(2, 1),
                new TrainingPlan(2, 2)
            };

        var filter = new List<TrainingPlan>
            {
                new TrainingPlan(1, 2),
                new TrainingPlan(1, 3),
            };

        Stopwatch resultTimer1 = new Stopwatch();
        resultTimer1.Start();
        var results = plans.Where(plan => filter.Contains(plan, new TrainingPlanComparer())).ToList();
        resultTimer1.Stop();

        Console.WriteLine("Elapsed Time for filtered result {0}", resultTimer1.Elapsed);

        Console.WriteLine("Result count: {0}",results.Count());

        foreach (var item in results)
        {
            Console.WriteLine("WorkGroup: {0}, Area: {1}",item.WorkgroupId, item.AreaId);
        }

        resultTimer1.Reset();

        resultTimer1.Start();
        var result1 = plans.Where(p => p.AreaId == filter[0].AreaId && p.WorkgroupId == filter[0].WorkgroupId).ToList();
        var result2 = plans.Where(p => p.AreaId == filter[1].AreaId && p.WorkgroupId == filter[1].WorkgroupId).ToList();
        resultTimer1.Stop();

        Console.WriteLine("Elapsed time for single query result: {0}",resultTimer1.Elapsed);//single query is faster

        Console.ReadLine();
    }
}

Upvotes: 1

Related Questions