How to make left join, group by and select with the single query?

Question

Assume I have the following data:

var workers = new[]
{
    new {  Name = "John",  Id = 1 },
    new {  Name = "Greg",  Id = 2 }, 
    new {  Name = "Jack",  Id = 3 }, 
    new {  Name = "Josh",  Id = 4 },
    new {  Name = "Jill",  Id = 5 },
    new {  Name = "Jane",  Id = 6 }
};

var contracts = new[]
{
    new {  ContractNumber="1",  WorkerId=1, ContractDate = new DateTime(2017,6,30) },
    new {  ContractNumber="2",  WorkerId=2, ContractDate = new DateTime(2017,7,10) },
    new {  ContractNumber="3",  WorkerId=2, ContractDate = new DateTime(2017,7,15) },
    new {  ContractNumber="4",  WorkerId=5, ContractDate = new DateTime(2017,7,20) },
    new {  ContractNumber="5",  WorkerId=1, ContractDate = new DateTime(2017,7,25) }
};

What I need to do is to select the first worker who has the minimum quantity of contracts where contract date greater or equals to:

var fromDate = new DateTime(2017, 7, 1);

excluding the workers with the following Id:

int[] exceptWorkerIds = new int[] {1, 4};

If several workers have a similar minimum quantity of contracts then select the worker with the first name in alphabetical order.

I resolved this task the following way.

Firstly, for each worker left join contracts. If contract exists my helper property ContractExists = 1, if not then 0.

var query = 
from w in workers.Where(x => !exceptWorkerIds.Contains(x.Id))
join c in contracts.Where(x => x.ContractDate >= fromDate) 
    on w.Id equals c.WorkerId into workerContracts
from wc in workerContracts.DefaultIfEmpty()
select new {WorkerId = w.Id, WorkerName = w.Name, ContractExists = wc == null ? 0: 1};

This query gives me the following result:

Secondly, I group the obtained results by WorkerId, WorkerName getting the sum of contracts and order data by sum and worker name:

var result = 
(from q in query
group q.ContractExists by new {q.WorkerId, q.WorkerName} into g
orderby g.Sum(), g.Key.WorkerName
select new 
{
    WorkerId = g.Key.WorkerId, 
    WorkerName = g.Key.WorkerName, 
    WorkerContractsCount = g.Sum()
}).ToList().Take(1);

Take(1) gives me the top 1 of resulted data:

The question: Is there a way to do it with the only query or any simpler or elegant manner then I did? If yes, does this help to boost productivity of query execution?

Ivan Stoev · Accepted Answer

Rather than doing join (which multiplies the data) followed by group by you could use group join (what actually your query is using before you do from wc in workerContracts.DefaultIfEmpty()).

The other logic is pretty much the same - workerContracts.Count() gives you the desired quantity of contracts, so apply the desired order, take the first and you are done:

var result =
    (from w in workers.Where(x => !exceptWorkerIds.Contains(x.Id))
     join c in contracts.Where(x => x.ContractDate >= fromDate)
         on w.Id equals c.WorkerId into workerContracts
     let workerContractsCount = workerContracts.Count()
     orderby workerContractsCount, w.Name
     select new
     {
         WorkerId = w.Id,
         WorkerName = w.Name,
         WorkerContractsCount = workerContractsCount
     })
    .FirstOrDefault();

How to make left join, group by and select with the single query?

Answers (2)

Related Questions