Reputation: 457
I am working on a commission tracking system. Business case requires many-to-many relationships between sales staffers and customers. The (simplified) entities are:
public class Customer {
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<CustomerSeller> CustomerSellers { get; set; }
public virtual ICollection<Payment> Payments { get; set; }
}
public class Seller {
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<CustomerSeller> CustomerSellers { get; set; }
}
// join table w/payload for many-to-many relationship
public class CustomerSeller {
public int Id { get; set; }
public Seller Seller { get; set; }
public Customer Customer { get; set; }
public decimal CommissionRate { get; set; }
}
public class Payment {
public int Id { get; set; }
public Customer ReceivedFrom { get; set; }
public DateTime Date { get; set; }
public decimal Amount { get; set; }
}
My goal for the moment is to get a list of all payments to customers that are linked to a given sales person. If I was writing straight SQL it would look something like this:
select Payment.*
from Payment
inner join CustomerSeller on CustomerSeller.Customer_Id = Payment.ReceivedFrom_Id
where CustomerSeller.Seller_Id = @sellerIdToQuery
I'm trying this with linq/EF on my ASP NET MVC site with this code:
public ActionResult Sales(int id) {
var qry = (
from p in db.Payments
join cs in db.CustomerSellers on p.ReceivedFrom equals cs.Customer
where cs.Seller.Id == id
select p);
var paymentList = qry.ToList();
return View(paymentList);
}
This DOES work, but the behind-the-scenes SQL looks waaay complicated (it's beyond my SQL-decoding abilities to analyze):
{SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Date] AS [Date],
[Extent1].[Amount] AS [Amount],
[Extent1].[ReceivedFrom_Id] AS [ReceivedFrom_Id]
FROM [dbo].[Payment] AS [Extent1]
INNER JOIN [dbo].[CustomerSeller] AS [Extent2] ON EXISTS (SELECT
1 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
LEFT OUTER JOIN (SELECT
[Extent3].[Id] AS [Id]
FROM [dbo].[Customer] AS [Extent3]
WHERE [Extent1].[ReceivedFrom_Id] = [Extent3].[Id] ) AS [Project1] ON 1 = 1
LEFT OUTER JOIN (SELECT
[Extent4].[Id] AS [Id]
FROM [dbo].[Customer] AS [Extent4]
WHERE [Extent2].[Customer_Id] = [Extent4].[Id] ) AS [Project2] ON 1 = 1
LEFT OUTER JOIN (SELECT
[Extent5].[Id] AS [Id]
FROM [dbo].[Customer] AS [Extent5]
WHERE [Extent1].[ReceivedFrom_Id] = [Extent5].[Id] ) AS [Project3] ON 1 = 1
LEFT OUTER JOIN (SELECT
[Extent6].[Id] AS [Id]
FROM [dbo].[Customer] AS [Extent6]
WHERE [Extent2].[Customer_Id] = [Extent6].[Id] ) AS [Project4] ON 1 = 1
WHERE ([Project1].[Id] = [Project2].[Id]) OR (([Project3].[Id] IS NULL) AND ([Project4].[Id] IS NULL))
)
WHERE [Extent2].[Seller_Id] = @p__linq__0}
I'm particularly interested if my linq query (I'm new) is badly formed... how should it be written? Or is there a problem with my entities' structure? Or is the SQL just fine, and going to run efficiently on large amounts of data in a production environment?
PROGRESS:
First, Slauma suggested two new queries. In experimenting with them I noticed the SQL was putting extra effort into handling NULLs on the join table CustomerSeller for Customer and Seller fields. This was an error on my part since these fields should be non-nullable. I set them to [Required] in the Entity:
public class CustomerSeller {
public int Id { get; set; }
[Required]
public Seller Seller { get; set; }
[Required]
public Customer Customer { get; set; }
public decimal CommissionRate { get; set; }
}
The first query Slauma wrote:
var qry = from p in db.Payments
where p.ReceivedFrom.CustomerSellers.Any(cs => cs.Seller.Id == id)
select p;
Has vastly superior SQL generated:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Date] AS [Date],
[Extent1].[Amount] AS [Amount],
[Extent1].[ReceivedFrom_Id] AS [ReceivedFrom_Id]
FROM [dbo].[Payment] AS [Extent1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[CustomerSeller] AS [Extent2]
WHERE ([Extent1].[ReceivedFrom_Id] = [Extent2].[Customer_Id]) AND ([Extent2].[Seller_Id] = @p__linq__0)
)
The next tweak Slauma recommended was to join my original query on the Id fields rather than the whole Entities:
var qry = from p in db.Payments
//old: join cs in db.CustomerSellers on p.ReceivedFrom equals cs.Customer
//new:
join cs in db.CustomerSellers on p.ReceivedFrom.Id equals cs.Customer.Id
where cs.Seller.Id == id
select p;
This generates a very high quality SQL statement:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Date] AS [Date],
[Extent1].[Amount] AS [Amount],
[Extent1].[ReceivedFrom_Id] AS [ReceivedFrom_Id]
FROM [dbo].[Payment] AS [Extent1]
INNER JOIN [dbo].[CustomerSeller] AS [Extent2] ON [Extent1].[ReceivedFrom_Id] = [Extent2].[Customer_Id]
WHERE [Extent2].[Seller_Id] = @p__linq__0
Essentially what I would have written directly in SQL myself.
Takeaways:
(1) Join on keys rather than full entities.
(2) If the join table can't have nulls for the many-to-many mapping fields, mark them [Required] which simplifies the queries a bit.
(3) Check the behind-the-scenes SQL for your linq queries, especially when frequently used, or when they may touch a lot of data. There may be monsters hiding.
(4) Slauma is a gentleman and a scholar. :-)
Upvotes: 2
Views: 613
Reputation: 177133
I would write the query this way:
var qry = from p in db.Payments
where p.ReceivedFrom.CustomerSellers.Any(cs => cs.Seller.Id == id)
select p;
var paymentList = qry.ToList();
Or completely with extension methods:
var qry = db.Payments
.Where(p => p.ReceivedFrom.CustomerSellers.Any(cs => cs.Seller.Id == id));
var paymentList = qry.ToList();
I don't know if or how much the SQL will be different from your query.
Edit
Alternative:
var qry = db.CustomerSellers
.Where(cs => cs.Seller.Id == id)
.SelectMany(cs => cs.Customer.Payments);
var paymentList = qry.ToList();
If Seller
and Customer
in CustomerSellers
have a composite unique constraint the resulting Payment
s should have no duplicates. Otherwise you need to append Distinct()
after SelectMany
.
Upvotes: 1