Reputation: 1731
I am measuring differences in query execution and stumbled upon a case I have no explanation for. The query should retrieve 10000 customers with their main address (a customer can have many addresses). We used 2 different methods with Navigation Properties which differ greatly in execution time.
The first method retrieves the customers the way I usually write Linq queries: write the results directly to a business object and calling ToList(). This method takes 25 seconds to execute.
The second method retrieves the customers as a list of EF Entities first. The EF Entities are converted to business objects in a foreach loop. This method takes 2 seconds to execute.
Can someone explain the difference? And is it possible to modify the first method so the execution time is similar to the second?
private List<ICustomer> NavigationProperties_SO(int method)
{
using (Entities context = new Entities())
{
context.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);
context.Configuration.ProxyCreationEnabled = false;
context.Configuration.AutoDetectChangesEnabled = false;
List<ICustomer> customerList = new List<ICustomer>();
if (method == 1)
{
// Execution time: 25 seconds
customerList = (from c in context.cust
.Include(o => o.AddressList)
.Include(o => o.AddressList.Select(p => p.ADDR))
let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()
select new Customer
{
cust = c,
mainAddress = mainAddress,
addr = mainAddress == null ? null : mainAddress.ADDR
}).AsNoTracking().ToList<ICustomer>();
}
else if (method == 2)
{
// Execution time: 2 seconds
var tempList = (from c in context.cust
.Include(o => o.AddressList)
.Include(o => o.AddressList.Select(p => p.ADDR))
select c).AsNoTracking().ToList();
foreach (var c in tempList)
{
ICustomer customer = new Customer();
var mainaddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault();
customer.cust = c;
customer.mainAddress = mainaddress;
customer.addr = mainaddress == null ? null : mainaddress.ADDR;
customerList.Add(customer);
}
}
return customerList;
}
}
Edit
Here are the (simplified) queries generated by Entity Framework:
Method 1
SELECT
*
FROM [DBA].[CUST] AS [Extent1]
OUTER APPLY (SELECT TOP ( 1 )
*
FROM [DBA].[CUST_ADDR] AS [Extent2]
WHERE (([Extent1].[Id] = [Extent2].[Id]) AND (N'1' = [Extent2].[Main_addr])
ORDER BY 'a' ) AS [Limit1]
LEFT OUTER JOIN [DBA].[ADDR] AS [Extent3] ON [Limit1].[Id] = [Extent3].[Id]
Method 2
SELECT
*
FROM ( SELECT
*
FROM [DBA].[CUST] AS [Extent1]
LEFT OUTER JOIN (SELECT *
FROM [DBA].[CUST_ADDR] AS [Extent2]
LEFT OUTER JOIN [DBA].[ADDR] AS [Extent3] ON [Extent2].[Id] = [Extent3].[Id] ) AS [Join1] ON ([Extent1].[Id] = [Join1].[Id])
) AS [Project1]
The difference is that the first method does the filtering in the query (´let´) while the second method retrieves all records and filters in the loop.
Upvotes: 3
Views: 861
Reputation: 60503
Answer related to comments... (but two long for a comment)
For the "how to choose the best syntax" part
I would say that it comes partially from "experience" (see, 9Rune5 and I suspected the same point, which was the problematic one before seeing the generated sql) : but experience, sometimes, may also leed to wrong conclusions ;)
So to be a little bit more pragmatic, I would suggest you to use tools/libs which will help you to look at the generated sql / time by query, or page...
ANTS Performance profiler, Miniprofiler, Sql Server profiler, etc, it may depend on your technologies / needs...
By the way, if you want to keep a "linq" syntax, you could go for
var tempList = context.cust
.Include(o => o.AddressList)
.Include(o => o.AddressList.Select(p => p.ADDR))
.AsNoTracking()
.ToList();
var result = (from c in tempList
let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()
select new Customer
{
cust = c,
mainAddress = mainAddress,
addr = mainAddress == null ? null : mainAddress.ADDR
}).ToList<ICustomer>();
But not really less verbose than the foreach syntax...
Upvotes: 1
Reputation: 463
I suspect
let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()
is the culprit. Certain queries forces EF to ask for all possible combinations to be returned. EF then spends a little time narrowing down the scope before it provides you with a reasonable result set. You can use SQL Server Profiler to look at the queries generated.
In any case, you can use LINQ, rather than a foreach, at the end of your second method (this won't help performance, but readability might improve):
return tempList.Select(c => new Customer{cust=c, mainAddress = c.AddressList.FirstOrDefault(o=>o.Main_addr=="1"), ...);
Upvotes: 1