Bruno V
Bruno V

Reputation: 1731

EF Creating business objects in Linq or in foreach

I am measuring differences in query execution and stumbled upon a case I have no explanation for. The query should retrieve 10000 customers with their main address (a customer can have many addresses). We used 2 different methods with Navigation Properties which differ greatly in execution time.

The first method retrieves the customers the way I usually write Linq queries: write the results directly to a business object and calling ToList(). This method takes 25 seconds to execute.

The second method retrieves the customers as a list of EF Entities first. The EF Entities are converted to business objects in a foreach loop. This method takes 2 seconds to execute.

Can someone explain the difference? And is it possible to modify the first method so the execution time is similar to the second?

private List<ICustomer> NavigationProperties_SO(int method)
{
   using (Entities context = new Entities())
   {
      context.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);

      context.Configuration.ProxyCreationEnabled = false;
      context.Configuration.AutoDetectChangesEnabled = false;

      List<ICustomer> customerList = new List<ICustomer>();
      if (method == 1)
      {
         // Execution time: 25 seconds
         customerList = (from c in context.cust
                                           .Include(o => o.AddressList)
                                           .Include(o => o.AddressList.Select(p => p.ADDR))
                             let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()
                             select new Customer
                             {
                                cust = c,
                                mainAddress = mainAddress,
                                addr = mainAddress == null ? null : mainAddress.ADDR
                             }).AsNoTracking().ToList<ICustomer>();
      }
      else if (method == 2)
      {
         // Execution time: 2 seconds
         var tempList = (from c in context.cust
                                           .Include(o => o.AddressList)
                                           .Include(o => o.AddressList.Select(p => p.ADDR))
                         select c).AsNoTracking().ToList();
         foreach (var c in tempList)
         {
            ICustomer customer = new Customer();
            var mainaddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault();
            customer.cust = c;
            customer.mainAddress = mainaddress;
            customer.addr = mainaddress == null ? null : mainaddress.ADDR;
            customerList.Add(customer);
         }
      }
      return customerList;
   }
}

Edit

Here are the (simplified) queries generated by Entity Framework:

Method 1

SELECT 
*
FROM   [DBA].[CUST] AS [Extent1]
OUTER APPLY  (SELECT TOP ( 1 ) 
    *
    FROM [DBA].[CUST_ADDR] AS [Extent2]
    WHERE (([Extent1].[Id] = [Extent2].[Id]) AND (N'1' = [Extent2].[Main_addr])
    ORDER BY 'a' ) AS [Limit1]
LEFT OUTER JOIN [DBA].[ADDR] AS [Extent3] ON [Limit1].[Id] = [Extent3].[Id]

Method 2

SELECT 
*
FROM ( SELECT 
    *
    FROM  [DBA].[CUST] AS [Extent1]
    LEFT OUTER JOIN  (SELECT *
        FROM  [DBA].[CUST_ADDR] AS [Extent2]
        LEFT OUTER JOIN [DBA].[ADDR] AS [Extent3] ON [Extent2].[Id] = [Extent3].[Id] ) AS [Join1] ON ([Extent1].[Id] = [Join1].[Id])
)  AS [Project1]

The difference is that the first method does the filtering in the query (´let´) while the second method retrieves all records and filters in the loop.

Upvotes: 3

Views: 861

Answers (2)

Rapha&#235;l Althaus
Rapha&#235;l Althaus

Reputation: 60503

Answer related to comments... (but two long for a comment)

For the "how to choose the best syntax" part

I would say that it comes partially from "experience" (see, 9Rune5 and I suspected the same point, which was the problematic one before seeing the generated sql) : but experience, sometimes, may also leed to wrong conclusions ;)

So to be a little bit more pragmatic, I would suggest you to use tools/libs which will help you to look at the generated sql / time by query, or page...

ANTS Performance profiler, Miniprofiler, Sql Server profiler, etc, it may depend on your technologies / needs...

By the way, if you want to keep a "linq" syntax, you could go for

var tempList = context.cust
                      .Include(o => o.AddressList)
                      .Include(o => o.AddressList.Select(p => p.ADDR))
                      .AsNoTracking()
                      .ToList();

var result = (from c in tempList
             let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()
             select new Customer
                    {
                         cust = c,
                         mainAddress = mainAddress,
                         addr = mainAddress == null ? null : mainAddress.ADDR
                     }).ToList<ICustomer>();

But not really less verbose than the foreach syntax...

Upvotes: 1

9Rune5
9Rune5

Reputation: 463

I suspect

let mainAddress = c.AddressList.Where(o => o.Main_addr == "1").FirstOrDefault()

is the culprit. Certain queries forces EF to ask for all possible combinations to be returned. EF then spends a little time narrowing down the scope before it provides you with a reasonable result set. You can use SQL Server Profiler to look at the queries generated.

In any case, you can use LINQ, rather than a foreach, at the end of your second method (this won't help performance, but readability might improve):

return tempList.Select(c => new Customer{cust=c, mainAddress = c.AddressList.FirstOrDefault(o=>o.Main_addr=="1"), ...);

Upvotes: 1

Related Questions