Sameh

Reputation: 1020

Linq to SQL performance with grouping

My question is about LINQ to SQL performance. I have a SQL query that I converted to LINQ to SQL:

SQL query:

SELECT CONVERT(VARCHAR(10), ClockIn, 103) AS ClockDate, MIN(ClockIn) AS ClockIn, MAX(ClockOut) AS ClockOut, SUM(DATEDIFF(MINUTE, ClockIn, ClockOut)) AS [TotalTime]
FROM TimeLog
WHERE (EmployeeId = 10)
GROUP BY CONVERT(VARCHAR(10), ClockIn, 103)
ORDER BY ClockIn DESC

LINQ query:

From u In objDC.TimeLogs
Where u.EmployeeId = 10
Group By Key = New With {u.ClockIn.Year, u.ClockIn.Month, u.ClockIn.Day} Into G = Group
Order By G.First.ClockIn Descending
Select New With {.ClockDate = Key.Day & "/" & Key.Month & "/" & Key.Year,
 .ClockIn = G.Min(Function(p) p.ClockIn),
 .ClockOut = G.Max(Function(p) p.ClockOut),
 .TotalTime = G.Sum(Function(p) SqlMethods.DateDiffMinute(p.ClockIn, p.ClockOut))}

The query generated from the LINQ, as captured in SQL Profiler, was:

SELECT [t4].[value] AS [ClockDate], [t4].[value2] AS [ClockIn2], [t4].[value22] AS [ClockOut], [t4].[value3] AS [TotalTime]
 FROM (
 SELECT ((((CONVERT(NVarChar,[t3].[value32])) + '/') + (CONVERT(NVarChar,[t3].[value222]))) + '/') + (CONVERT(NVarChar,[t3].[value22])) AS [value], [t3].[value] AS [value2], [t3].[value2] AS [value22], [t3].[value3], [t3].[value22] AS [value222], [t3].[value222] AS [value2222], [t3].[value32]
 FROM (
 SELECT MIN([t2].[ClockIn]) AS [value], MAX([t2].[ClockOut]) AS [value2], SUM([t2].[value]) AS [value3], [t2].[value2] AS [value22], [t2].[value22] AS [value222], [t2].[value3] AS [value32]
 FROM (
 SELECT DATEDIFF(Minute, [t1].[ClockIn], [t1].[ClockOut]) AS [value], [t1].[EmployeeId], [t1].[value] AS [value2], [t1].[value2] AS [value22], [t1].[value3], [t1].[ClockIn], [t1].[ClockOut]
 FROM (
 SELECT DATEPART(Year, [t0].[ClockIn]) AS [value], DATEPART(Month, [t0].[ClockIn]) AS [value2], DATEPART(Day, [t0].[ClockIn]) AS [value3], [t0].[ClockIn], [t0].[ClockOut], [t0].[EmployeeId]
 FROM [dbo].[TimeLog] AS [t0]
 ) AS [t1]
 ) AS [t2]
 WHERE [t2].[EmployeeId] = 10
 GROUP BY [t2].[value2], [t2].[value22], [t2].[value3]
 ) AS [t3]
 ) AS [t4]
 ORDER BY (
 SELECT [t6].[ClockIn]
 FROM (
 SELECT TOP (1) [t5].[ClockIn]
 FROM [dbo].[TimeLog] AS [t5]
 WHERE ((([t4].[value222] IS NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NULL)) OR (([t4].[value222] IS NOT NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value222] IS NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NULL)) OR (([t4].[value222] IS NOT NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value222] = DATEPART(Year, [t5].[ClockIn])))))) AND ((([t4].[value2222] IS NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NULL)) OR (([t4].[value2222] IS NOT NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value2222] IS NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NULL)) OR (([t4].[value2222] IS NOT NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value2222] = DATEPART(Month, [t5].[ClockIn])))))) AND ((([t4].[value32] IS NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NULL)) OR (([t4].[value32] IS NOT NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value32] IS NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NULL)) OR (([t4].[value32] IS NOT NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value32] = DATEPART(Day, [t5].[ClockIn])))))) AND ([t5].[EmployeeId] = 10)
 ) AS [t6]
 ) DESC

The LINQ to SQL query was too slow. When I compared the execution plans, the relative cost was 7% for the hand-written SQL query and 97% for the LINQ-generated query.

What's wrong with my LINQ to SQL query? Or is this a LINQ performance limitation?

Upvotes: 3

Views: 3022

Answers (2)

Sameh

Reputation: 1020

Here is the LINQ query again, revised according to Guillaume's recommendation.

Many thanks, Guillaume, it solved the problem. I agree that the problem was related to G.First.

I changed my LINQ query according to your answer to:

From u In objDC.TimeLogs
Where u.EmployeeId = 10
Group By key = New With {u.ClockIn.Date} Into G = Group
Order By key.Date Descending
Select New With {
    .ClockDate = key.Date,
    .ClockIn = G.Min(Function(p) p.ClockIn),
    .ClockOut = G.Max(Function(p) p.ClockOut),
    .TotalTime = G.Sum(Function(p) SqlMethods.DateDiffMinute(p.ClockIn, p.ClockOut)) / 60}

I got the same result, but the query was much faster: the profiler gave me 55% for the hand-written query and 45% for the newly generated query, so it was even faster than the original SQL query.
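
For anyone wondering why the results are the same: grouping by the composite {Year, Month, Day} key and grouping by ClockIn.Date should put the same rows into the same groups. Here is a rough in-memory sketch in C# (LINQ to Objects, not LINQ to SQL) that checks this. The GroupKeyCheck class and the sample timestamps are made up for illustration and are not from my project.

```csharp
using System;
using System.Linq;

public static class GroupKeyCheck
{
    // Returns true when grouping by (Year, Month, Day) and grouping by .Date
    // produce the same days with the same number of rows in each.
    public static bool SameGroups(DateTime[] stamps)
    {
        var byParts = stamps
            .GroupBy(s => new { s.Year, s.Month, s.Day })
            .Select(g => (Day: new DateTime(g.Key.Year, g.Key.Month, g.Key.Day), Count: g.Count()))
            .OrderBy(x => x.Day);

        var byDate = stamps
            .GroupBy(s => s.Date)
            .Select(g => (Day: g.Key, Count: g.Count()))
            .OrderBy(x => x.Day);

        return byParts.SequenceEqual(byDate);
    }

    public static void Main()
    {
        // Hypothetical sample data: two rows on one day, one row on the next.
        var stamps = new[]
        {
            new DateTime(2012, 3, 1, 9, 0, 0),
            new DateTime(2012, 3, 1, 13, 0, 0),
            new DateTime(2012, 3, 2, 8, 30, 0),
        };
        Console.WriteLine(SameGroups(stamps));
    }
}
```

This only shows that the two keys partition the data the same way in memory; it says nothing about the SQL each one generates.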

Many thanks for your help.

Upvotes: 0

Guillaume86

Reputation: 14400

I think the problem is that your Order By G.First clause accesses the rows of each group, which triggers N+1 behavior in LINQ to SQL. Can you try something like:

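var query = objDC.TimeLogs
            .Where(c => c.EmployeeId == 10)
            // Group on the date part only, so the key is a single expression.
            .GroupBy(c => c.ClockIn.Date)
            // Order by the group key (descending, as in your SQL), not by a row inside the group.
            .OrderByDescending(g => g.Key)
            .Select(g => new
            {
                Date = g.Key,
                ClockIn = g.Min(c => c.ClockIn),
                ClockOut = g.Max(c => c.ClockOut),
            })
            .Select(g => new
            {
                g.Date,
                g.ClockIn,
                g.ClockOut,
                // Span from first clock-in to last clock-out; this is not the per-row SUM(DATEDIFF(...)) from your SQL.
                TotalTime = g.ClockOut - g.ClockIn
            });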
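
If you want to sanity-check the shape of the query without a database, here is a minimal in-memory sketch using LINQ to Objects. It is not LINQ to SQL and generates no SQL. The TimeLog and DaySummary records, the TimeLogReport class, and the sample rows are hypothetical stand-ins I made up for illustration. It sums the per-row minutes, as your SQL does, and orders by the group key only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the TimeLog entity: only the columns used in this thread.
public record TimeLog(int EmployeeId, DateTime ClockIn, DateTime ClockOut);

public record DaySummary(DateTime Date, DateTime ClockIn, DateTime ClockOut, int TotalMinutes);

public static class TimeLogReport
{
    // Groups one employee's rows by calendar day and aggregates each group.
    // The ordering touches only the group key, never a row inside the group.
    public static List<DaySummary> Summarize(IEnumerable<TimeLog> logs, int employeeId) =>
        logs.Where(l => l.EmployeeId == employeeId)
            .GroupBy(l => l.ClockIn.Date)
            .OrderByDescending(g => g.Key)
            .Select(g => new DaySummary(
                g.Key,
                g.Min(l => l.ClockIn),
                g.Max(l => l.ClockOut),
                // Truncated elapsed minutes; DATEDIFF(MINUTE, ...) counts minute
                // boundaries instead, so the two can differ for sub-minute offsets.
                g.Sum(l => (int)(l.ClockOut - l.ClockIn).TotalMinutes)))
            .ToList();

    public static void Main()
    {
        // Hypothetical sample rows on whole-minute boundaries.
        var logs = new List<TimeLog>
        {
            new TimeLog(10, new DateTime(2012, 3, 1, 9, 0, 0), new DateTime(2012, 3, 1, 12, 0, 0)),
            new TimeLog(10, new DateTime(2012, 3, 1, 13, 0, 0), new DateTime(2012, 3, 1, 17, 30, 0)),
            new TimeLog(10, new DateTime(2012, 3, 2, 8, 30, 0), new DateTime(2012, 3, 2, 16, 30, 0)),
            new TimeLog(11, new DateTime(2012, 3, 2, 9, 0, 0), new DateTime(2012, 3, 2, 10, 0, 0)),
        };

        foreach (var d in Summarize(logs, 10))
            Console.WriteLine($"{d.Date:yyyy-MM-dd} {d.ClockIn:HH:mm} {d.ClockOut:HH:mm} {d.TotalMinutes}");
    }
}
```

Against a real DataContext you would keep the same Where/GroupBy/OrderByDescending/Select shape, but compute the total with SqlMethods.DateDiffMinute, as in the question, so that it translates to DATEDIFF.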

Upvotes: 4
