A more refined version of this LINQ to SQL query

Question

My conundrum is with trying to convert the following T-SQL query into a near equivalent (performance wise) LINQ to SQL query:

SELECT 
    j1.JOB,   
    max(CASE WHEN ISNULL(logs.statcategory, ' ') = 'PREP' THEN 'X' ELSE ' ' END) AS prep,  
    max(CASE WHEN ISNULL(logs.statcategory, ' ') = 'PRINT' THEN 'X' ELSE ' ' END) AS press,  
    max(CASE WHEN ISNULL(logs.statcategory, ' ') = 'BIND' THEN 'X' ELSE ' ' END) AS bind, 
    max(CASE WHEN ISNULL(logs.statcategory, ' ') = 'SHIP' THEN 'X' ELSE ' ' END) AS ship  
from 
    job j1  
left outer join 
( 
    select 
        j.job,
        l.statcategory,
        cnt=count(*)  
    from 
        job j  
    join 
        jobloc jl 
            join location l 
            on 
                l.code = jl.location and 
                l.site = jl.site 
        on j.job = jl.job  
    WHERE
        j.stat = 'O'  
    group by 
        j.job,l.statcategory  
) logs 
on 
    j1.job = logs.job  
WHERE
    j1.stat = 'O'  
group by
    j1.job

This query currently runs just under 0.2 seconds on MS SQL Server. The following LINQ query is what I've come up with that returns the exact same records, but runs nearly 30x slower:

from a0 in Jobs
join a1 in 
(
    from a0 in Jobs
    join a1 in JobLocs on a0.Content equals a1.Job
    join a2 in Locations on new {Code = a1.Location, a1.Site} equals new {a2.Code, a2.Site}
    where a0.Stat == 'O'    
    select new {a0.Content, a2.StatCategory}
) on a0.Content equals a1.Content into a1
from a2 in a1.DefaultIfEmpty()
where a0.Stat == 'O'
group a2 by a0.Content into a0
orderby a0.Key
select new 
{
    Job = a0.Key,
    Prep = (bool?)a0.Max(a1 => a1.StatCategory == "PREP" ? true : false),
    Print = (bool?)a0.Max(a1 => a1.StatCategory == "PRINT" ? true : false),
    BIND = (bool?)a0.Max(a1 => a1.StatCategory == "BIND" ? true : false),
    SHIP = (bool?)a0.Max(a1 => a1.StatCategory == "SHIP" ? true : false),
}

Here is the generated SQL from the LINQ query (using LINQPad):

-- Region Parameters
DECLARE @p0 Int = 79
DECLARE @p1 Int = 79
DECLARE @p2 VarChar(1000) = 'PREP'
DECLARE @p3 VarChar(1000) = 'PRINT'
DECLARE @p4 VarChar(1000) = 'BIND'
DECLARE @p5 VarChar(1000) = 'SHIP'
-- EndRegion
SELECT [t4].[Job], [t4].[value] AS [Prep], [t4].[value2] AS [Print], [t4].[value3] AS [BIND], [t4].[value4] AS [SHIP]
FROM (
    SELECT MAX(
        (CASE 
            WHEN [t3].[StatCategory] = @p2 THEN 1
            WHEN NOT ([t3].[StatCategory] = @p2) THEN 0
            ELSE NULL
         END)) AS [value], MAX(
        (CASE 
            WHEN [t3].[StatCategory] = @p3 THEN 1
            WHEN NOT ([t3].[StatCategory] = @p3) THEN 0
            ELSE NULL
         END)) AS [value2], MAX(
        (CASE 
            WHEN [t3].[StatCategory] = @p4 THEN 1
            WHEN NOT ([t3].[StatCategory] = @p4) THEN 0
            ELSE NULL
         END)) AS [value3], MAX(
        (CASE 
            WHEN [t3].[StatCategory] = @p5 THEN 1
            WHEN NOT ([t3].[StatCategory] = @p5) THEN 0
            ELSE NULL
         END)) AS [value4], [t0].[Job]
    FROM [Job] AS [t0]
    LEFT OUTER JOIN ([Job] AS [t1]
        INNER JOIN [JobLoc] AS [t2] ON [t1].[Job] = [t2].[Job]
        INNER JOIN [Location] AS [t3] ON ([t2].[Location] = [t3].[Code]) AND ([t2].[Site] = [t3].[Site])) ON ([t0].[Job] = [t1].[Job]) AND (UNICODE([t1].[Stat]) = @p0)
    WHERE UNICODE([t0].[Stat]) = @p1
    GROUP BY [t0].[Job]
    ) AS [t4]
ORDER BY [t4].[Job]

One thing that stands out is that the generated SQL from the LINQ query runs the aggregate for each column returned in a subquery, whereas in the original it is part of the outer SELECT. I can imagine part of the performance decrease is there.

I'm (tentatively) willing to accept that there is no better way to write this, and just use the DataContext.ExecuteQuery() method in the LINQ API (and just run and shape the first SQL statement directly). However, I'm trying to not include embedded SQL as much as possible in a project that I'm currently working on, so if it can be made to be near the performance of the original query, that'd be ideal. I've been hacking away at this for some time (partly as an academic exercise, and also to actually use this or similar queries like it), and this is the best I've come up with (I did not write the original query BTW--it was part of an older project that is being migrated to a newer one).

Thanks for any assistance.

Ofir Winegarten · Accepted Answer

As per our discussion in the comments,

The issue is the UNICODE conversion that the linq-to-entities adds from some unknown reason. the DB cannot use the index because of the (unnecessary) conversion.

You can use .Equals instead of == and it will not use UNICODE or change the type to varchar(1) in the db.

A more refined version of this LINQ to SQL query

Answers (1)

Related Questions