Reputation: 155708
I have this schema:
Lists ( ListId, Name, DateCreated, ... )
ListItems( ListId, Text, Foo, Baz, Qux, ... )
I have an IQueryable<List>
which represents another Linq query which returns some List
entities.
I want to JOIN
it with some aggregate data of ListItems
, but this proving difficult as Linq is generating inefficient SQL - but I also want to make the query composable.
Here is something similar to the SQL I want Linq to generate:
SELECT * FROM ( -- This part represents the IQueryable: SELECT ListId, Name, ... FROM Lists ORDER BY Lists.DateCreated OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY -- Linq's .Skip(0).Take(25) ) AS ListsResult LEFT JOIN ( -- This is the aggregate data query I want Linq to generate: SELECT ListId, COUNT(1) AS [Count], COUNT( CASE WHEN Foo = 'bar' THEN 1 ELSE NULL END ) AS CountFooBar, COUNT( CASE WHEN Baz > 5 THEN 1 ELSE NULL END ) AS CountBaz5 FROM ListItems WHERE Qux IS NOT NULL GROUP BY ListId ) AS ItemsStats ON ListResults.ListId = ItemsStats.ListId
This is the Linq I have - I prefer the Extension Method syntax:
IQueryable lists = GetLists( 0, 25 ); var stats = this.dbContext.ListItems .Where( (ListItem li) => li.Qux != null ) .GroupBy( (ListItem li) => li.ListId ) .Select( grp => new { grp.Key, Count = grp.Count(), CountFooBar = grp.Count( (ListItem li) => li.Foo == "bar" ) CountBaz5 = grp.Count( (ListItem li) => li.Baz > 5 ) } ) return lists .Join( inner: stats, outerKeySelector: (List l) => l.ListId, innerKeySelector: grp => grp.Key, resultSelector: (list, itemStats) => new { list, itemsStats } )
However this generates SQL looking like this (this query shows my real table and column names, which is a bit more complicated than the schema I posted earlier:)
SELECT [Project13].[C2] AS [C1], [Project13].[ListId] AS [ListId], [Project13].[C1] AS [C2], [Project13].[C3] AS [C3], [Project13].[C4] AS [C4], [Project13].[C5] AS [C5], [Project13].[C6] AS [C6], [Project13].[C7] AS [C7] FROM ( SELECT [Project11].[C1] AS [C1], [Project11].[ListId] AS [ListId], [Project11].[C2] AS [C2], [Project11].[C3] AS [C3], [Project11].[C4] AS [C4], [Project11].[C5] AS [C5], [Project11].[C6] AS [C6], (SELECT COUNT(1) AS [A1] FROM (SELECT [Project12].[ListId] AS [ListId] FROM ( SELECT [Extent11].[ListId] AS [ListId], [Extent11].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent11] WHERE ([Extent11].[TenantId] = 8) AND ([Extent11].[BlarghId] = 8) ) AS [Project12] ORDER BY [Project12].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit6] INNER JOIN [dbo].[ListItems] AS [Extent12] ON ([Limit6].[TenantId] = [Extent12].[TenantId]) AND ([Limit6].[BlarghId] = [Extent12].[BlarghId]) AND ([Limit6].[ListId] = [Extent12].[ListId]) WHERE (([Extent12].[Baz] > 0) OR ((LEN([Extent12].[Notes])) > 0) OR ((LEN([Extent12].[Value])) > 0)) AND ([Project11].[TenantId] = [Extent12].[TenantId]) AND ([Project11].[BlarghId] = [Extent12].[BlarghId]) AND ([Project11].[ListId] = [Extent12].[ListId]) AND ([Extent12].[RecommendationRevision] IS NOT NULL)) AS [C7] FROM ( SELECT [Project9].[C1] AS [C1], [Project9].[TenantId] AS [TenantId], [Project9].[BlarghId] AS [BlarghId], [Project9].[ListId] AS [ListId], [Project9].[C2] AS [C2], [Project9].[C3] AS [C3], [Project9].[C4] AS [C4], [Project9].[C5] AS [C5], (SELECT COUNT(1) AS [A1] FROM (SELECT [Project10].[TenantId] AS [TenantId], [Project10].[BlarghId] AS [BlarghId], [Project10].[ListId] AS [ListId] FROM ( SELECT [Extent9].[TenantId] AS [TenantId], [Extent9].[BlarghId] AS [BlarghId], [Extent9].[ListId] AS [ListId], [Extent9].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent9] WHERE ([Extent9].[TenantId] = 8) AND ([Extent9].[BlarghId] = 8) ) AS [Project10] ORDER BY [Project10].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit5] INNER JOIN [dbo].[ListItems] AS [Extent10] ON ([Limit5].[TenantId] = [Extent10].[TenantId]) AND ([Limit5].[BlarghId] = [Extent10].[BlarghId]) AND ([Limit5].[ListId] = [Extent10].[ListId]) WHERE (([Extent10].[Baz] > 0) OR ((LEN([Extent10].[Notes])) > 0) OR ((LEN([Extent10].[Value])) > 0)) AND ([Project9].[TenantId] = [Extent10].[TenantId]) AND ([Project9].[BlarghId] = [Extent10].[BlarghId]) AND ([Project9].[ListId] = [Extent10].[ListId]) AND (3 = [Extent10].[Baz])) AS [C6] FROM ( SELECT [Project7].[C1] AS [C1], [Project7].[TenantId] AS [TenantId], [Project7].[BlarghId] AS [BlarghId], [Project7].[ListId] AS [ListId], [Project7].[C2] AS [C2], [Project7].[C3] AS [C3], [Project7].[C4] AS [C4], (SELECT COUNT(1) AS [A1] FROM (SELECT [Project8].[TenantId] AS [TenantId], [Project8].[BlarghId] AS [BlarghId], [Project8].[ListId] AS [ListId] FROM ( SELECT [Extent7].[TenantId] AS [TenantId], [Extent7].[BlarghId] AS [BlarghId], [Extent7].[ListId] AS [ListId], [Extent7].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent7] WHERE ([Extent7].[TenantId] = 8) AND ([Extent7].[BlarghId] = 8) ) AS [Project8] ORDER BY [Project8].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit4] INNER JOIN [dbo].[ListItems] AS [Extent8] ON ([Limit4].[TenantId] = [Extent8].[TenantId]) AND ([Limit4].[BlarghId] = [Extent8].[BlarghId]) AND ([Limit4].[ListId] = [Extent8].[ListId]) WHERE (([Extent8].[Baz] > 0) OR ((LEN([Extent8].[Notes])) > 0) OR ((LEN([Extent8].[Value])) > 0)) AND ([Project7].[TenantId] = [Extent8].[TenantId]) AND ([Project7].[BlarghId] = [Extent8].[BlarghId]) AND ([Project7].[ListId] = [Extent8].[ListId]) AND (2 = [Extent8].[Baz])) AS [C5] FROM ( SELECT [Project5].[C1] AS [C1], [Project5].[TenantId] AS [TenantId], [Project5].[BlarghId] AS [BlarghId], [Project5].[ListId] AS [ListId], [Project5].[C2] AS [C2], [Project5].[C3] AS [C3], (SELECT COUNT(1) AS [A1] FROM (SELECT [Project6].[TenantId] AS [TenantId], [Project6].[BlarghId] AS [BlarghId], [Project6].[ListId] AS [ListId] FROM ( SELECT [Extent5].[TenantId] AS [TenantId], [Extent5].[BlarghId] AS [BlarghId], [Extent5].[ListId] AS [ListId], [Extent5].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent5] WHERE ([Extent5].[TenantId] = 8) AND ([Extent5].[BlarghId] = 8) ) AS [Project6] ORDER BY [Project6].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit3] INNER JOIN [dbo].[ListItems] AS [Extent6] ON ([Limit3].[TenantId] = [Extent6].[TenantId]) AND ([Limit3].[BlarghId] = [Extent6].[BlarghId]) AND ([Limit3].[ListId] = [Extent6].[ListId]) WHERE (([Extent6].[Baz] > 0) OR ((LEN([Extent6].[Notes])) > 0) OR ((LEN([Extent6].[Value])) > 0)) AND ([Project5].[TenantId] = [Extent6].[TenantId]) AND ([Project5].[BlarghId] = [Extent6].[BlarghId]) AND ([Project5].[ListId] = [Extent6].[ListId]) AND (1 = [Extent6].[Baz])) AS [C4] FROM ( SELECT [Project3].[C1] AS [C1], [Project3].[TenantId] AS [TenantId], [Project3].[BlarghId] AS [BlarghId], [Project3].[ListId] AS [ListId], [Project3].[C2] AS [C2], (SELECT COUNT(1) AS [A1] FROM (SELECT [Project4].[TenantId] AS [TenantId], [Project4].[BlarghId] AS [BlarghId], [Project4].[ListId] AS [ListId] FROM ( SELECT [Extent3].[TenantId] AS [TenantId], [Extent3].[BlarghId] AS [BlarghId], [Extent3].[ListId] AS [ListId], [Extent3].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent3] WHERE ([Extent3].[TenantId] = 8) AND ([Extent3].[BlarghId] = 8) ) AS [Project4] ORDER BY [Project4].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit2] INNER JOIN [dbo].[ListItems] AS [Extent4] ON ([Limit2].[TenantId] = [Extent4].[TenantId]) AND ([Limit2].[BlarghId] = [Extent4].[BlarghId]) AND ([Limit2].[ListId] = [Extent4].[ListId]) WHERE (([Extent4].[Baz] > 0) OR ((LEN([Extent4].[Notes])) > 0) OR ((LEN([Extent4].[Value])) > 0)) AND ([Project3].[TenantId] = [Extent4].[TenantId]) AND ([Project3].[BlarghId] = [Extent4].[BlarghId]) AND ([Project3].[ListId] = [Extent4].[ListId]) AND ([Extent4].[Foo] = 1)) AS [C3] FROM ( SELECT [GroupBy1].[A1] AS [C1], [GroupBy1].[K1] AS [TenantId], [GroupBy1].[K2] AS [BlarghId], [GroupBy1].[K3] AS [ListId], [GroupBy1].[K4] AS [C2] FROM ( SELECT [Project2].[K1] AS [K1], [Project2].[K2] AS [K2], [Project2].[K3] AS [K3], [Project2].[K4] AS [K4], COUNT([Project2].[A1]) AS [A1] FROM ( SELECT [Project2].[TenantId] AS [K1], [Project2].[BlarghId] AS [K2], [Project2].[ListId] AS [K3], 1 AS [K4], 1 AS [A1] FROM ( SELECT [Extent2].[ListId] AS [ListId] FROM (SELECT [Project1].[ListId] AS [ListId] FROM ( SELECT [Extent1].[ListId] AS [ListId], [Extent1].[Created] AS [Created] FROM [dbo].[Lists] AS [Extent1] WHERE ([Extent1].[TenantId] = 8) AND ([Extent1].[BlarghId] = 8) ) AS [Project1] ORDER BY [Project1].[Created] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit1] INNER JOIN [dbo].[ListItems] AS [Extent2] ON (([Limit1].[ListId] = [Extent2].[ListId]) WHERE ([Extent2].[Baz] > 0) OR ((LEN([Extent2].[Notes])) > 0) OR ((LEN([Extent2].[Value])) > 0) ) AS [Project2] ) AS [Project2] GROUP BY [K1], [K2], [K3], [K4] ) AS [GroupBy1] ) AS [Project3] ) AS [Project5] ) AS [Project7] ) AS [Project9] ) AS [Project11] ) AS [Project13]
It doesn't compose the COUNT()
statements together at all, and it moves the COUNT
predicates to separate WHERE
clauses. Also note the repeated paged subqueries (where OFFSET 0 ROW FETCH NEXT 25
is used) whereas my hand-written query only executes it once.
Upvotes: 2
Views: 481
Reputation: 205899
In my experience EF generates such queries when you use aggregate functions that apply filtering like Count(predicate
) in your case. You'll get much better SQL query if you replace Count(condition)
construct with conditional sum (Sum(condition ? 1 : 0)
) like this:
var stats = db.ListItems
.Where(li => li.Qux != null)
.GroupBy(li => li.ListId)
.Select(grp => new
{
grp.Key,
Count = grp.Count(),
CountFooBar = grp.Sum(li => li.Foo == "bar" ? 1 : 0),
CountBaz5 = grp.Sum(li => li.Baz > 5 ? 1 : 0)
});
The other parts stay the same. Before the mod I was getting similar SQL to the posted, and here is what I'm getting after this little mod:
SELECT
[Limit1].[ListId] AS [ListId],
[Limit1].[Name] AS [Name],
[Limit1].[DateCreated] AS [DateCreated],
[GroupBy1].[K1] AS [ListId1],
[GroupBy1].[A1] AS [C1],
[GroupBy1].[A2] AS [C2],
[GroupBy1].[A3] AS [C3]
FROM (SELECT [Extent1].[ListId] AS [ListId], [Extent1].[Name] AS [Name], [Extent1].[DateCreated] AS [DateCreated]
FROM [dbo].[List] AS [Extent1]
ORDER BY [Extent1].[DateCreated] ASC
OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY ) AS [Limit1]
INNER JOIN (SELECT
[Filter1].[K1] AS [K1],
COUNT([Filter1].[A1]) AS [A1],
SUM([Filter1].[A2]) AS [A2],
SUM([Filter1].[A3]) AS [A3]
FROM ( SELECT
[Extent2].[ListId] AS [K1],
1 AS [A1],
CASE WHEN (N'bar' = [Extent2].[Foo]) THEN 1 ELSE 0 END AS [A2],
CASE WHEN ([Extent2].[Baz] > 5) THEN 1 ELSE 0 END AS [A3]
FROM [dbo].[ListItem] AS [Extent2]
WHERE [Extent2].[Qux] IS NOT NULL
) AS [Filter1]
GROUP BY [K1] ) AS [GroupBy1] ON [Limit1].[ListId] = [GroupBy1].[K1]
UPDATE: The above is just the LINQ equivalent of your manual SQL query (when you adjust the join code to produce LEFT OUTER JOIN
). However, taking into account the master data paging, OUTER APPLY
SQL query might perform better. You can let LINQ to Entities generate such type of query like this:
var listsWithItemsStats = lists
.SelectMany(l => db.ListItems
.Where(li => li.ListId == l.ListId && li.Qux != null)
.GroupBy(li => li.ListId)
.Select(grp => new
{
grp.Key,
Count = grp.Count(),
CountFooBar = grp.Sum(li => li.Foo == "bar" ? 1 : 0),
CountBaz5 = grp.Sum(li => li.Baz > 5 ? 1 : 0)
})
.DefaultIfEmpty(),
(list, itemsStats) => new { list, itemsStats });
Upvotes: 1
Reputation: 155708
Here's a semi-workaround I have:
I realised the best short-term solution is to have the SQL in the database (as a UDF FUNCTION
or VIEW
), this would mean that some data code would therefore have to be in the database (rather than using the DB as a "dumb store").
I first created a table-valued UDF which would accept a table-valued parameter, reasoning that would allow for composition, at the cost of needing to generate the input parameter table (an array of ListId
values from the paged query). However in doing so, I realised that Linq-to-Entities (version 6) does not support table-valued parameters in Function Imports yet.
I then reasoned a better approach would be to move the COUNT
operation to a VIEW
(which represents one half of the LEFT JOIN
I had in my hand-written query), and then I can get Linq to JOIN that against the existing IQueryable
, thus retaining composability and generating an efficient runtime query (indeed, when I run it, the query takes 34ms to execute according to SQL Server Profiler, whereas the old Linq-generated inefficient query took 830ms).
Here's what I used:
CREATE VIEW ListItemStatistics AS SELECT ListId, COUNT(*) AS [CountAll], COUNT( CASE WHEN ... ) AS Count... FROM ListItems WHERE Foo = 'bar' GROUP BY ListId
And then from within Linq:
IQueryable lists = GetListsQuery( 0, 25 ); var listsWithItemsStats = lists. .Join( inner: this.dbContext.ListItemStatistics, outerKeySelector: list => list.ListId, innerKeySelector: row => row.ListId, resultSelector: (list,row) => new { list, row } );
However because this does use database-side logic (in the VIEW
) it is not ideal.
Upvotes: 1