Reputation: 12704
Suppose you have an Author
domain class, and a Book
domain class and each Author
can have 0 or more Books
(a one-to-many relationship).
If you do the following:
var dtos = dbContext.Authors.Select(a => new
{
Name = a.Name,
BookNames = a.Books.Select(b => b.Name).ToList()
}).ToList();
You expect the following SQL to get generated since you only requested the names of the authors, and the names of the books of each author:
SELECT [a].[Name], [b].[Name]
FROM [Authors] AS [a]
LEFT JOIN [Books] AS [b] ON [a].[Id] = [b].[AuthorId]
ORDER BY [a].[Id], [b].[Id]
However, instead, Entity Framework causes the following SQL to be generated:
SELECT [a].[Name], [a].[Id], [b].[Name], [b].[Id]
FROM [Authors] AS [a]
LEFT JOIN [Books] AS [b] ON [a].[Id] = [b].[AuthorId]
ORDER BY [a].[Id], [b].[Id]
which apparently is unnecessarily retrieving the IDs of all authors and the IDs of all their books.
This is while this doesn't happen when you do the joining from the other side, that is, from books to authors. For example, when you request the names of the books, and the names of their authors, like:
var dtos = dbContext.Books.Select(b => new
{
Name = b.Name,
BookNames = b.Author == null ? null : b.Author.Name
}).ToList();
The generated SQL for the above code would be:
SELECT [b].[Name], CASE
WHEN [a].[Id] IS NULL THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END, [a].[Name]
FROM [Books] AS [b]
LEFT JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
which is not retrieving the IDs, as expected.
Why is this the case? Why does EF Core retrieve the IDs even though I didn't include them in my projection?
I have tested this with EF Core 3.x + 5.0 RC1. If that matters.
Upvotes: 4
Views: 208
Reputation: 131324
Ivan Stoev explained why the first query returns the IDs. The second query is very different. While the first query returns one root object per author containing all the author's books, the second returns a flat list of objects. There's no need to know the IDs in this case. The equivalent starting from the Authors
would be
var bookNames= from a in dbContext.Authors
from b in a.Books
select new {a.Name,BookNames=new {b.Name}};
Upvotes: 1
Reputation: 205589
The SQL query result set is flat. The one you expect would have been generated (and indeed is generated when you doing it from other side) if your query was flat as well, like this
var query =
from a in dbContext.Authors
from b in a.Books.DefaultIfEmpty()
select new { Name = a.Name, BookName = b.Name };
However your query result shape is different - name + correlated list of book names. In general SQL cannot provide such shape, so EF Core should somehow convert (group) the returned flat result set client side. The records come ordered by the two ids, but these two ids are also needed for correct grouping. Well, technically speaking, all ids except the last in order, so [b].[Id]
in this case is redundant. So they can group them sequentially (without buffering the whole result set in memory like the regular LINQ to Objects GroupBy
) with something like this (in pseudo-code):
var enumerator = query.GetEnumerator();
bool hasNext = enumerator.MoveNext();
while (hasNext)
{
var aId = enumerator.Current["AId"]; // <-- here is where it is needed
var aName = enumerator.Current["AName"];
var bNames = new List<string>();
do
{
bNames.Add(enumerator.Current["BName"]);
hasNext = enumerator.MoveNext();
}
while (hasNext && aId == enumerator.Current["AId"]); // <-- as well here
// ready to produce record for the consumer
yield return new { Name = aName, BookNames = bNames };
}
Upvotes: 7