Reputation: 2465
Consider the following method:
public IEnumerable<Owner> GetOwners(OwnerParameters ownerParameters)
{
return FindAll()
.OrderBy(on => on.Name)
.Skip((ownerParameters.PageNumber - 1) * ownerParameters.PageSize)
.Take(ownerParameters.PageSize)
.ToList();
}
Where FindAll()
is a repository pattern method that returns IQueryable<Owner>
. Does having .OrderBy()
before .Skip()
and .Take()
methods mean that all the elements from the Owner
data table will be retrieved and ordered, or, does Linq take into account that .Skip()
and .Take()
methods might narrow down the required Owner
elements and only after having retrieved those will the ordering happen?
EDIT: Profiler log:
SELECT XXX
FROM [Owners] AS [a]
ORDER BY [a].[Name]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY',N'@__p_0 int,@__p_1 int',@__p_0=0,@__p_1=10
Upvotes: 4
Views: 1128
Reputation: 582
A quick example to demonstrate that order of query matters. The explanations given above are good.
var test1 = await _dbContext.UserActivityLogs
.Where(x => x.ExternalSyncLogId == request.Id)
.Skip(0).Take(25)
.AsNoTracking().ToListAsync();
This query transpile to
exec sp_executesql N'SELECT [u].[Id], [u].[ActionType], [u].[ActivityEndTime], [u].[ActivityStartTime], [u].[ActivityType], [u].[Created], [u].[CreatedBy], [u].[EntityId], [u].[ExternalSyncLogId], [u].[LastModified], [u].[LastModifiedBy], [u].[RequestBody], [u].[ResponseBody], [u].[Source], [u].[TenantId]
FROM [UserActivityLogs] AS [u]
WHERE ([u].[ExternalSyncLogId] IS NULL
ORDER BY (SELECT 1)
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY',N'@__p_1 int,@__p_2 int',@__p_1=0,@__p_2=25
Execution time: 29
But if we just change the sequence then
var test2 = await _dbContext.UserActivityLogs
.Skip(0).Take(25)
.Where(x => x.ExternalSyncLogId == request.Id)
.AsNoTracking().ToListAsync();
Translated as
exec sp_executesql N'SELECT [t].[Id], [t].[ActionType], [t].[ActivityEndTime], [t].[ActivityStartTime], [t].[ActivityType], [t].[Created], [t].[CreatedBy], [t].[EntityId], [t].[ExternalSyncLogId], [t].[LastModified], [t].[LastModifiedBy], [t].[RequestBody], [t].[ResponseBody], [t].[Source], [t].[TenantId]
FROM (
SELECT [u].[Id], [u].[ActionType], [u].[ActivityEndTime], [u].[ActivityStartTime], [u].[ActivityType], [u].[Created], [u].[CreatedBy], [u].[EntityId], [u].[ExternalSyncLogId], [u].[LastModified], [u].[LastModifiedBy], [u].[RequestBody], [u].[ResponseBody], [u].[Source], [u].[TenantId]
FROM [UserActivityLogs] AS [u]
ORDER BY (SELECT 1)
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
WHERE [t].[ExternalSyncLogId] IS NULL',N'@__p_0 int,@__p_1 int',@__p_0=0,@__p_1=25
Execution time: 474
Upvotes: 0
Reputation: 1062885
Ultimately, this depends on what FindAll()
does and what it returns:
IEnumerable<T>
, then it is using LINQ-to-Objects, which mostly just does literally what it is told; if you sort then pages, then it sorts then pages; if you page then sort, then it pages then sortsIQueryable<T>
, then the query is being composed - and only actually executed in ToList()
, at which point the provider model gets a chance to inspect your query tree and build the most suitable implementation possible, which often means: writing a SQL query that includes an ORDER BY
and some paging hints suitable for the specific RDBMS; if your code paged then sort (which is... unusual) then I would expect most providers to either write some kind of horrible sub-query to try to describe that, or just throw an exception (perhaps NotSupportedException
) in disgustUpvotes: 6
Reputation: 23797
All records wouldn't be retrieved. Depending on your backend a query would be generated, that orders by Name, skips N rows and takes next M rows. Only the query result is retrieved, triggered by .ToList(). ie: In MS SQL server the query might look like:
Select top(M) row_number() over (order by Name) as RowNo, *
from myTable
where RowNo > N
That type of query is not very effective but that is another matter and you could create your custom workaround.
EDIT: I remembered MS SQL code generated wrong. It was:
SELECT ...
FROM ...
ORDER BY ...
OFFSET x ROWS FETCH NEXT y ROWS ONLY
That one is also slow. If you have speed consideration then write your own SQL and send with Linq. Basically what you do is to keep the lastRetrieved value and set it as a parameter:
select top(NTake) ... from ...
order by ...
where orderedByValue > @lastRetrievedValue
(You can send raw SQL queries in Linq)
Upvotes: 0