Reputation: 2465

C# Linq query execution order

Consider the following method:

public IEnumerable<Owner> GetOwners(OwnerParameters ownerParameters)
{
    return FindAll()
        .OrderBy(on => on.Name)
        .Skip((ownerParameters.PageNumber - 1) * ownerParameters.PageSize)
        .Take(ownerParameters.PageSize)
        .ToList();
}

Where FindAll() is a repository pattern method that returns IQueryable<Owner>. Does having .OrderBy() before .Skip() and .Take() methods mean that all the elements from the Owner data table will be retrieved and ordered, or, does Linq take into account that .Skip() and .Take() methods might narrow down the required Owner elements and only after having retrieved those will the ordering happen?

EDIT: Profiler log:

SELECT XXX
FROM [Owners] AS [a]
ORDER BY [a].[Name]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY',N'@__p_0 int,@__p_1 int',@__p_0=0,@__p_1=10

Upvotes: 4

Answers (3)

Hamza Iftikhar

Reputation: 582

A quick example to demonstrate that order of query matters. The explanations given above are good.

var test1 = await _dbContext.UserActivityLogs
.Where(x => x.ExternalSyncLogId == request.Id)
.Skip(0).Take(25)
.AsNoTracking().ToListAsync();

This query transpile to

exec sp_executesql N'SELECT [u].[Id], [u].[ActionType], [u].[ActivityEndTime], [u].[ActivityStartTime], [u].[ActivityType], [u].[Created], [u].[CreatedBy], [u].[EntityId], [u].[ExternalSyncLogId], [u].[LastModified], [u].[LastModifiedBy], [u].[RequestBody], [u].[ResponseBody], [u].[Source], [u].[TenantId]
FROM [UserActivityLogs] AS [u]
WHERE ([u].[ExternalSyncLogId] IS NULL
ORDER BY (SELECT 1)
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY',N'@__p_1 int,@__p_2 int',@__p_1=0,@__p_2=25

Execution time: 29

But if we just change the sequence then

var test2 = await _dbContext.UserActivityLogs
.Skip(0).Take(25)
.Where(x => x.ExternalSyncLogId == request.Id)
.AsNoTracking().ToListAsync();

Translated as

exec sp_executesql N'SELECT [t].[Id], [t].[ActionType], [t].[ActivityEndTime], [t].[ActivityStartTime], [t].[ActivityType], [t].[Created], [t].[CreatedBy], [t].[EntityId], [t].[ExternalSyncLogId], [t].[LastModified], [t].[LastModifiedBy], [t].[RequestBody], [t].[ResponseBody], [t].[Source], [t].[TenantId]
FROM (
    SELECT [u].[Id], [u].[ActionType], [u].[ActivityEndTime], [u].[ActivityStartTime], [u].[ActivityType], [u].[Created], [u].[CreatedBy], [u].[EntityId], [u].[ExternalSyncLogId], [u].[LastModified], [u].[LastModifiedBy], [u].[RequestBody], [u].[ResponseBody], [u].[Source], [u].[TenantId]
    FROM [UserActivityLogs] AS [u]
    ORDER BY (SELECT 1)
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
WHERE [t].[ExternalSyncLogId] IS NULL',N'@__p_0 int,@__p_1 int',@__p_0=0,@__p_1=25

Execution time: 474

Upvotes: 0

Marc Gravell

Reputation: 1062885

Ultimately, this depends on what FindAll() does and what it returns:

if it returns IEnumerable<T>, then it is using LINQ-to-Objects, which mostly just does literally what it is told; if you sort then pages, then it sorts then pages; if you page then sort, then it pages then sorts
however, if it returns IQueryable<T>, then the query is being composed - and only actually executed in ToList(), at which point the provider model gets a chance to inspect your query tree and build the most suitable implementation possible, which often means: writing a SQL query that includes an ORDER BY and some paging hints suitable for the specific RDBMS; if your code paged then sort (which is... unusual) then I would expect most providers to either write some kind of horrible sub-query to try to describe that, or just throw an exception (perhaps NotSupportedException) in disgust

Upvotes: 6

Cetin Basoz

Reputation: 23797

All records wouldn't be retrieved. Depending on your backend a query would be generated, that orders by Name, skips N rows and takes next M rows. Only the query result is retrieved, triggered by .ToList(). ie: In MS SQL server the query might look like:

Select top(M) row_number() over (order by Name) as RowNo, * 
from myTable
where RowNo > N

That type of query is not very effective but that is another matter and you could create your custom workaround.

EDIT: I remembered MS SQL code generated wrong. It was:

SELECT ...
FROM ...
ORDER BY ...
OFFSET x ROWS FETCH NEXT y ROWS ONLY

That one is also slow. If you have speed consideration then write your own SQL and send with Linq. Basically what you do is to keep the lastRetrieved value and set it as a parameter:

select top(NTake) ... from ... 
order by ... 
where orderedByValue > @lastRetrievedValue

(You can send raw SQL queries in Linq)

Upvotes: 0

C# Linq query execution order

Answers (3)

Related Questions