Mark Miller

Reputation: 7447

Efficiently paging large data sets with LINQ

When looking into the best ways to implement paging in C# (using LINQ), most suggestions are something along these lines:

// Execute the query
var query = db.Entity.Where(e => e.Something == something);

// Get the total num records
var total = query.Count();

// Page the results
var paged = query.Skip((pageNum - 1) * pageSize).Take(pageSize);

This seems to be the commonly suggested strategy (simplified).

My main reason for paging is efficiency. If my table contains 1.2 million records where Something == something, I don't want to retrieve all of them at the same time. Instead, I want to page the data, grabbing as few records as possible. But with this method, paging seems to be a moot point.

If I understand it correctly, the first statement is still retrieving the 1.2 million records, then it is being paged as necessary.

Does paging in this way actually improve performance? If the 1.2 million records are going to be retrieved every time, what's the point (besides the obvious UI benefits)?

Am I misunderstanding this? Any .NET gurus out there that can give me a lesson on LINQ, paging, and performance (when dealing with large data sets)?

Upvotes: 3

Views: 4546

Answers (3)

Diego

Reputation: 20194

The first statement does not execute the actual SQL query; it only builds part of the query you intend to run.

The first SQL query is executed when you call query.Count():

SELECT COUNT(*) FROM Table WHERE Something = something

Calling query.Skip().Take() won't execute a query either. It is only when you enumerate the results (with a foreach over paged, or by calling .ToList() on it) that the appropriate SQL statement is executed, retrieving only the rows for the page (using ROW_NUMBER).

If you watch this in SQL Profiler, you will see that exactly two queries are executed, and at no point does it try to retrieve the full table.
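To make the deferred execution described above concrete, here is a minimal sketch that uses plain LINQ to Objects rather than a database. The `Source` method and the `RowsRead` counter are hypothetical stand-ins for a table and for the work the data source does; they are not part of any real provider. It only illustrates that building the query reads nothing until it is enumerated. It does not show SQL translation, which is the provider's job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical counter: how many rows the fake "table" has handed out.
    public static int RowsRead;

    // Hypothetical stand-in for a table: yields rows lazily, counting each one.
    private static IEnumerable<int> Source(int rowCount)
    {
        for (var i = 0; i < rowCount; i++)
        {
            RowsRead++;
            yield return i;
        }
    }

    // Builds the filtered, paged query. Nothing is enumerated here.
    public static IEnumerable<int> BuildPage()
    {
        RowsRead = 0;
        var query = Source(1000000).Where(n => n % 2 == 0);
        return query.Skip(20).Take(10);
    }

    public static void Main()
    {
        var paged = BuildPage();
        Console.WriteLine($"Rows read before enumerating: {RowsRead}"); // 0

        var page = paged.ToList(); // enumeration happens here
        Console.WriteLine($"Rows in page: {page.Count}");               // 10
        Console.WriteLine($"Rows read after enumerating: {RowsRead}");
    }
}
```

In this in-memory version the skipped rows are still read by the process, just not the whole source. With a database provider, the skipping is expected to happen on the server instead, which is the point made in the answer above.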


Be careful when you are using the debugger: if you step past the first statement and inspect the contents of query, that will execute the SQL query. Maybe that is the source of your misunderstanding.

Upvotes: 8

Huy Hoang Pham

Reputation: 4147

// Execute the query
var query = db.Entity.Where(e => e.Something == something);

For your information, nothing is sent to the database by the first statement; it only defines the query.

// Get the total num records
var total = query.Count();

This count query will be translated to SQL, and it will make a call to the database. This call will not fetch all the records, because the generated SQL is something like this:

SELECT COUNT(*) FROM Entity WHERE Something = 'something'

The last query doesn't fetch all the records either. It will be translated into SQL, and the paging runs in the database.
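The two-query pattern described above (one count, one page) can be wrapped in a small helper. This is only a sketch: `PagedResult<T>` and `ToPagedResult` are hypothetical names introduced here, not part of LINQ or Entity Framework. The helper works on any `IQueryable<T>`, so with a database provider both calls should be translated to SQL, while the usage below runs it against an in-memory sequence to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container for one page of results plus the total row count.
public sealed class PagedResult<T>
{
    public PagedResult(IReadOnlyList<T> items, int total)
    {
        Items = items;
        Total = total;
    }

    public IReadOnlyList<T> Items { get; }
    public int Total { get; }
}

public static class PagingExtensions
{
    // Hypothetical helper. The caller should apply an OrderBy first:
    // paging over unordered results is not deterministic, and some
    // providers reject Skip on unsorted input.
    public static PagedResult<T> ToPagedResult<T>(
        this IQueryable<T> query, int pageNum, int pageSize)
    {
        if (pageNum < 1) throw new ArgumentOutOfRangeException(nameof(pageNum));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        var total = query.Count();                       // first round trip
        var items = query.Skip((pageNum - 1) * pageSize) // second round trip,
                         .Take(pageSize)                 // triggered by ToList()
                         .ToList();
        return new PagedResult<T>(items, total);
    }
}

public static class PagingDemo
{
    public static void Main()
    {
        // In-memory stand-in for db.Entity.Where(...).
        var query = Enumerable.Range(1, 95).AsQueryable().OrderBy(n => n);

        var page = query.ToPagedResult(pageNum: 2, pageSize: 10);
        Console.WriteLine(page.Total);       // 95
        Console.WriteLine(page.Items.Count); // 10
        Console.WriteLine(page.Items[0]);    // 11
    }
}
```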

Maybe you'll find this question useful: efficient way to implement paging

Upvotes: 3

dshockey

Reputation: 185

I believe Entity Framework might structure the SQL query with the appropriate conditions based on the LINQ statements (e.g. using ROW_NUMBER() OVER ...).

I could be wrong on that, however. I'd run SQL Profiler and see what the generated query looks like.
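Besides SQL Profiler, it can help to look at the query object itself. The sketch below uses an in-memory `IQueryable` (an assumption made so the example runs without a database) and prints its expression tree, which is what a provider would translate to SQL. `Describe` is a hypothetical helper. To see the actual SQL you would use a provider-specific hook instead, for example `Database.Log` in Entity Framework 6, `ToQueryString()` in EF Core 5 or later, or `DataContext.Log` in LINQ to SQL.

```csharp
using System;
using System.Linq;

public static class QueryInspectionDemo
{
    // Hypothetical helper: returns the expression tree of a query as text.
    // Building and printing it does not enumerate the query.
    public static string Describe<T>(IQueryable<T> query)
    {
        return query.Expression.ToString();
    }

    public static void Main()
    {
        var something = 3;

        // In-memory stand-in for db.Entity; a real provider would receive
        // the same kind of expression tree and turn it into SQL.
        var query = Enumerable.Range(1, 100).AsQueryable()
                              .Where(n => n % 10 == something);

        var paged = query.OrderBy(n => n).Skip(20).Take(10);

        // The printed text includes the Where, OrderBy, Skip and Take calls,
        // showing that the paging is part of the query definition itself.
        Console.WriteLine(Describe(paged));
    }
}
```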

Upvotes: 2
