Darkwing
Darkwing

Reputation: 7585

Server side paging in SQL Server

I am a little confused about how SQL server achieves less reads and performance improvements using common table expressions and ROW_NUMBER. Why doesn't the table actualized in the expression have to perform all the reads which a normal query would have to perform to allow the query to order using ROW_NUMBER?

Upvotes: 2

Views: 4124

Answers (1)

Quassnoi
Quassnoi

Reputation: 425471

The CTE is not (necessarily) "actualized". It's not that it will inevitably copy all rows elsewhere and will perform other operations over the copy (though it may behave so it the optimizer decides it's better).

If we take this simple query:

SELECT  *
FROM    (
        SELECT  *,
                ROW_NUMBER() OVER (ORDER BY id) rn
        FROM    mytable
        ) q
WHERE   rn BETWEEN 101 AND 110

and look at its plan we'll see something like this:

  |--Filter(WHERE:([Expr1003]>=(101) AND [Expr1003]<=(110)))
       |--Top(TOP EXPRESSION:(CASE WHEN (110) IS NULL OR (110)<(0) THEN (0) ELSE (110) END))
            |--Sequence Project(DEFINE:([Expr1003]=row_number))
                 |--Segment
                      |--Clustered Index Scan(OBJECT:([ee].[dbo].[mytable].[PK__mytable__3213E83F29C2D227]), ORDERED FORWARD)

Here, the records are scanned (in id order as the table is clustered on id), assigned the ROW_NUMBER (this is what Sequence Project does) and passed on to TOP which just halts execution when a certain threshold is reached (110 records in our case).

Those 110 records are passed to Filter which only passes the records with rn greater than 100.

The query itself only scans 110 records:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 1 ms.

(строк обработано: 10)
Table 'mytable'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

in 3 pages.

Now let's see the unpaginated query:

SELECT  *
FROM    mytable
ORDER BY
        id

This one is pretty simple: read everything from the table and spit it out.

  |--Clustered Index Scan(OBJECT:([ee].[dbo].[mytable].[PK__mytable__3213E83F29C2D227]), ORDERED FORWARD)

However, looking easily does not mean done easily. The table is quite large and we need to do many a read to return all records:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

(строк обработано: 1310720)
Table 'mytable'. Scan count 1, logical reads 2765, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 266 ms,  elapsed time = 11690 ms.

So, in a nutshell, the pagination query just knows when to stop.

Upvotes: 4

Related Questions