Tom smith

Reputation: 680

Optimize SQL query with pagination

I have a query running against a SQL Server database that is taking over 10 seconds to execute. The table being queried has over 14 million rows.

I want to display the Text column from a Notes table by a given ServiceUserId in date order. There could be thousands of entries so I want to limit the returned values to a manageable level.

SELECT Text
FROM   
    (SELECT    
         ROW_NUMBER() OVER (ORDER BY [DateDone]) AS RowNum, Text
     FROM      
         Notes
     WHERE     
         ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2') AS RowConstrainedResult
WHERE   
    RowNum >= 40 AND RowNum < 60
ORDER BY 
    RowNum

Below is the execution plan for the above query.

[Execution plan image]

If I run the same query a second time but with different row numbers, I get a response in milliseconds, presumably from a cached execution plan. The same query run for a different ServiceUserId still takes ~10 seconds, though.

Any suggestions for how to speed up this query?

Upvotes: 0

Views: 6647

Answers (2)

Charlieface

Reputation: 71144

You should look into Keyset Pagination.

It is far more performant than Rowset Pagination.

It differs fundamentally from rowset pagination: instead of referencing a particular block of row numbers, you reference a starting point from which to seek the index key.

The reason it is much faster is that you don't care about how many rows are before a particular key, you just seek a key and move forward (or backward).

Say you are filtering by a single ServiceUserId, ordering by DateDone. You need an index as follows (you could leave out the INCLUDE if it's too big, it doesn't change the maths very much):

CREATE INDEX IX_DateDone ON Notes (ServiceUserId, DateDone) INCLUDE (Text);

Now, when you select some rows, instead of giving the start and end row numbers, give the starting key:

SELECT TOP (20)
    Text,
    DateDone
FROM
    Notes
WHERE     
    ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
    AND DateDone > @startingDate
ORDER BY 
    DateDone;

On the next run, you pass the last DateDone value you received. This gets you the next batch.
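To make the handoff between pages concrete, here is a minimal sketch in Python using SQLite (the table, columns, and GUID follow the question; the sample rows and the `next_page` helper are invented for illustration, and `LIMIT` stands in for SQL Server's `TOP`):

```python
import sqlite3

# In-memory demo table mirroring the Notes schema from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Notes (NoteId INTEGER PRIMARY KEY, "
             "ServiceUserId TEXT, DateDone TEXT, Text TEXT)")
conn.execute("CREATE INDEX IX_DateDone ON Notes (ServiceUserId, DateDone)")

user = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
conn.executemany(
    "INSERT INTO Notes (ServiceUserId, DateDone, Text) VALUES (?, ?, ?)",
    [(user, f"2023-01-{d:02d}", f"note {d}") for d in range(1, 31)],
)

def next_page(starting_date, page_size=20):
    # Seek past the last key already seen, then read forward
    return conn.execute(
        "SELECT Text, DateDone FROM Notes "
        "WHERE ServiceUserId = ? AND DateDone > ? "
        "ORDER BY DateDone LIMIT ?",
        (user, starting_date, page_size),
    ).fetchall()

page1 = next_page("")            # empty string sorts before every date
page2 = next_page(page1[-1][1])  # pass the last DateDone from page 1
```

Each request seeks directly to the key and scans forward, so the cost is the same no matter how deep into the result set you are.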

The one small downside is that you cannot jump to an arbitrary page. However, it is much rarer than you might think (from a UI perspective) for a user to want to jump straight to page 327, so in practice this rarely matters.


The key must be unique: if it is not, you cannot seek to exactly the next row. If you need an extra column to guarantee uniqueness, it gets a little more complicated:

WITH NotesFiltered AS
(
    SELECT * FROM Notes
    WHERE     
        ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
)
SELECT TOP (20)
    Text,
    DateDone
FROM (
    SELECT
        Text,
        DateDone,
        NoteId,
        0 AS ordering
    FROM NotesFiltered
    WHERE
        DateDone = @startingDate AND NoteId > @startingNoteId
    UNION ALL
    SELECT
        Text,
        DateDone,
        NoteId,
        1 AS ordering
    FROM NotesFiltered
    WHERE
        DateDone > @startingDate
) n
ORDER BY 
    ordering, DateDone, NoteId;
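As a quick check of the tie-breaker logic, here is a sketch in Python using SQLite, which happens to accept the same UNION ALL pattern (`LIMIT` stands in for `TOP`; the sample rows are invented, and the GUID is the question's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Notes (NoteId INTEGER PRIMARY KEY, "
             "ServiceUserId TEXT, DateDone TEXT, Text TEXT)")

user = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
# Two notes share a DateDone, so the date alone cannot identify the next row
conn.executemany("INSERT INTO Notes VALUES (?, ?, ?, ?)", [
    (1, user, "2023-01-01", "first"),
    (2, user, "2023-01-01", "second"),   # ties with NoteId 1 on DateDone
    (3, user, "2023-01-02", "third"),
])

KEYSET_SQL = """
WITH NotesFiltered AS (
    SELECT * FROM Notes WHERE ServiceUserId = :user
)
SELECT Text, DateDone FROM (
    SELECT Text, DateDone, NoteId, 0 AS ordering
    FROM NotesFiltered
    WHERE DateDone = :startingDate AND NoteId > :startingNoteId
    UNION ALL
    SELECT Text, DateDone, NoteId, 1 AS ordering
    FROM NotesFiltered
    WHERE DateDone > :startingDate
) n
ORDER BY ordering, DateDone, NoteId
LIMIT 20
"""

# Resume after NoteId 1: the tied row (NoteId 2) comes back first
page = conn.execute(KEYSET_SQL, {"user": user,
                                 "startingDate": "2023-01-01",
                                 "startingNoteId": 1}).fetchall()
```

The `ordering` column guarantees the rows tied on the starting date are exhausted before any later dates are returned.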

Side Note

In RDBMSs that support row-value comparisons, the multi-column example can be simplified to a single predicate:

WHERE (DateDone, NoteId) > (@startingDate, @startingNoteId)

Unfortunately, SQL Server does not currently support this.
Please vote for the Azure Feedback request for it.

Upvotes: 1

eshirvana

Reputation: 24568

I would suggest using ORDER BY with OFFSET ... FETCH: it skips the first x rows and fetches the next z rows, and both values can be parameterized:

SELECT    
   Text
FROM      
   Notes
WHERE     
   ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
ORDER BY DateDone
OFFSET 40 ROWS FETCH NEXT 20 ROWS ONLY;
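As a hedged sketch of the parameterization, here is the same idea in Python using SQLite, whose `LIMIT ? OFFSET ?` is the equivalent of SQL Server's `OFFSET ? ROWS FETCH NEXT ? ROWS ONLY` (the sample rows and the `get_page` helper are invented; the DateDone values are synthetic sortable strings):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Notes (NoteId INTEGER PRIMARY KEY, "
             "ServiceUserId TEXT, DateDone TEXT, Text TEXT)")

user = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
conn.executemany(
    "INSERT INTO Notes (ServiceUserId, DateDone, Text) VALUES (?, ?, ?)",
    # synthetic sortable strings standing in for dates
    [(user, f"2023-01-{d:02d}", f"note {d}") for d in range(1, 61)],
)

def get_page(offset, page_size):
    # LIMIT ? OFFSET ? is SQLite's spelling of OFFSET/FETCH
    return conn.execute(
        "SELECT Text FROM Notes WHERE ServiceUserId = ? "
        "ORDER BY DateDone LIMIT ? OFFSET ?",
        (user, page_size, offset),
    ).fetchall()

page = get_page(40, 20)  # rows 41-60 in DateDone order
```

Note that the database still has to count and discard the skipped rows, which is why offset pagination gets slower as the offset grows.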

Also make sure you have a proper index on "DateDone"; you may be able to add it to an index you already have on "Notes".

You may also need to include the Text column in your index:

CREATE INDEX IX_DateDone ON Notes (DateDone) INCLUDE (Text, ServiceUserId)

However, be aware that adding such a huge column to the index will affect your insert/update efficiency, and of course it will need extra disk space.

Upvotes: 0
