ikabott
ikabott

Reputation: 39

Poor performance on pagination using SQL Server ROW_NUMBER()

I am trying to build a pagination mechanism. I am using a ORM that creates SQL looking like this:

SELECT * FROM 
    (SELECT t1.colX, t2.colY
         ROW_NUMBER() OVER (ORDER BY t1.col3) AS row 
     FROM Table1 t1
     INNER JOIN Table2 t2
     ON t1.col1=t2.col2
    )a 
WHERE row >= n AND row <= m

Table1 has >500k rows and Table2 has >10k records

I execute the queries directly in the SQL Server 2008 R2 Management Studio. The subquery takes 2-3sec to execute but the whole query takes > 2 min.

I know SQL Server 2012 accepts the OFFSET .. LIMIT .. option but I cannot upgrade the software.

Can anyone help me in improving the performance of the query or suggest other pagination mechanism that can be imposed through the ORM software.

Update:

Testing Roman Pekar's solution (see comments on the solution) proved that ROW_NUMBER() might not be the cause of the performance problems. Unfortunately the problems persist.

Thanks

Upvotes: 2

Views: 4612

Answers (3)

Mikael Eriksson
Mikael Eriksson

Reputation: 139010

As I understand your table structure from comments.

create table Table2
(
  col2 int identity primary key,
  colY int
)

create table Table1
(
  col3 int identity primary key,
  col1 int not null references Table2(col2),
  colX int
)

That means that the rows returned from Table1 can never be filtered by the join to Table2 because Table1.col1 is not null. Neither can the join to Table2 add rows to the result since Table2.Col2 is the primary key.

You can then rewrite your query to generate row numbers on Table1 before the join to Table2. And the where clause is also applied before the join to Table2 meaning that you will only locate the rows in Table2 that is actually part of the result set.

select T1.colX,
       T2.colY,
       T1.row
from
    (
    select col1,
           colX,
           row_number() over(order by col3) as row
    from Table1
    ) as T1
  inner join Table2 as T2
    on T1.col1 = T2.col2
where row >= @n and row <= @m

SQL Fiddle

I have no idea if you can make your ORM (Lightspeed by Mindscape) to generated the paging query like this instead of what you have now.

The query plan from this answer: enter image description here

The query plan using the query in the question: enter image description here

There is a huge difference in reads between the two.

enter image description here

Upvotes: 3

roman
roman

Reputation: 117636

I suggest you to check indexes on your tables. I think it'll help your query if you at least have index on col2 on table2. You could also try to rewrite your query like

;with cte1 as (
    select top (@m) t1.colX, t2.colY, t1.col3
    from Table1 as t1
        inner join Table2 as t2 on t1.col1=t2.col2
    order by t1.col3 asc
),
cte2 as (
    select top (@m - @n + 1) *
    from cte1
    order by col3 desc
)
select *
from cte2 as t1

but it could still be slow if you don't have indexes

Upvotes: 0

ErikE
ErikE

Reputation: 50282

Insert just the primary key column(s) of the paginated table into a temp table with an identity column, ordering by the ordered-by columns. (You may have to include the ordered-by columns to ensure the ordering comes out right.) Then, join back to the main table using the temp table as a key for the rows you want. If the data is fairly static, you could save the ordering data to a session-keyed permanent table instead of a temp table, and reuse it for a short period of time (so subsequent page requests within a few minutes are nearly instant).

Row_Number() tends to perform well with small sets of data, but it can hit serious performance snags once you get some serious rows, as you have with 500k.

Upvotes: 1

Related Questions