curious1
curious1

Reputation: 14717

SQL Server: pagination of "distinct" records

I have three tables and columns as below

resource: id, location.
subject: id, name
resource_subject: resource_id, subject_id

The relationship between "resource" and "subject" is many-to-many. Here is the query I have:

select r.* from resource r 
inner join resource_subject subject on r.id = subject.resource_id 
where (subject.subject_id= 2 or subject.subject_id= 4) 

Queries as the above may produce duplicate records because one resource may belong to more than one subject. So, to remove the duplicates, I use distinct as the following:

select distinct r.* from resource r 
inner join resource_subject subject on r.id = subject.resource_id 
where (subject.subject_id= 2 or subject.subject_id= 4) 

Now I want to do pagination of unique records. I read this SO post,

What is the best way to paginate results in SQL Server

I am interested in using ROW_NUMBER(). However, using ROW_NUMBER() would make duplicate records unique:

select distinct ROW_NUMBER() over( order by r.id asc) AS rownum, r.* from resource r 
inner join resource_subject subject on r.id = subject.resource_id 
where (subject.subject_id= 2 or subject.subject_id= 4) 

How can I do ROW_NUMBER() over records after "distinct"?

Upvotes: 2

Views: 3075

Answers (2)

Lukas Eder
Lukas Eder

Reputation: 220932

Use DENSE_RANK() instead

ROW_NUMBER() is to SELECT what DENSE_RANK() is to SELECT DISTINCT. The below application of DENSE_RANK() will produce row numbers as if they were produced after applying DISTINCT.

select distinct DENSE_RANK() over( order by r.id asc) AS rownum, r.* from resource r 
inner join resource_subject subject on r.id = subject.resource_id 
where (subject.subject_id= 2 or subject.subject_id= 4) 

How does it work?

This is best explained by example, also from the linked article

SELECT
  v, 
  ROW_NUMBER() OVER(ORDER BY v),
  RANK()       OVER(ORDER BY v),
  DENSE_RANK() OVER(ORDER BY v)
FROM (
  VALUES('a'),('a'),('a'),('b'),
        ('c'),('c'),('d'),('e')
) t(v);

The above yields:

| V | ROW_NUMBER | RANK | DENSE_RANK |
|---|------------|------|------------|
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |

Upvotes: 2

Pரதீப்
Pரதீப்

Reputation: 93724

Generate the row_number in Outer Query

;WITH cte 
     AS (SELECT Row_number()OVER( ORDER BY id ASC) AS rownum, 
                * 
         FROM   (SELECT DISTINCT r.* 
                 FROM   resource r 
                        INNER JOIN resource_subject subject 
                                ON r.id = subject.resource_id 
                 WHERE  subject.subject_id IN ( 2, 4 )) A) 
SELECT * 
FROM   cte 
WHERE  rownum >= 1 
       AND rownum <= 10 

But if you are using Sql Server 2012+ then use OFFSET for pagination which will be faster than Row_number

SELECT DISTINCT r.* 
FROM   resource r 
       INNER JOIN resource_subject subject 
               ON r.id = subject.resource_id 
WHERE  subject.subject_id IN ( 2, 4 ) 
ORDER  BY r.id 
offset 1 rows FETCH next 10 rows only;

Upvotes: 1

Related Questions