Alexandre
Alexandre

Reputation: 13308

SQL query - strange behaviour

DECLARE @OrdersTemp TABLE
(
   OrderId UNIQUEIDENTIFIER
)

INSERT INTO @OrdersTemp
  SELECT ord.Id
  FROM Orders 

--all rows count
SELECT  
    @RowsCount = COUNT(DISTINCT ord.Id)
FROM   Orders 


--@RowsCount = 5. It's right!

--second table with paging
DECLARE @OrdersTempWithPaging TABLE
(
    OrderId UNIQUEIDENTIFIER
)

INSERT INTO @OrdersTempWithPaging
   SELECT OrderId
   FROM  (SELECT DISTINCT OrderId,
                 ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum
          FROM @OrdersTemp) AS alias
          WHERE 
              RowNum BETWEEN (@PageIndex - 1) * @PageSize + 1
                     AND @PageIndex * @PageSize

SELECT * FROM @OrdersTempWithPaging
--10 or more rows. It's wrong.

Why does @OrdersTempWithPaging return wrong amount of rows? How do I avoid it?

UPDATE: The statement below returns 25 = 5*5 rows (instead of 5)

   INSERT  INTO @OrdersTempWithPaging
                SELECT    OrderId
                FROM    (
                          SELECT    OrderId,
                                    ROW_NUMBER() OVER (ORDER BY OrderId ) AS RowNum
                          FROM      @OrdersTemp ) AS alias
                --WHERE   RowNum BETWEEN ( @PageIndex - 1 ) * @PageSize + 1
                             --  AND     @PageIndex * @PageSize
  SELECT * FROM @OrdersTempWithPaging

Upvotes: 1

Views: 408

Answers (3)

Andriy M
Andriy M

Reputation: 77657

Instead of

SELECT DISTINCT OrderId,
       ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum
FROM @OrdersTemp

use

SELECT OrderId,
       ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum
FROM @OrdersTemp
GROUP BY OrderId

This is an interesting case of another difference between SELECT DISTINCT and SELECT GROUP BY, which manifests itself when the select list includes a ranking function.

In the first query the output includes duplicate OrderId values from @OrdersTemp because the ranking function is evaluated before DISTINCT is applied . In contrast, the second query first groups the rows by OrderId (i.e. effectively selects distinct OrderId values first) and then applies ranking.

Upvotes: 0

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115520

Try this (reversing the DISTINCT use):

INSERT INTO @OrdersTempWithPaging
   SELECT DISTINCT OrderId
   FROM  (SELECT OrderId,
                 ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum
          FROM @OrdersTemp) AS alias
   WHERE 
       RowNum BETWEEN (@PageIndex - 1) * @PageSize + 1
              AND @PageIndex * @PageSize

If you need only distinct order-ids, you could have:

INSERT INTO @OrdersTemp
  SELECT DISTINCT ord.Id
  FROM Orders 

and then:

INSERT INTO @OrdersTempWithPaging
   SELECT OrderId
   FROM  (SELECT OrderId,
                 ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum
          FROM @OrdersTemp) AS alias
   WHERE 
       RowNum BETWEEN (@PageIndex - 1) * @PageSize + 1
              AND @PageIndex * @PageSize

Upvotes: 2

BitKFu
BitKFu

Reputation: 3697

It's because your ordering inside the select,

 SELECT    DISTINCT OrderId,
           ROW_NUMBER() OVER (ORDER BY OrderId ) AS RowNumber

You have to coose an ordering over a column where you don't have to use DISTINCT in the selection.

 SELECT    OrderId,
           ROW_NUMBER() OVER (ORDER BY OrderId ) AS RowNumber

Try it, without DISTINCT

Upvotes: 3

Related Questions