Flying Turtle
Flying Turtle

Reputation: 364

Rank by groups sql server

The problem seems simple but I can't get my head around it, this is for sql server

 what I have in a table :            What I need as a output .
 cksum          id                      cksum       id
-2162514679     204                    -2162514679  204    1
-2162514679     207                    -2162514679  207    1
-2162514679     215                    -2162514679  215    1
-2162514679     218                    -2162514679  218    1
-2162514679     221                    -2162514679  221    1
-2160286363     257                    -2160286363  257    2
-2160286363     260                    -2160286363  260    2
-2160286363     332                    -2160286363  332    2
-2162514679     335                    -2162514679  335    3
-2162514679     338                    -2162514679  338    3
-2126731931     348                    -2126731931  348    4
-2126731931     387                    -2126731931  387    4

The table is ordered by id, I need a rank that follows the id column but that groups the cksum, notice that the cksum can return to a previous value yet still have its rank because of the ID (this is the case for the value 2162514679, it apears 5 times at the begining and a second time more bellow and they constitude two different ranks). I've been at it for a couple of hours and it seems really stupid like using a row_number with a partition or using a CTE but nope.... can't find the peice of logic to do this... anyone has the answer?

Upvotes: 4

Views: 490

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269793

This is a bit tricky. You can get the grouping for the ids with a trick -- a difference of row numbers. Then you need to get the minimum id for each group, to ensure that the final rank is in the right order. Then you can use then you can use dense rank:

select cksum, id, dense_rank() over (order by minid)
from (select t.*, min(id) over (partition by cksum, grp) as minid
      from (select t.*,
                   (row_number() over (order by id) -
                    row_number() over (partition by cksum order by id)
                   ) as grp
            from table t
           ) t
     ) t;

Upvotes: 3

Giorgos Betsos
Giorgos Betsos

Reputation: 72175

This is a different approach, which involves simulating LAG window function not available in SQL Server 2008 R2:

;WITH CTE_RN AS (
   SELECT cksum, id, ROW_NUMBER() OVER(ORDER BY id) AS rn
   FROM Checksums
), CTE_LAG AS (
   SELECT c1.cksum, c1.id, c1.rn,
          (CASE WHEN c2.cksum IS NULL OR c1.cksum = c2.cksum THEN 0
                ELSE 1
           END) AS flag 
   FROM CTE_RN AS c1
   LEFT JOIN CTE_RN AS c2 ON c1.rn = c2.rn+1
)
SELECT cksum, id,  (SELECT SUM(flag)
                    FROM CTE_LAG AS t2        
                    WHERE t2.rn <= t1.rn) + 1 AS [rank]     
FROM CTE_LAG AS t1

CTE_LAG returns the following result set (based on the sample data of the OP):

cksum       id  rn  flag
-------------------------
-2162514679 204 1   0
-2162514679 207 2   0
-2162514679 215 3   0
-2162514679 218 4   0
-2162514679 221 5   0
-2160286363 257 6   1
-2160286363 260 7   0
-2160286363 332 8   0
-2162514679 335 9   1
-2162514679 338 10  0
-2126731931 348 11  1
-2126731931 387 12  0

Field flag is equal to 1 if current cksum is not equal to the previous cksum, otherwise flag is equal to 0.

Field rank is simply the running total of flag.

Upvotes: 1

Related Questions