Reputation: 53
I'm working on a problem in SQL where I'm trying to rank a column based on when it changes.
Basically the current table looks like this:
DATE STATUS
10/18/10 A
10/16/10 A
10/14/10 B
10/12/10 A
I want to appropriately rank the status column, based on the date column, but only when the status column changes. For example:
DATE STATUS RANK
10/18/10 A 1
10/16/10 A 1
10/14/10 B 2
10/12/10 A 3
10/10/10 A 3
Any ideas on how to go about this? I've played around with both RANK() and DENSE_RANK() and am having trouble getting the output I want. Thanks.
Upvotes: 1
Views: 2159
Reputation: 5332
DB2 is one of the SQL DBMSes that support the LAG
and LEAD
OLAP windowing functions, which make it possible to compare an expression in the current row against other rows in the same partitioned window. Neither RANK
nor DENSE_RANK
are an exact fit for the calculation you want (a running count of status changes), but that can be accomplished with SUM
and a binary expression:
WITH StatusHistory( histDate, statusCode ) AS (
VALUES
( DATE( '2010-10-10' ), 'A' ),
( DATE( '2010-10-12' ), 'A' ),
( DATE( '2010-10-14' ), 'B' ),
( DATE( '2010-10-16' ), 'A' ),
( DATE( '2010-10-18' ), 'A' )
)
SELECT histDate,
statusCode,
SMALLINT(
SUM(
CASE LAG( statusCode, 1 )
OVER ( ORDER BY histDate DESC)
WHEN statusCode
THEN 0
ELSE 1
END
) OVER ( ORDER BY histDate DESC )
) AS changeSeq
FROM StatusHistory
ORDER BY histDate DESC
;
HISTDATE STATUSCODE CHANGESEQ
---------- ---------- ---------
10/18/2010 A 1
10/16/2010 A 1
10/14/2010 B 2
10/12/2010 A 3
10/10/2010 A 3
5 record(s) selected.
Upvotes: 1
Reputation: 1269753
There are a couple of ways to do this. A simple way is simply to count the number of values different from the current value, using a subquery. Another way is to use the difference of row numbers. Both provide a group identifier, which you then need to work a bit harder to get the ranking you want:
select t.date, t.status, dense_rank() over (order by mindate) as ranking
from (select t.*, min(date) over (partition by grp, status) as mindate
from (select t.*,
(row_number() over (order by date) -
row_number() over (partition by status order by date)
) as grp
from table t
) t
) t;
Depending on the ordering you might want an order by mindate desc
in the outer query.
Upvotes: 1