Jeff
Jeff

Reputation: 719

Oracle SQL ROW_NUMBER() window with conditions based on lag

Using Oracle SQL with select only privileges, I need to provide ROW_NUMBER output based on conditions. using a cursor or a loop this would be easy rather simple, but presently I must perform this task using only SQL.

I've been tinkering with a row_number() over clause which I believe is the right way to go, but I'm stuck now.

My current code - or at least a proxy of it:

    WITH MYTABLE (FK_ID,FK_NAME,PK_ID,BIN_FLAG,MONTH,YEAR)AS (
      SELECT 10000,'VARCHAR DESCRIPTION',75057,1,1,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,2,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,3,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,4,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,5,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,6,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,7,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,8,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,9,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,10,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,11,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,12,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,1,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,2,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,3,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,4,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,5,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,6,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,7,2017 FROM DUAL

)

SELECT
   FK_ID
,  FK_NAME
,  PK_ID
,  BIN_FLAG
,  ROW_NUMBER() OVER (PARTITION BY FK_ID,PK_ID,BIN_FLAG ORDER BY YEAR,MONTH,FK_ID,PK_ID ASC) NEEDED_CALC
,  MONTH
,  YEAR
FROM MYTABLE
ORDER BY FK_ID,PK_ID,YEAR,MONTH

This will return a dataset that looks something like this:

FK_ID   FK_NAME             PK_ID       BIN_FLAG    NEEDED_CALC MONTH   YEAR
10000   VARCHAR DESCRIPTION 75057       1           1           1       2016
10000   VARCHAR DESCRIPTION 75057       1           2           2       2016
10000   VARCHAR DESCRIPTION 75057       1           3           3       2016
10000   VARCHAR DESCRIPTION 75057       0           1           4       2016
10000   VARCHAR DESCRIPTION 75057       1           4           5       2016
10000   VARCHAR DESCRIPTION 75057       0           2           6       2016
10000   VARCHAR DESCRIPTION 75057       0           3           7       2016
10000   VARCHAR DESCRIPTION 75057       1           5           8       2016
10000   VARCHAR DESCRIPTION 75057       0           4           9       2016
10000   VARCHAR DESCRIPTION 75057       0           5           10      2016
10000   VARCHAR DESCRIPTION 75057       1           6           11      2016
10000   VARCHAR DESCRIPTION 75057       0           6           12      2016
10000   VARCHAR DESCRIPTION 75057       0           7           1       2017
10000   VARCHAR DESCRIPTION 75057       0           8           2       2017
10000   VARCHAR DESCRIPTION 75057       0           9           3       2017
10000   VARCHAR DESCRIPTION 75057       0           10          4       2017
10000   VARCHAR DESCRIPTION 75057       0           11          5       2017
10000   VARCHAR DESCRIPTION 75057       0           12          6       2017
10000   VARCHAR DESCRIPTION 75057       0           13          7       2017

What I need is the NEEDED_CALC column to reset the calc anytime the prior month's bin_flag changes.

So, if bin_flag = 1 and the prior month's bin_flag (identified through a lag function) is different, then the counter column in NEEDED_CALC will reset and begin again at 1.

Upvotes: 3

Views: 2530

Answers (2)

James
James

Reputation: 3411

I you're lucky enough to be working on a 12c database, you can do this using match recognize clause. In this case I've told the match engine to look for groups of 0 or more rows where the bin_flag is equal to bin_flag on the previous row, using the same ordering as in the question.

WITH MYTABLE (FK_ID,FK_NAME,PK_ID,BIN_FLAG,MONTH,YEAR)AS (
 <<<as from the question>>>> 
)
SELECT
 FK_ID 
,FK_NAME
,PK_ID
,BIN_FLAG
,seq NEEDED_CALC
,MONTH
,YEAR
FROM MYTABLE
MATCH_RECOGNIZE (
  ORDER BY year,month,fk_id,pk_id asc
  MEASURES
    count(*) +1 seq 
  ALL ROWS PER MATCH
  PATTERN (a*)
  DEFINE 
  a AS bin_flag = prev(bin_flag)
  )
ORDER BY FK_ID,PK_ID,YEAR,MONTH

Match Recognize can be a useful tool for this type of query where you're looking for patterns across groups of rows.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269993

This is a "gaps and islands" problem. The key is assigning a group identifier to adjacent rows with the same value. There are two easy methods to do this: one is based on lag() and the other is the difference of row_number()s.

The second is a wee bit simpler, only requiring one level of subquery:

select t.*,
       row_number() over (partition by fk_id, bin_flag, seqnum_ym - seqnum_bym
                          order by year, month
                         ) as needed_calc
from (select t.*,
             row_number() over (partition by fk_id order by year, month) as seqnum_ym,
             row_number() over (partition by fk_id order by bin_flag, year, month) as seqnum_bym
      from mytable t
     ) t;

The difference of row numbers is not hard to understand but it does require a conceptual leap. I would suggest that you run the subquery and see what the values are for seqnum_ym and seqnum_bym to understand how it works.

Upvotes: 4

Related Questions