Reputation: 719
Using Oracle SQL with select only privileges, I need to provide ROW_NUMBER
output based on conditions. using a cursor or a loop this would be easy rather simple, but presently I must perform this task using only SQL.
I've been tinkering with a row_number()
over
clause which I believe is the right way to go, but I'm stuck now.
My current code - or at least a proxy of it:
WITH MYTABLE (FK_ID,FK_NAME,PK_ID,BIN_FLAG,MONTH,YEAR)AS (
SELECT 10000,'VARCHAR DESCRIPTION',75057,1,1,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,2,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,3,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,4,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,5,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,6,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,7,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,8,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,9,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,10,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,1,11,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,12,2016 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,1,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,2,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,3,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,4,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,5,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,6,2017 FROM DUAL
UNION SELECT 10000,'VARCHAR DESCRIPTION',75057,0,7,2017 FROM DUAL
)
SELECT
FK_ID
, FK_NAME
, PK_ID
, BIN_FLAG
, ROW_NUMBER() OVER (PARTITION BY FK_ID,PK_ID,BIN_FLAG ORDER BY YEAR,MONTH,FK_ID,PK_ID ASC) NEEDED_CALC
, MONTH
, YEAR
FROM MYTABLE
ORDER BY FK_ID,PK_ID,YEAR,MONTH
This will return a dataset that looks something like this:
FK_ID FK_NAME PK_ID BIN_FLAG NEEDED_CALC MONTH YEAR
10000 VARCHAR DESCRIPTION 75057 1 1 1 2016
10000 VARCHAR DESCRIPTION 75057 1 2 2 2016
10000 VARCHAR DESCRIPTION 75057 1 3 3 2016
10000 VARCHAR DESCRIPTION 75057 0 1 4 2016
10000 VARCHAR DESCRIPTION 75057 1 4 5 2016
10000 VARCHAR DESCRIPTION 75057 0 2 6 2016
10000 VARCHAR DESCRIPTION 75057 0 3 7 2016
10000 VARCHAR DESCRIPTION 75057 1 5 8 2016
10000 VARCHAR DESCRIPTION 75057 0 4 9 2016
10000 VARCHAR DESCRIPTION 75057 0 5 10 2016
10000 VARCHAR DESCRIPTION 75057 1 6 11 2016
10000 VARCHAR DESCRIPTION 75057 0 6 12 2016
10000 VARCHAR DESCRIPTION 75057 0 7 1 2017
10000 VARCHAR DESCRIPTION 75057 0 8 2 2017
10000 VARCHAR DESCRIPTION 75057 0 9 3 2017
10000 VARCHAR DESCRIPTION 75057 0 10 4 2017
10000 VARCHAR DESCRIPTION 75057 0 11 5 2017
10000 VARCHAR DESCRIPTION 75057 0 12 6 2017
10000 VARCHAR DESCRIPTION 75057 0 13 7 2017
What I need is the NEEDED_CALC
column to reset the calc anytime the prior month's bin_flag
changes.
So, if bin_flag = 1
and the prior month's bin_flag
(identified through a lag
function) is different, then the counter column in NEEDED_CALC
will reset and begin again at 1
.
Upvotes: 3
Views: 2530
Reputation: 3411
I you're lucky enough to be working on a 12c database, you can do this using match recognize clause. In this case I've told the match engine to look for groups of 0 or more rows where the bin_flag is equal to bin_flag on the previous row, using the same ordering as in the question.
WITH MYTABLE (FK_ID,FK_NAME,PK_ID,BIN_FLAG,MONTH,YEAR)AS (
<<<as from the question>>>>
)
SELECT
FK_ID
,FK_NAME
,PK_ID
,BIN_FLAG
,seq NEEDED_CALC
,MONTH
,YEAR
FROM MYTABLE
MATCH_RECOGNIZE (
ORDER BY year,month,fk_id,pk_id asc
MEASURES
count(*) +1 seq
ALL ROWS PER MATCH
PATTERN (a*)
DEFINE
a AS bin_flag = prev(bin_flag)
)
ORDER BY FK_ID,PK_ID,YEAR,MONTH
Match Recognize can be a useful tool for this type of query where you're looking for patterns across groups of rows.
Upvotes: 1
Reputation: 1269993
This is a "gaps and islands" problem. The key is assigning a group identifier to adjacent rows with the same value. There are two easy methods to do this: one is based on lag()
and the other is the difference of row_number()
s.
The second is a wee bit simpler, only requiring one level of subquery:
select t.*,
row_number() over (partition by fk_id, bin_flag, seqnum_ym - seqnum_bym
order by year, month
) as needed_calc
from (select t.*,
row_number() over (partition by fk_id order by year, month) as seqnum_ym,
row_number() over (partition by fk_id order by bin_flag, year, month) as seqnum_bym
from mytable t
) t;
The difference of row numbers is not hard to understand but it does require a conceptual leap. I would suggest that you run the subquery and see what the values are for seqnum_ym
and seqnum_bym
to understand how it works.
Upvotes: 4