Reputation: 1507
I am in need of a database select query solution for transforming a series of breakdown details into a summarized version of it, possibly in a view. Consider the table below. It has a composite primary key of (PK_1
, PK_2
, PK_3
, and SEQUENCE_NO
).
PK_1 PK_2 PK_3 SEQUENCE_NO STATUS_CODE
======== ==== ==== =========== ===========
20200421 A 1 1 Y
20200421 A 1 2 Y
20200421 A 1 3 Y
20200421 A 1 4 N
20200421 A 1 5 Y
20200421 A 1 6 Y
20200421 A 2 7 Y
20200421 A 2 8 Y
20200421 B 3 9 Y
20200421 B 3 10 Y
20200421 B 3 11 Y
20200422 B 3 11 Y
Only including all records with the STATUS_CODE
of "Y", how can I present the records in such a way consecutive records, respective to their composite primary keys, form ranges of value (indicated by SEQUENCE_FROM
and SEQUENCE_TO
, see below) while showing a gap that may indicate of a missing row, or a row with a STATUS_CODE
with a value other than "Y"?
PK_1 PK_2 PK_3 SEQUENCE_FROM SEQUENCE_TO
======== ==== ==== ============= ===========
20200421 A 1 1 3
20200421 A 1 5 6
20200421 A 2 7 8
20200421 B 3 9 11
20200422 B 3 11 11
I used MIN
and MAX
but obviously it wouldn't accomodate showing the gap between ranges.
Upvotes: 1
Views: 112
Reputation:
Here is one way to do this, using the match_recognize
clause (Oracle 12.1 and higher). If I understand correctly, a "gap" exists when one or more rows with status code 'N' exist - and only when such rows are preceded and followed by 'Y' rows for the same combination of pk_1, pk_2, pk_3
. So, in your sample data, there is only one such gap. You didn't explain exactly what you want to show (I assume a number, but you didn't explain how to compute it). I interpreted it to mean the difference between the starting value of the "current" sequence and the ending value of the "preceding" sequence.
with
yourtable (pk_1, pk_2, pk_3, sequence_no, status_code) as (
select 20200421, 'A', 1, 1, 'Y' from dual union all
select 20200421, 'A', 1, 2, 'Y' from dual union all
select 20200421, 'A', 1, 3, 'Y' from dual union all
select 20200421, 'A', 1, 4, 'N' from dual union all
select 20200421, 'A', 1, 5, 'Y' from dual union all
select 20200421, 'A', 1, 6, 'Y' from dual union all
select 20200421, 'A', 2, 7, 'Y' from dual union all
select 20200421, 'A', 2, 8, 'Y' from dual union all
select 20200421, 'B', 3, 9, 'Y' from dual union all
select 20200421, 'B', 3, 10, 'Y' from dual union all
select 20200421, 'B', 3, 11, 'Y' from dual union all
select 20200422, 'B', 3, 11, 'Y' from dual
)
select mr.*
, sequence_from - lag(sequence_to) over (partition by pk_1, pk_2, pk_3
order by sequence_from) as gap
from yourtable
match_recognize(
partition by pk_1, pk_2, pk_3
order by sequence_no
measures first(sequence_no) as sequence_from
, last (sequence_no) as sequence_to
pattern ( Y+ )
define Y as status_code = 'Y'
) mr
;
Output:
PK_1 PK_2 PK_3 SEQUENCE_FROM SEQUENCE_TO GAP
---------- ---- ---------- ------------- ----------- ----------
20200421 A 1 1 3
20200421 A 1 5 6 2
20200421 A 2 7 8
20200421 B 3 9 11
20200422 B 3 11 11
Upvotes: 1
Reputation: 7503
Try this, should work as expected. you can take a look at fiddle.
select
PK_1,
PK_2,
PK_3,
min(sequence_no) as SEQUENCE_FROM,
max(sequence_no) as SEQUENCE_TO
from
(
select
*,
sequence_no - row_number() over (partition by STATUS_CODE, PK_3 order by SEQUENCE_NO) as rnk
from myTable
where STATUS_CODE = 'Y'
) t
group by
PK_1,
PK_2,
PK_3,
rnk
Output:
pk_1 pk_2 pk_3 sequence_from sequence_to
---------------------------------------------
20200421 A 1 1 3
20200421 A 1 5 6
20200421 A 2 7 8
20200421 B 3 9 11
20200422 B 3 11 11
Upvotes: 1
Reputation: 520908
This is a gaps and islands problem. Here is one way to solve with ROW_NUMBER
, using the difference in row numbers method:
WITH cte AS (
SELECT t.*, SEQUENCE_NO -
ROW_NUMBER() OVER (PARTITION BY PK_1, PK_2, PK_3 ORDER BY SEQUENCE_NO) AS diff
FROM yourTable t
WHERE STATUS_CODE = 'Y'
)
SELECT
PK_1,
PK_2,
PK_3,
MIN(SEQUENCE_NO) AS SEQUENCE_FROM,
MAX(SEQUENCE_NO) AS SEQUENCE_TO
FROM cte
GROUP BY
PK_1,
PK_2,
PK_3,
(rn1 - rn2)
ORDER BY
PK_1,
PK_2,
PK_3;
The crux of the logic being used here is that we form an on-the-fly group number for each island, within each PK_1
, PK_2
, PK_3
partition, by taking the difference between the sequence number and a ROW_NUMBER
. This difference is guaranteed to always be unique for each island.
Upvotes: 2