Arty155
Arty155

Reputation: 113

Oracle Sort of gaps and island query

Instead of writing long sentences and paragraphs let me show the data and what I want to achieve :

create table ssb_price (itm_no varchar2(10), price number, price_code varchar2(10), valid_from_dt date, valid_to_dt date);

insert into ssb_price values ('A001', 83, 'AB', '01-JAN-21', '05-JAN-21');
insert into ssb_price values ('A001', 83, 'AB', '06-JAN-21', '12-JAN-21');
insert into ssb_price values ('A001', 98, 'SPQ', '13-JAN-21', '17-JAN-21');
insert into ssb_price values ('A001', 83, 'AB', '19-JAN-21', '24-JAN-21');
insert into ssb_price values ('A001', 83, 'DE', '25-JAN-21', '30-JAN-21');
insert into ssb_price values ('A001', 83, 'DE', '31-JAN-21', '04-FEB-21');
insert into ssb_price values ('A001', 77, 'XY', '07-FEB-21', '12-FEB-21');
insert into ssb_price values ('A001', 77, 'XY', '15-FEB-21', '20-FEB-21');
insert into ssb_price values ('A001', 62, 'SD', '23-FEB-21', '26-FEB-21');
insert into ssb_price values ('A001', 59, 'SD', '26-FEB-21', '03-MAR-21');

For particular itm_no and price if the from and to dates are continuous then I should get that value. For price 77 there is a gap of 2 days (13th and 14th) between to date and the next from date so its not continuous. Lemme paste what the desired output should look like :(taken the snip from excel)

expected_output

I have asked this question clubbed with another post. But that post was old and haven't got any feedback so creating this. Please let me know if I should merge this post with the previous one.

Upvotes: 1

Views: 229

Answers (2)

MT0
MT0

Reputation: 167804

From Oracle 12, you can use MATCH_RECOGNIZE:

SELECT itm_no,
       price,
       price_code,
       valid_from_dt,
       valid_to_dt,
       MIN( valid_from_dt ) OVER ( PARTITION BY itm_no, mnum ) AS new_valid_from_dt,
       MAX( valid_to_dt ) OVER ( PARTITION BY itm_no, mnum ) AS new_valid_to_dt
FROM   ssb_price
MATCH_RECOGNIZE(
  PARTITION BY itm_no
  ORDER     BY valid_from_dt, valid_to_dt
  MEASURES
    MATCH_NUMBER() AS mnum
  ALL ROWS PER MATCH
  PATTERN ( start_range continued_range* )
  DEFINE
    continued_range AS (
      valid_from_dt = PREV( valid_to_dt ) + 1
      AND price = PREV( price )
    )
)

and, from Oracle 10g, you can use the MODEL clause:

SELECT itm_no,
       price,
       price_code,
       valid_from_dt,
       valid_to_dt,
       mn,
       MIN( valid_from_dt ) OVER ( PARTITION BY itm_no, mn ) AS new_valid_from_dt,
       MAX( valid_to_dt ) OVER ( PARTITION BY itm_no, mn ) AS new_valid_to_dt
FROM   (
  SELECT *
  FROM   (
    SELECT s.*,
           ROW_NUMBER() OVER ( PARTITION BY itm_no ORDER BY valid_from_dt ) AS rn
    FROM   ssb_price s
  )
  MODEL
    PARTITION BY ( itm_no )
    DIMENSION BY ( rn )
    MEASURES ( price, price_code, valid_from_dt, valid_to_dt, 1 AS mn )
    RULES (
      mn[rn>1] = mn[cv(rn)-1]
                 +
                 CASE
                 WHEN valid_from_dt[cv(rn)] = valid_to_dt[cv(rn)-1] + 1
                 AND  price[cv(rn)] = price[cv(rn) - 1]
                 THEN 0
                 ELSE 1
                 END
    )
)

Which, for the sample data:

create table ssb_price (itm_no, price, price_code, valid_from_dt, valid_to_dt) AS
SELECT 'A001', 83, 'AB', DATE '2021-01-01', DATE '2021-01-05' FROM DUAL UNION ALL
SELECT 'A001', 83, 'AB', DATE '2021-01-06', DATE '2021-01-12' FROM DUAL UNION ALL
SELECT 'A001', 98, 'SPQ', DATE '2021-01-13', DATE '2021-01-17' FROM DUAL UNION ALL
SELECT 'A001', 83, 'AB', DATE '2021-01-19', DATE '2021-01-24' FROM DUAL UNION ALL
SELECT 'A001', 83, 'DE', DATE '2021-01-25', DATE '2021-01-30' FROM DUAL UNION ALL
SELECT 'A001', 83, 'DE', DATE '2021-01-31', DATE '2021-02-04' FROM DUAL UNION ALL
SELECT 'A001', 77, 'XY', DATE '2021-02-07', DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 'A001', 77, 'XY', DATE '2021-02-15', DATE '2021-02-20' FROM DUAL UNION ALL
SELECT 'A001', 62, 'SD', DATE '2021-02-23', DATE '2021-02-26' FROM DUAL UNION ALL
SELECT 'A001', 59, 'SD', DATE '2021-02-26', DATE '2021-03-03' FROM DUAL;

Outputs:

ITM_NO PRICE PRICE_CODE VALID_FROM_DT VALID_TO_DT NEW_VALID_FROM_DT NEW_VALID_TO_DT
A001 83 AB 2021-01-01 00:00:00 2021-01-05 00:00:00 2021-01-01 00:00:00 2021-01-12 00:00:00
A001 83 AB 2021-01-06 00:00:00 2021-01-12 00:00:00 2021-01-01 00:00:00 2021-01-12 00:00:00
A001 98 SPQ 2021-01-13 00:00:00 2021-01-17 00:00:00 2021-01-13 00:00:00 2021-01-17 00:00:00
A001 83 AB 2021-01-19 00:00:00 2021-01-24 00:00:00 2021-01-19 00:00:00 2021-02-04 00:00:00
A001 83 DE 2021-01-25 00:00:00 2021-01-30 00:00:00 2021-01-19 00:00:00 2021-02-04 00:00:00
A001 83 DE 2021-01-31 00:00:00 2021-02-04 00:00:00 2021-01-19 00:00:00 2021-02-04 00:00:00
A001 77 XY 2021-02-07 00:00:00 2021-02-12 00:00:00 2021-02-07 00:00:00 2021-02-12 00:00:00
A001 77 XY 2021-02-15 00:00:00 2021-02-20 00:00:00 2021-02-15 00:00:00 2021-02-20 00:00:00
A001 62 SD 2021-02-23 00:00:00 2021-02-26 00:00:00 2021-02-23 00:00:00 2021-02-26 00:00:00
A001 59 SD 2021-02-26 00:00:00 2021-03-03 00:00:00 2021-02-26 00:00:00 2021-03-03 00:00:00

db<>fiddle here

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269493

This is basically a gaps-and-islands problem. But instead of aggregating to reduce the number of rows, you want to use window functions at the last step.

In your data, the time frames neatly tile. That suggests using lag() and a cumulative sum to define the groups:

select p.*,
       min(valid_from_dt) over (partition by itm_no, price, price_code, grp) as new_valid_from_dt,
       max(valid_to_dt) over (partition by itm_no, price, price_code, grp) as new_valid_to_dt
from (select p.*,
             sum(case when valid_from_dt = prev_valid_to_dt + interval '1' day then 0 else 1 end) over 
                   (partition by itm_no, price, price_code order by valid_from_dt) as grp
      from (select p.*,
                   lag(valid_to_dt) over (partition by itm_no, price, price_code order by valid_from_dt) as prev_valid_to_dt
            from ssb_price p 
           ) p
     ) p
order by itm_no, valid_from_dt;

Here is a db<>fiddle.

Upvotes: 2

Related Questions