Bob
Bob

Reputation: 227

SQL - Selecting rows based on date difference

Suppose we have below table:

Code Dt
c1 2020-10-01
c1 2020-10-05
c1 2020-10-09
c1 2020-10-10
c1 2020-10-20
c2 2020-10-07
c2 2020-10-09
c2 2020-10-15
c2 2020-10-16
c2 2020-10-20
c2 2020-10-24

Combination of Code and Dt is unique. Rows are sorted based on Code and Dt. Database is Oracle 12. For every code, I want to get list of its Dts that each Dt is grater than 7 days compared to previously selected Dt. Therefore, result should be:

Code Dt
c1 2020-10-01
c1 2020-10-09
c1 2020-10-20
c2 2020-10-07
c2 2020-10-15
c2 2020-10-24

I've tried self join based on row_number() to join every row with its previous row if date difference is grater than 7. But there is a challenge that each row should be compared with previously selected row and not its previous row in table. Any solutions? Thanks

Upvotes: 1

Views: 119

Answers (2)

Petr
Petr

Reputation: 540

You can solve this relatively easily using match_recognize

with data(code, dt) as (
  select 'c1',  to_date('2020-10-01', 'YYYY-MM-DD') from dual union all
  select 'c1',  to_date('2020-10-05', 'YYYY-MM-DD') from dual union all
  select 'c1',  to_date('2020-10-09', 'YYYY-MM-DD') from dual union all
  select 'c1',  to_date('2020-10-10', 'YYYY-MM-DD') from dual union all
  select 'c1',  to_date('2020-10-20', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-07', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-09', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-15', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-16', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-20', 'YYYY-MM-DD') from dual union all
  select 'c2',  to_date('2020-10-24', 'YYYY-MM-DD') from dual
)
select *
from data match_recognize (
  partition by code
  order by dt
  measures
   init.dt dt
  one row per match
  pattern (init less_than_7_days*)
  define
    less_than_7_days as less_than_7_days.dt - init.dt < 7
)

You just partition by code, order by dates and then get any row init and 0-many following rows (less_than_7_days*) that have date difference less than 7 (compared with init). You return 1 row for this whole match (init + following rows) that will contain date from init

Upvotes: 5

Serg
Serg

Reputation: 22811

Looks like a case for a hierahical query. Compute pairs and traverse the chain

with pairs(Code, Dt, dtnext) as (
  select t1.Code, t1.dt, Min(t2.dt)
  from tbl t1
  join tbl t2 on t1.code=t2.code and t2.dt >= t1.dt + INTERVAL '7' DAY
  group by t1.Code, t1.dt
),
h(Code, Dtn) as (
  select Code,  Min(dt)
  from tbl
  group by Code
  union all
  select h.Code, p.dtnext
  from h
  join pairs p on p.code=h.code and p.Dt= h.dtn
  )
select * 
from h
order by code, dtn

The fiddle

Returns

CODE    DTN
c1  01-OCT-20
c1  09-OCT-20
c1  20-OCT-20
c2  07-OCT-20
c2  15-OCT-20
c2  24-OCT-20

Upvotes: 1

Related Questions