Reputation: 1
I have data as below. Need to find the minimum date for a person is active on for the latest continuous period
SCENARIO 1
NAME | STARTDATE | END DATE
--------------------------------------
name | 01-JAN-2016 | 31-DEC-2017 name | 01-JAN-2017 | 31-OCT-2018 name | 01-JAN-2018 | 31-DEC-2019 name | 01-JAN-2019 | 31-DEC-2020
I need output as:
NAME | STARTDATE | END DATE
--------------------------------------
MIKE | 01-01-2018 | 31-12-2020
Scenario 2:-
NAME | STARTDATE | END DATE
--------------------------------------
name | 01-01-2016 | 31-DEC-2017 name | 01-01-2017 | 31-OCT-2018 name | 01-01-2018 | 31-DEC-2019 name | 01-01-2019 | 31-DEC-2020 I need output as:
NAME | STARTDATE | END DATE
--------------------------------------
name | 01-01-2019 | 31-12-2020
So basically output is MIN and MAX for LATEST continuous period for the person.
Upvotes: 0
Views: 787
Reputation: 1269763
Hmmm . . . I think you can do this with the following logic:
select name, max(startdate), max_enddate
from (select t.*,
lag(enddate) over (partition by name order by startdate) as prev_enddate,
max(enddate) over (partition by name) as max_enddate
from t
) t
where startdate <= prev_enddate + interval '1 day'
group by name, max_enddate;
The subquery simply gets the previous end date and the overrall end date.
The outer query does two things:
Here is a db<>fiddle
Upvotes: 1
Reputation: 222462
This is a gaps-and-island problem. Here is one approach that uses lag()
and a cumulative sum()
to build the groups, and then filter on the first group per name
:
select name, min(startdate) startdate, max(enddate) enddate
from (
select t.*,
sum(case when startdate = lag_enddate + interval '1 day' or lag_enddate is null then 0 else 1 end) over(partition by name order by startdate) grp
from (
select t.*,
lag(enddate) over(partition by name order by startdate) lag_enddate
from mytable t
) t
) t
where grp = 0
group by name
Upvotes: 0