Reputation: 50
I need to identify overlapping rows. Below is the table and I need the column overlapping:
| identifier | status | startDate | endDate | pID | OVERLAPPING |
|------------ |---------- |------------ |------------ |----- |------------- |
| A | Approved | 2020-10-01 | 2020-10-07 | x1 | Yes |
| A | Approved | 2020-10-01 | 2020-10-07 | x2 | No |
| A | Approved | 2020-10-01 | 2020-10-07 | x3 | Yes |
| A | Approved | 2020-10-01 | 2020-10-07 | x4 | No |
| B | Approved | 2020-10-10 | 2020-10-12 | x2 | No |
| B | Approved | 2020-10-10 | 2020-10-12 | x5 | No |
| C | Rejected | 2020-10-05 | 2020-10-06 | x3 | No |
| C | Rejected | 2020-10-05 | 2020-10-06 | x7 | No |
| C | Rejected | 2020-10-05 | 2020-10-06 | x8 | No |
| C | Rejected | 2020-10-05 | 2020-10-06 | x9 | No |
| D | Approved | 2020-10-01 | 2020-10-07 | x5 | No |
| D | Approved | 2020-10-01 | 2020-10-07 | x1 | Yes |
| E | Approved | 2020-10-03 | 2020-10-04 | x3 | Yes |
I am able to get this if start date and end dates are same by creating a column which is combination of start date, end date and pID and then counting this field for all rows. If it is more than 1 then I am marking overlapping. But this is not covering scenario of x3 where start date and date are not same but still overlapping for some duration.
Upvotes: 1
Views: 252
Reputation: 6749
Try this one (I use a Boolean with true and false, though ...
WITH
input(identifier,status,startDate,endDate,pID,OVERLAPPING) AS (
SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x1','Yes'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x2','No'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x3','Yes'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x4','No'
UNION ALL SELECT 'B','Approved',DATE '2020-10-10',DATE '2020-10-12','x2','No'
UNION ALL SELECT 'B','Approved',DATE '2020-10-10',DATE '2020-10-12','x5','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x3','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x7','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x8','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x9','No'
UNION ALL SELECT 'D','Approved',DATE '2020-10-01',DATE '2020-10-07','x5','No'
UNION ALL SELECT 'D','Approved',DATE '2020-10-01',DATE '2020-10-07','x1','Yes'
UNION ALL SELECT 'E','Approved',DATE '2020-10-03',DATE '2020-10-04','x3','Yes'
)
SELECT
*
, status = 'Approved'
AND (
COALESCE(LAG(enddate) OVER(w) ,'0001-01-01')> startdate
OR COALESCE(LEAD(startdate) OVER(w) ,'9999-12-31')< enddate
) AS overlap
FROM input
WINDOW w AS (PARTITION BY pid ORDER BY startdate)
ORDER BY
identifier
, startdate
;
-- out identifier|status |startDate |endDate |pID|OVERLAPPING|overlap
-- out A |Approved|2020-10-01|2020-10-07|x3 |Yes |true
-- out A |Approved|2020-10-01|2020-10-07|x2 |No |false
-- out A |Approved|2020-10-01|2020-10-07|x4 |No |false
-- out A |Approved|2020-10-01|2020-10-07|x1 |Yes |true
-- out B |Approved|2020-10-10|2020-10-12|x5 |No |false
-- out B |Approved|2020-10-10|2020-10-12|x2 |No |false
-- out C |Rejected|2020-10-05|2020-10-06|x3 |No |false
-- out C |Rejected|2020-10-05|2020-10-06|x8 |No |false
-- out C |Rejected|2020-10-05|2020-10-06|x7 |No |false
-- out C |Rejected|2020-10-05|2020-10-06|x9 |No |false
-- out D |Approved|2020-10-01|2020-10-07|x1 |Yes |true
-- out D |Approved|2020-10-01|2020-10-07|x5 |No |false
-- out E |Approved|2020-10-03|2020-10-04|x3 |Yes |true
Upvotes: 0
Reputation: 60513
Based on your narrative this seems to match your logic:
select *,
case
when status = 'Rejected' then 'No'
-- previous row overlaps
when startDate < -- maybe <=
max(case when status <> 'Rejected' then endDate end)
over (partition by pID
order by startDate, endDate desc
rows between unbounded preceding and 1 preceding)
-- next row overlaps
or endDate > -- maybe >=
min(case when status <> 'Rejected' then startDate end)
over (partition by pID
order by startDate, endDate desc
rows between 1 following and unbounded following)
then 'Yes'
else 'No'
end
from tab
Upvotes: 0
Reputation: 86798
Something like this?
(I use a correlated-sub-query to avoid a join returning multiple rows if one row overlaps multiple others.)
SELECT
*,
(
SELECT 1
FROM yourTable AS lookup
WHERE lookup.identifier <> yourTable.identifier -- Don't check overlaps with itself
AND lookup.pID = yourTable.pID
AND lookup.startDate <= yourTable.endDate
AND lookup.endDate >= yourTable.startDate
LIMIT 1
)
FROM
yourTable
Take care with the >=
and <=
, depending on whether you're using inclusive (I hope not) or exclusive (I hope so) endDate
.
Upvotes: 2