Akash Kumar
Akash Kumar

Reputation: 50

How to identify overlapping rows

I need to identify overlapping rows. Below is the table and I need the column overlapping:

| identifier    | status    | startDate     | endDate       | pID   | OVERLAPPING   |
|------------   |---------- |------------   |------------   |-----  |-------------  |
| A             | Approved  | 2020-10-01    | 2020-10-07    | x1    | Yes           |
| A             | Approved  | 2020-10-01    | 2020-10-07    | x2    | No            |
| A             | Approved  | 2020-10-01    | 2020-10-07    | x3    | Yes           |
| A             | Approved  | 2020-10-01    | 2020-10-07    | x4    | No            |
| B             | Approved  | 2020-10-10    | 2020-10-12    | x2    | No            |
| B             | Approved  | 2020-10-10    | 2020-10-12    | x5    | No            |
| C             | Rejected  | 2020-10-05    | 2020-10-06    | x3    | No            |
| C             | Rejected  | 2020-10-05    | 2020-10-06    | x7    | No            |
| C             | Rejected  | 2020-10-05    | 2020-10-06    | x8    | No            |
| C             | Rejected  | 2020-10-05    | 2020-10-06    | x9    | No            |
| D             | Approved  | 2020-10-01    | 2020-10-07    | x5    | No            |
| D             | Approved  | 2020-10-01    | 2020-10-07    | x1    | Yes           |
| E             | Approved  | 2020-10-03    | 2020-10-04    | x3    | Yes           |
  1. Since C is in rejected status anything with C identifier will not be counted for finding overlapped rows. So even tough x3 in C is overlapping with x3 in A, it will not be counted as overlapped.
  2. x1 is overlapping in A and D. So both rows will be counted as overlapped.
  3. x3 in A is overlapping with x3 in E as the start date and end date of E is within the period of A.
  4. x5 in B and D are not overlapping as the dates for both B and D are not overlapping.

I am able to get this if start date and end dates are same by creating a column which is combination of start date, end date and pID and then counting this field for all rows. If it is more than 1 then I am marking overlapping. But this is not covering scenario of x3 where start date and date are not same but still overlapping for some duration.

Upvotes: 1

Views: 252

Answers (3)

marcothesane
marcothesane

Reputation: 6749

Try this one (I use a Boolean with true and false, though ...

WITH
input(identifier,status,startDate,endDate,pID,OVERLAPPING) AS (
          SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x1','Yes'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x2','No'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x3','Yes'
UNION ALL SELECT 'A','Approved',DATE '2020-10-01',DATE '2020-10-07','x4','No'
UNION ALL SELECT 'B','Approved',DATE '2020-10-10',DATE '2020-10-12','x2','No'
UNION ALL SELECT 'B','Approved',DATE '2020-10-10',DATE '2020-10-12','x5','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x3','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x7','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x8','No'
UNION ALL SELECT 'C','Rejected',DATE '2020-10-05',DATE '2020-10-06','x9','No'
UNION ALL SELECT 'D','Approved',DATE '2020-10-01',DATE '2020-10-07','x5','No'
UNION ALL SELECT 'D','Approved',DATE '2020-10-01',DATE '2020-10-07','x1','Yes'
UNION ALL SELECT 'E','Approved',DATE '2020-10-03',DATE '2020-10-04','x3','Yes'
)
SELECT 
  *
, status = 'Approved'
  AND (
     COALESCE(LAG(enddate)    OVER(w) ,'0001-01-01')> startdate
  OR COALESCE(LEAD(startdate) OVER(w) ,'9999-12-31')< enddate
  ) AS overlap
FROM input
WINDOW w AS (PARTITION BY pid ORDER BY startdate)
ORDER BY
  identifier
, startdate
;
-- out identifier|status  |startDate |endDate   |pID|OVERLAPPING|overlap
-- out A         |Approved|2020-10-01|2020-10-07|x3 |Yes        |true
-- out A         |Approved|2020-10-01|2020-10-07|x2 |No         |false
-- out A         |Approved|2020-10-01|2020-10-07|x4 |No         |false
-- out A         |Approved|2020-10-01|2020-10-07|x1 |Yes        |true
-- out B         |Approved|2020-10-10|2020-10-12|x5 |No         |false
-- out B         |Approved|2020-10-10|2020-10-12|x2 |No         |false
-- out C         |Rejected|2020-10-05|2020-10-06|x3 |No         |false
-- out C         |Rejected|2020-10-05|2020-10-06|x8 |No         |false
-- out C         |Rejected|2020-10-05|2020-10-06|x7 |No         |false
-- out C         |Rejected|2020-10-05|2020-10-06|x9 |No         |false
-- out D         |Approved|2020-10-01|2020-10-07|x1 |Yes        |true
-- out D         |Approved|2020-10-01|2020-10-07|x5 |No         |false
-- out E         |Approved|2020-10-03|2020-10-04|x3 |Yes        |true

Upvotes: 0

dnoeth
dnoeth

Reputation: 60513

Based on your narrative this seems to match your logic:

select *,
   case 
     when status = 'Rejected' then 'No'
       -- previous row overlaps
     when startDate < -- maybe <=
          max(case when status <> 'Rejected' then endDate end)
          over (partition by pID
                order by startDate, endDate desc
                rows between unbounded preceding and 1 preceding)
       -- next row overlaps
       or endDate > -- maybe >=
          min(case when status <> 'Rejected' then startDate end)
          over (partition by pID
                order by startDate, endDate desc
                rows between 1 following and unbounded following)
     then 'Yes'
     else 'No'
   end 
from tab

Upvotes: 0

MatBailie
MatBailie

Reputation: 86798

Something like this?

(I use a correlated-sub-query to avoid a join returning multiple rows if one row overlaps multiple others.)

SELECT
  *,
  (
    SELECT 1
      FROM yourTable AS lookup
     WHERE lookup.identifier <> yourTable.identifier  -- Don't check overlaps with itself
       AND lookup.pID         = yourTable.pID
       AND lookup.startDate  <= yourTable.endDate
       AND lookup.endDate    >= yourTable.startDate
     LIMIT 1
  )
FROM
  yourTable

Take care with the >= and <=, depending on whether you're using inclusive (I hope not) or exclusive (I hope so) endDate.

Upvotes: 2

Related Questions