Reputation: 113
How to merge records in SQL when the dates are overlapping?
Name | Start date | End Date |
---|---|---|
Bob | 1-mar-2023 | 5-may-2023 |
Bob | 1-may-2023 | 25-may-2023 |
I want to merge records where there is an overlap.
Expected output
Name | start date | end date |
---|---|---|
Bob | 1-mar-2023 | 25-may-2023 |
Upvotes: 1
Views: 70
Reputation: 59175
Adapting here https://dba.stackexchange.com/questions/316064/aggregate-overlapping-date-intervals to Snowflake (thanks Serg).
The trick is really interesting: Have all start dates and end dates in one column, add 1 when you see a start date, subtract 1 when you see an end date. That running counts shows 0 every time there's an interval closed, covering all openings and closings within it - with cte3
grouping starts and ends:
with data(account_id, start_date, end_date) as (
select * from values
('A','2019-06-20','2019-06-29'),
('A','2019-06-25','2019-07-25'),
('A','2019-07-20','2019-08-26'),
('A','2019-12-25','2020-01-25'),
('A','2021-04-27','2021-07-27'),
('A','2021-06-25','2021-07-14'),
('A','2021-07-10','2021-08-14'),
('A','2021-09-10','2021-11-12'),
('B','2019-07-13','2020-07-14'),
('B','2019-06-25','2019-08-26')
),
cte1 AS (
SELECT account_id, start_date the_date, 1 weight
FROM data
UNION ALL
SELECT account_id, end_date, -1
FROM data
),
cte2 AS (
SELECT account_id,
the_date,
SUM(weight) OVER (PARTITION BY account_id
ORDER BY the_date, weight DESC) weight
FROM cte1
),
cte3 AS (
SELECT account_id,
the_date,
SUM(CASE WHEN weight = 0
THEN 1
ELSE 0
END) OVER (PARTITION BY account_id
ORDER BY the_date DESC) group_no
FROM cte2
)
SELECT account_id,
MIN(the_date) start_date,
MAX(the_date) end_date
FROM cte3
GROUP BY account_id, group_no
ORDER BY 1,2
Upvotes: 0