Reputation: 40327
I'm trying to find an efficient way to merge/combine contiguous intervals defined by a category. For example, from this source table:
TYPE START END CATEGORY
---- ----- ---- --------
X 0 5 A
X 5 10 B
X 10 15 B
X 15 20 C
X 20 25 C
Y 0 6 A
Y 6 12 A
Y 12 20 B
Y 20 22 B
Y 24 27 B
Y 27 30 B
Y 35 40 C
Query output should be:
TYPE START END CATEGORY
---- ----- ---- --------
X 0 5 A
X 5 15 B
X 15 25 C
Y 0 12 A
Y 12 22 B
Y 24 30 B
Y 35 40 C
Note that for type Y there's a gap between 22 and 24, so a new combined interval is required.
Using SQL Server 2008R2. Data volume is in the thousands of records, so not too huge.
It feels like this is a job for recursive CTEs, but I'm having trouble wrapping my head around it. Any help would be appreciated!
Upvotes: 2
Views: 266
Reputation: 32737
This is commonly referred to as a "gaps and islands" problem. Here's my stab at it:
with cte as (
select * from (values
('X', 0, 5, 'A'),
('X', 5, 10, 'B'),
('X', 10, 15, 'B'),
('X', 15, 20, 'C'),
('X', 20, 25, 'C'),
('Y', 0, 6, 'A'),
('Y', 6, 12, 'A'),
('Y', 12, 20, 'B'),
('Y', 20, 22, 'B'),
('Y', 24, 27, 'B'),
('Y', 27, 30, 'B'),
('Y', 35, 40, 'C')
) as x( [Type], [Start], [End], [Category] )
), with_grp as (
select *, n.Number - dense_rank() over (partition by [Type], [Category] order by n.Number ) as grp
from cte
join dbadmin.dbo.Numbers as n
on n.Number >= cte.Start
and n.Number <= cte.[End]
)
select [Type], [Category], min([Start]), max([End])
from with_grp
group by [Type], [Category], [grp]
order by [Type], [Category], min([Start])
Where dbadmin.dbo.Numbers is a tally table (a table with a single column and the rows being the numbers 1-(some large-ish number)). To uncover the magic, do a raw select from the with_grp
cte. But that should do it.
Upvotes: 3