Reputation: 2434
My title is awful, because I am not sure how to describe the challenge. I would love an edit if someone can think of a more descriptive title. Hopefully my input/desired output will help explain. Here is some sample input data:
create table #input (
num varchar(10),
code varchar(10),
event_date date
)
insert into #input (num, code, event_date)
values('123456', 'Active', '2007-09-10'),
('123456', 'Active', '2010-09-15'),
('123456', 'Active', '2010-09-24'),
('123456', 'Inactive', '2018-09-17'),
('123456', 'Inactive', '2019-01-01'),
('123456', 'Active', '2019-02-08')
select *
from #input
order by event_date
I want to tag each record for each group of num + code with the same number. However, I want the time periods to stay separate. Here is the desired result:
create table #result (
num varchar(10),
code varchar(10),
event_date date,
tag int
)
insert into #result (num, code, event_date, tag)
values('123456', 'Active', '2007-09-10', 1),
('123456', 'Active', '2010-09-15', 1),
('123456', 'Active', '2010-09-24', 1),
('123456', 'Inactive', '2018-09-17', 2),
('123456', 'Inactive', '2019-01-01', 2),
('123456', 'Active', '2019-02-08', 3)
select *
from #result
order by event_date
Obviously normal window partitions like this...
select *, row_number() over(partition by num, code order by event_date) rn
from #input
order by event_date
...don't work, because there is no field on which to partition that would split the two "Active" groups (two groups, because they happen during two time frames). How would I reach my desired result? I have a hunch that a series of lag()
and lead()
functions might work, but I couldn't get anywhere meaningful.
Alternatively, how would I achieve the results so the categories overlap by one?
create table #result_new (
num varchar(10),
code varchar(10),
event_date date,
tag int
)
insert into #result (num, code, event_date, tag)
values('123456', 'Active', '2007-09-10', 1),
('123456', 'Active', '2010-09-15', 1),
('123456', 'Active', '2010-09-24', 1),
('123456', 'Inactive', '2018-09-17', 1),
('123456', 'Inactive', '2019-01-01', 2),
('123456', 'Active', '2019-02-08', 2)
select *
from #result_new
order by event_date
Upvotes: 3
Views: 35
Reputation: 95830
LAG
gets your half way there, but not the whole way. You can use LAG
to check the value of the last row, and create (what I have called) a switch. You can then use a SUM
window function, with a ROWs BETWEEN
clause to get the value for tag
:
WITH CTE AS(
SELECT num,
code,
event_date,
CASE WHEN code = LAG(code) OVER (PARTITION BY num ORDER BY event_date) THEN 0 ELSE 1 END AS Switch
FROM #input)
SELECT num,
code,
event_date,
SUM(Switch) OVER (PARTITION BY num ORDER BY event_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS tag
FROM CTE;
Upvotes: 3