Reputation: 51
I have this dataset where I have a time-series with in the YYYYMM format. I have two columns which basically as true/false flags. I would like to add two extra columns based on these true/false flags that retrieves the current range:
Default Cure
201301 0 NULL
201302 0 NULL
201303 0 NULL
201304 1 NULL
201305 1 NULL
201306 1 NULL
201307 1 NULL
201308 NULL 0
201309 NULL 0
201310 NULL 1
201311 0 NULL
201312 0 NULL
201401 0 NULL
201402 0 NULL
201403 1 NULL
201404 1 NULL
201405 0 NULL
201406 0 NULL
201407 NULL 1
201408 NULL 0
201409 NULL 0
201410 NULL 0
201411 NULL 0
201412 NULL 0
I this dataset you can see the Default column being set to 1 for the periods 201304, 05, 06, 07 and the Cure column is set to 1 in the period 201310.
This basically means the Default timeseries is valid from period 201304 until period 201310. Ultimately I would like to generate the following set:
Default Cure DefaultPeriod CurePeriod
201301 0 NULL NULL NULL
201302 0 NULL NULL NULL
201303 0 NULL NULL NULL
201304 1 NULL 201304 201310
201305 1 NULL 201304 201310
201306 1 NULL 201304 201310
201307 1 NULL 201304 201310
201308 NULL 0 201304 201310
201309 NULL 0 201304 201310
201310 NULL 1 201304 201310
201311 0 NULL NULL NULL
201312 0 NULL NULL NULL
201401 0 NULL NULL NULL
201402 0 NULL NULL NULL
201403 1 NULL 201403 201407
201404 1 NULL 201403 201407
201405 0 NULL 201403 201407
201406 0 NULL 201403 201407
201407 NULL 1 201403 201407
201408 NULL 0 NULL NULL
201409 NULL 0 NULL NULL
201410 NULL 0 NULL NULL
201411 NULL 0 NULL NULL
201412 NULL 0 NULL NULL
Multiple ranges can occur but they cannot overlap. How would I go about achieving this. I have tried to do all sorts of min/max period join on the same table, but I can't seem to find a working solution.
Upvotes: 0
Views: 66
Reputation: 2507
This was a real thinker :)
Basically I am dividing up the data on the "Cure" dates (c1), numbering each group(c2), then looking for mins and maxes within each group (c3 C4), then applying some logic to filter out the rows that come before the min.
declare @t table
(
[Month] varchar(6),
[Default] bit,
[Cure] bit
);
insert into @t values('201301', 0, NULL);
insert into @t values('201302', 0, NULL);
insert into @t values('201303', 0, NULL);
insert into @t values('201304', 1, NULL);
insert into @t values('201305', 1, NULL);
insert into @t values('201306', 1, NULL);
insert into @t values('201307', 1, NULL);
insert into @t values('201308', NULL, 0);
insert into @t values('201309', NULL, 0);
insert into @t values('201310', NULL, 1);
insert into @t values('201311', 0, NULL);
insert into @t values('201312', 0, NULL);
insert into @t values('201401', 0, NULL);
insert into @t values('201402', 0, NULL);
insert into @t values('201403', 1, NULL);
insert into @t values('201404', 1, NULL);
insert into @t values('201405', 0, NULL);
insert into @t values('201406', 0, NULL);
insert into @t values('201407', NULL, 1);
insert into @t values('201408', NULL, 0);
insert into @t values('201409', NULL, 0);
insert into @t values('201410', NULL, 0);
insert into @t values('201411', NULL, 0);
insert into @t values('201412', NULL, 0);
with c1 as
(
select min([Month]) [Month], 1 x from @t
union all
select [Month],1 from @t
where Cure = 1
),
c2 as
(
select t.[Month],[Default],[Cure],
sum(x) over (order by t.[Month] rows between unbounded preceding and 1 preceding) grp
from @t t
left outer join c1 on c1.[Month] = t.[Month]
),
c3 as
(
select grp, min([Month]) [Month]
from c2
where [Default] = 1
group by grp
),
c4 as
(
select grp, max([Month]) [Month]
from c2
where [Cure] = 1
group by grp
)
select c2.[Month], c2.[Default], c2.[Cure],
case when c2.[Month] >= c3.[Month] then c3.[Month] else null end as DefaultPeriod,
case when c2.[Month] >= c3.[Month] then c4.[Month] else null end as CurePeriod
from c2
left outer join c3 on c2.grp = c3.grp
left outer join c4 on c2.grp = c4.grp
Upvotes: 1