Reputation: 227
Consider below table
Table
ActivityId Flag Type
---------- ----- -----
1 N
2 N
3 Y EXT
4 Y
5 Y
6 N
7 Y INT
8 Y
9 N
10 N
11 N
12 Y EXT
13 N
14 N
15 N
16 Y EXT
17 Y
18 Y INT
19 Y
20 Y EXT
21 Y
22 N
23 N
First record has always Flag = N
and then any sequence of Flag = Y
or Flag = N
can exist for next records. Every time flag changes from N
to Y
, the Type
field is either EXT
or INT
. Next Y
records (before next N
) might have Type = EXT
or INT
or NULL and this is not important.
I want to calculate Cycle No
for this sequence of N/Y
. First cycle starts when Flag = N
(always first record has flag = N
) and cycle ends when flag changes to Y
and Type = EXT
. Then, next cycle starts when flag changes to N
and ends when flag becomes Y
and type = EXT
. this is repeated until all records are processed. The result for above table is:
Result
ActivityId Flag Type Cycle No
---------- ----- ----- --------
1 N 1
2 N 1
3 Y EXT 1
4 Y
5 Y
6 N 2
7 Y INT 2
8 Y 2
9 N 2
10 N 2
11 N 2
12 Y EXT 2
13 N 3
14 N 3
15 N 3
16 Y EXT 3
17 Y
18 Y INT
19 Y
20 Y EXT
21 Y
22 N 4
23 N 4
I am using SQL Server 2008 R2 (no LAG
/LEAD
).
Can you please help me find the SQL query to calculate Cycle No
?
Upvotes: 3
Views: 468
Reputation: 12243
If you are happy with recursion, this can be achieved rather simply with a bit of comparison logic to the preceding row when ordered by your ActivityId
:
declare @t table(ActivityId int,Flag nvarchar(1),TypeValue nvarchar(3));
insert into @t values(1 ,'N',null),(2 ,'N',null),(3 ,'Y','EXT'),(4 ,'Y',null),(5 ,'Y',null),(6 ,'N',null),(7 ,'Y','INT'),(8 ,'Y',null),(9 ,'N',null),(10,'N',null),(11,'N',null),(12,'Y','EXT'),(13,'N',null),(14,'N',null),(15,'N',null),(16,'Y','EXT'),(17,'Y',null),(18,'Y','INT'),(19,'Y',null),(20,'Y','EXT'),(21,'Y',null),(22,'N',null),(23,'N',null);
with rn as -- Derived table purely to guarantee incremental row number. If you can guarantee your ActivityId values are incremental start to finish, this isn't required.
( select row_number() over (order by ActivityId) as rn
,ActivityId
,Flag
,TypeValue
from @t
),d as
( select rn -- Recursive CTE that compares the current row to the one previous.
,ActivityId
,Flag
,TypeValue
,cast(1 as decimal(10,5)) as CycleNo
from rn
where rn = 1
union all
select rn.rn
,rn.ActivityId
,rn.Flag
,rn.TypeValue
,cast(
case when d.Flag = 'Y' and d.TypeValue = 'EXT' and d.CycleNo >= 1
then case when rn.Flag = 'N'
then d.CycleNo + 1
else (d.CycleNo + 1) * 0.0001 -- This part keeps track of the cycle number in fractional values, which can be removed by converting the final result to INT.
end
else case when rn.Flag = 'N' and d.CycleNo < 1
then d.CycleNo * 10000
else d.CycleNo
end
end
as decimal(10,5)) as CycleNo
from rn
inner join d
on d.rn = rn.rn - 1
)
select ActivityId
,Flag
,TypeValue
,cast(CycleNo as int) as CycleNo
from d
order by ActivityId;
Output:
+------------+------+-----------+---------+
| ActivityId | Flag | TypeValue | CycleNo |
+------------+------+-----------+---------+
| 1 | N | NULL | 1 |
| 2 | N | NULL | 1 |
| 3 | Y | EXT | 1 |
| 4 | Y | NULL | 0 |
| 5 | Y | NULL | 0 |
| 6 | N | NULL | 2 |
| 7 | Y | INT | 2 |
| 8 | Y | NULL | 2 |
| 9 | N | NULL | 2 |
| 10 | N | NULL | 2 |
| 11 | N | NULL | 2 |
| 12 | Y | EXT | 2 |
| 13 | N | NULL | 3 |
| 14 | N | NULL | 3 |
| 15 | N | NULL | 3 |
| 16 | Y | EXT | 3 |
| 17 | Y | NULL | 0 |
| 18 | Y | INT | 0 |
| 19 | Y | NULL | 0 |
| 20 | Y | EXT | 0 |
| 21 | Y | NULL | 0 |
| 22 | N | NULL | 4 |
| 23 | N | NULL | 4 |
+------------+------+-----------+---------+
Upvotes: 1
Reputation: 1440
I've got a solution, its not pretty, but through stepwise refinement I get to your result:
The solution is in three steps
First i select out all start and end cycle events:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='Ext')
)
select * from cte1
This returns
rn ActivityId Cycle
1 1 Start
2 2 Start
3 3 End
4 6 Start
5 9 Start
6 10 Start
7 11 Start
8 12 End
9 13 Start
10 14 Start
11 15 Start
12 16 End
13 20 End
14 22 Start
15 23 Start
The problem is now that while we are sure of when the cycle ends, that is when Flag is N and Type is Ext, we are not sure when the cycle starts. Row 1 and 2 both denote a possible start event. But luckily we can see that only a start event following an End event is to be counted. Since we do not have lag or lead we must join the CTE with itself:
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
(a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
select * from cte2
This returns
rn ActivityId Cycle PrevCycle
1 1 Start NULL
2 3 End Start
3 6 Start End
4 12 End Start
5 13 Start End
6 16 End Start
7 22 Start End
I Select the first start event - since we always start with one, and then keep the END events that follow a start event. Finally we only keep the rest of the start events if the previous event was an End event.
Now we can find the start and end of each cycle, and number them:
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select * from cte3
Which gives us what we need:
CycleNumber StartId EndId
1 1 3
2 6 12
3 13 16
4 22 NULL
Now we just need to join this back on our table:
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
This gives the result you were looking for.
This is just a quick and dirty solution, perhaps with a little TLC you can pretty it up, and reduce the number of steps.
The full solution is here:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='Ext')
)
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
a1.Cycle='End' -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
Upvotes: 3