Reputation: 79
I have an input table like this:
date ID Flag
01.01 A 0
01.02 A 0
01.03 A 1
01.04 A 1
01.05 A 1
01.06 A 0
01.07 A 1
01.08 A 1
01.09 A 0
01.01 B 1
01.02 B 0
01.03 B 1
01.04 B 1
01.05 B 1
01.06 B 1
01.07 B 1
01.08 B 0
01.09 B 0
I would like to calculate the cumulative sum of the flag for each ID, but the running total should reset to 0 whenever the flag is 0 again.
So the output table should be:
date ID Flag CUMSUM
01.01 A 0 0
01.02 A 0 0
01.03 A 1 1
01.04 A 1 2
01.05 A 1 3
01.06 A 0 0
01.07 A 1 1
01.08 A 1 2
01.09 A 0 0
01.01 B 1 1
01.02 B 0 0
01.03 B 1 1
01.04 B 1 2
01.05 B 1 3
01.06 B 1 4
01.07 B 1 5
01.08 B 0 0
01.09 B 0 0
Any suggestions on how to solve this? I tried this:
sum(FLAG) over (partition by ID order by date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
But I have no idea how to "reset" the cumulative calculation. Any ideas?
Thanks in advance!
Upvotes: 2
Views: 617
Reputation: 1269633
You need to assign a grouping. The grouping is the running count of 0s up to and including each row, computed per ID. Then use that for partitioning:
select t.*,
       (case when flag = 1
             then sum(flag) over (partition by id, grp order by date)
             else 0
        end) as cumulative_sum
from (select t.*,
             sum(case when flag = 0 then 1 else 0 end) over (partition by id order by date) as grp
      from t
     ) t;
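To try the query above against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions), a table named t, and dates stored as text that happens to sort correctly; the final order by is added only to make the output order predictable. Other databases may need small syntax changes.

```python
import sqlite3

# Sample rows from the question: (date, id, flag).
flags_a = [0, 0, 1, 1, 1, 0, 1, 1, 0]
flags_b = [1, 0, 1, 1, 1, 1, 1, 0, 0]
sample = [(f"01.0{i + 1}", "A", f) for i, f in enumerate(flags_a)]
sample += [(f"01.0{i + 1}", "B", f) for i, f in enumerate(flags_b)]

# The answer's query; the outer "order by" is an addition for stable output.
query = """
select t.date, t.id, t.flag,
       (case when flag = 1
             then sum(flag) over (partition by id, grp order by date)
             else 0
        end) as cumulative_sum
from (select t.*,
             sum(case when flag = 0 then 1 else 0 end)
                 over (partition by id order by date) as grp
      from t
     ) t
order by t.id, t.date
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table t (date text, id text, flag integer)")
conn.executemany("insert into t values (?, ?, ?)", sample)

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (date, id, flag, cumulative_sum)
```

The last column should match the CUMSUM column in the question's expected output.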
Note: The window frame specification (rows between ...) is not needed, assuming the dates are unique within each ID (as in your sample data), so I left it out.
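The uniqueness assumption matters because, when order by is given without a frame, the default frame is range-based and includes all rows that tie with the current row on the order by key. A small sketch of the difference, again assuming SQLite 3.25+ via Python's sqlite3 and a made-up table with a duplicated date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (date text, id text, flag integer)")
# Hypothetical data: the date 01.01 appears twice for the same id.
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("01.01", "A", 1), ("01.01", "A", 1), ("01.02", "A", 1)],
)

rows = conn.execute("""
select sum(flag) over (partition by id order by date) as default_frame,
       sum(flag) over (partition by id order by date
                       rows between unbounded preceding and current row) as rows_frame
from t
""").fetchall()

# Sort each column so the comparison does not depend on how ties are ordered.
default_frame = sorted(r[0] for r in rows)
rows_frame = sorted(r[1] for r in rows)
print(default_frame)  # tied rows share one total
print(rows_frame)     # strictly row-by-row running total
```

With unique dates per ID the two columns are identical, which is why the frame clause can be dropped for the sample data.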
Also, because flag only takes the values 0 and 1, you can simplify the calculation in the subquery to:
sum(1 - flag) over (partition by id order by date) as grp
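As a quick sanity check, reading the simplification as sum(1 - flag) (which counts the 0s when flag is only ever 0 or 1), the sketch below compares it with the case expression on a few made-up rows. It assumes SQLite 3.25+ through Python's sqlite3; the table and data are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (date text, id text, flag integer)")
# Hypothetical rows for a single id, flags restricted to 0 and 1.
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("01.01", "A", 0), ("01.02", "A", 1), ("01.03", "A", 1),
     ("01.04", "A", 0), ("01.05", "A", 1)],
)

pairs = conn.execute("""
select sum(case when flag = 0 then 1 else 0 end)
           over (partition by id order by date) as grp_case,
       sum(1 - flag) over (partition by id order by date) as grp_short
from t
order by date
""").fetchall()

for grp_case, grp_short in pairs:
    print(grp_case, grp_short)  # both columns should agree on every row
```

If flag could be null or take other values, the two expressions would no longer be interchangeable, so the case form is the safer choice in that situation.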
Upvotes: 2