Reputation: 13
I have a dataframe like this:
date ID flag
Apr1 1 True
Apr2 2 True
May1 1 True
May1 1 False
May2 1 True
Wanted the cumulative count of ID
in previous days (including that day) where the flag is True
, like this:
date ID flag count
Apr1 1 True 1
Apr2 2 True 1
May1 1 True 2
May1 1 False 2
May2 1 True 3
I tried boolean masking and cumsum()
, but didn't get it to work. Suggestions?
Upvotes: 1
Views: 1535
Reputation: 8631
This seems to be what you need:
df['count'] = df.groupby(by=['ID'])['flag'].cumsum().astype(int)
Output:
date ID flag count
0 Apr1 1 True 1
1 Apr2 2 True 1
2 May1 1 True 2
3 May1 1 False 2
4 May2 1 True 3
Upvotes: 0
Reputation: 323226
That is not cumcount
you are looking for groupby
and cumsum
df.groupby('ID').flag.cumsum().astype(int)
Out[362]:
0 1
1 1
2 2
3 2
4 3
Name: flag, dtype: int32
Upvotes: 1