yang
yang

Reputation: 13

cumulative count in previous rows in pandas data frame

I have a dataframe like this:

 date   ID   flag
 Apr1   1    True
 Apr2   2    True
 May1   1    True
 May1   1    False
 May2   1    True

Wanted the cumulative count of ID in previous days (including that day) where the flag is True, like this:

 date   ID   flag   count
 Apr1   1    True   1
 Apr2   2    True   1
 May1   1    True   2
 May1   1    False  2
 May2   1    True   3

I tried boolean masking and cumsum(), but didn't get it to work. Suggestions?

Upvotes: 1

Views: 1535

Answers (2)

harpan
harpan

Reputation: 8631

This seems to be what you need:

df['count'] = df.groupby(by=['ID'])['flag'].cumsum().astype(int)

Output:

    date    ID  flag    count
0   Apr1    1   True    1
1   Apr2    2   True    1
2   May1    1   True    2
3   May1    1   False   2
4   May2    1   True    3

Upvotes: 0

BENY
BENY

Reputation: 323226

That is not cumcount you are looking for groupby and cumsum

df.groupby('ID').flag.cumsum().astype(int)
Out[362]: 
0    1
1    1
2    2
3    2
4    3
Name: flag, dtype: int32

Upvotes: 1

Related Questions