Reputation: 607
I have the following DataFrame:
>>>> df = pd.DataFrame(data={
'type': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
'value': [0, 2, 3, 4, 0, 3, 2, 3, 0]})
>>> df
type value
0 A 0
1 A 2
2 A 3
3 B 4
4 B 0
5 B 3
6 C 2
7 C 3
8 C 0
What I need to accomplish is the following: for each type, trace the cumulative count of non-zero values but starting from zero each time a 0-value is encountered.
type value cumcount
0 A 0 NaN
1 A 2 1
2 A 3 2
3 B 4 1
4 B 0 NaN
5 B 3 1
6 C 2 1
7 C 3 2
8 C 0 NaN
Upvotes: 3
Views: 226
Reputation: 862431
Idea is create consecutive groups and filter out non 0
values, last assign to new column with filter:
m = df['value'].eq(0)
g = m.ne(m.shift()).cumsum()[~m]
df.loc[~m, 'new'] = df.groupby(['type',g]).cumcount().add(1)
print (df)
type value new
0 A 0 NaN
1 A 2 1.0
2 A 3 2.0
3 B 4 1.0
4 B 0 NaN
5 B 3 1.0
6 C 2 1.0
7 C 3 2.0
8 C 0 NaN
For pandas 0.24+ is possible use Nullable integer data type:
df['new'] = df['new'].astype('Int64')
print (df)
type value new
0 A 0 NaN
1 A 2 1
2 A 3 2
3 B 4 1
4 B 0 NaN
5 B 3 1
6 C 2 1
7 C 3 2
8 C 0 NaN
Upvotes: 6