espogian
espogian

Reputation: 607

Cumulative count of values with grouping using Pandas

I have the following DataFrame:

>>>> df = pd.DataFrame(data={
            'type': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
            'value': [0, 2, 3, 4, 0, 3, 2, 3, 0]})

>>> df
  type  value
0    A      0
1    A      2
2    A      3
3    B      4
4    B      0
5    B      3
6    C      2
7    C      3
8    C      0

What I need to accomplish is the following: for each type, trace the cumulative count of non-zero values but starting from zero each time a 0-value is encountered.

  type  value  cumcount
0    A      0         NaN
1    A      2         1
2    A      3         2
3    B      4         1
4    B      0         NaN
5    B      3         1
6    C      2         1
7    C      3         2
8    C      0         NaN

Upvotes: 3

Views: 226

Answers (1)

jezrael
jezrael

Reputation: 862431

Idea is create consecutive groups and filter out non 0 values, last assign to new column with filter:

m = df['value'].eq(0)
g = m.ne(m.shift()).cumsum()[~m]

df.loc[~m, 'new'] = df.groupby(['type',g]).cumcount().add(1)
print (df)
  type  value  new
0    A      0  NaN
1    A      2  1.0
2    A      3  2.0
3    B      4  1.0
4    B      0  NaN
5    B      3  1.0
6    C      2  1.0
7    C      3  2.0
8    C      0  NaN

For pandas 0.24+ is possible use Nullable integer data type:

df['new'] = df['new'].astype('Int64')
print (df)
  type  value  new
0    A      0  NaN
1    A      2    1
2    A      3    2
3    B      4    1
4    B      0  NaN
5    B      3    1
6    C      2    1
7    C      3    2
8    C      0  NaN

Upvotes: 6

Related Questions