Ivan R

Reputation: 111

Python Pandas - Count consecutive grouped rows that satisfy condition and reset counter when false

I need to add a new column labelled "counter" to the existing dataframe, calculated as shown in the example below:

symbol percentage counter
A 11 -1
A 2 0
A 5 1
B 2 0
B 1 1
B 3 2
A 2 2
A 9 -1
A 4 0
B 2 3
B 8 -1
B 7 -1

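So the data is grouped per "symbol" and, as the example shows, the logic for calculating the "counter" is like this: if "percentage" is greater than 5, the counter is -1; otherwise it counts the consecutive rows of that symbol with "percentage" of 5 or less, starting at 0 and resetting after each row above 5.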

I've tried something like this, but it's not good, since the reset is not working:

df['counter'] = np.where(df['percentage'] > 5, -1, df.groupby('symbol').cumcount())
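
For reference, the attempt can be reproduced with the minimal sketch below. The frame is rebuilt by hand from the table above, and the names `expected` and `attempt` are only illustrative. It suggests the cause: `cumcount()` numbers every row of a symbol, so rows above 5 still advance the count and nothing ever restarts it.

```python
import numpy as np
import pandas as pd

# Example frame rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "symbol": list("AAABBBAAABBB"),
    "percentage": [11, 2, 5, 2, 1, 3, 2, 9, 4, 2, 8, 7],
})
# Desired "counter" values, copied from the same table.
expected = [-1, 0, 1, 0, 1, 2, 2, -1, 0, 3, -1, -1]

# cumcount() numbers all rows of a symbol, including rows above 5,
# so the count keeps growing instead of restarting after a -1.
attempt = np.where(df["percentage"] > 5, -1, df.groupby("symbol").cumcount())

# Side-by-side view of the desired values and what the attempt yields.
print(pd.DataFrame({"expected": expected, "attempt": attempt}))
```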

Upvotes: 1

Views: 607

Answers (1)

mozway

Reputation: 261820

IIUC, you can use a mask and a custom groupby:

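m = df['percentage'].gt(5)  # True where the counter must be -1
# label runs of equal mask values within each symbol;
# transform keeps the original index, so the labels align with df
group = m.groupby(df['symbol']).transform(lambda s: s.ne(s.shift()).cumsum())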

df['counter'] = (df
                 .groupby(['symbol', group])
                 .cumcount()
                 .mask(m, -1)
                 )

Output:

   symbol  percentage  counter
0       A          11       -1
1       A           2        0
2       A           5        1
3       B           2        0
4       B           1        1
5       B           3        2
6       A           2        2
7       A           9       -1
8       A           4        0
9       B           2        3
10      B           8       -1
11      B           7       -1
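
To check the approach end to end, here is a self-contained sketch under a few assumptions: the frame is rebuilt by hand from the question's table, `add_counter` and its `threshold` parameter are hypothetical names introduced only for illustration, and the run labels are computed with `transform` so they keep the frame's index.

```python
import pandas as pd

def add_counter(df, threshold=5):
    """Hypothetical helper: -1 above the threshold, else a running count
    within each run of rows at or below it, tracked per symbol."""
    above = df["percentage"].gt(threshold)
    # A new run label starts whenever the mask flips compared with the
    # previous row of the same symbol.
    run_id = above.groupby(df["symbol"]).transform(
        lambda s: s.ne(s.shift()).cumsum()
    )
    # Count rows inside each (symbol, run) pair, then overwrite the
    # rows above the threshold with -1.
    counter = df.groupby(["symbol", run_id]).cumcount().mask(above, -1)
    return df.assign(counter=counter)

# Example frame rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "symbol": list("AAABBBAAABBB"),
    "percentage": [11, 2, 5, 2, 1, 3, 2, 9, 4, 2, 8, 7],
})
out = add_counter(df)
print(out)
```

Row 6 is the interesting case: it is the next "A" row after rows 1 and 2, so it continues their run and gets 2, even though "B" rows sit between them in the frame.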

Upvotes: 1
