Joey B
Joey B

Reputation: 125

Enumerate a value until reaching a specific value in pandas

I have a data frame that looks like the following

df = pd.DataFrame({'id':[1,2,3,4,5,6,7,8,9,10],'value':[1,2.5,1.1,1.4,1.5,1,1.5,3,1,1.6]})
df['value_at_1'] = np.where(df['value'] == 1,1,0)
df
>>>

id   value   value_at_1
1      1          1
2     2.5         0 
3     1.1         0
4     1.4         0
5     1.5         0
6      1          1
7     1.5         0
8      3          0
9      1          1
10    1.6         0  

I would like to make a variable that enumerates over the "value_at_1" variable and increments each time until it hits 1, and then restarts. The result would like like this:

df = pd.DataFrame({'id':[1,2,3,4,5,6,7,8,9,10],
               'value':[1,2.5,1.1,1.4,1.5,1,1.5,3,1,1.6],
               'value_at_1':[1,0,0,0,0,1,0,0,1,0],
               'count_since_1':[0,1,2,3,4,0,1,2,0,1]}).set_index(['value_at_1'])
>>>

id   value   value_at_1   count_since_1
1      1          1            0 
2     2.5         0            1
3     1.1         0            2
4     1.4         0            3
5     1.5         0            4
6      1          1            0
7     1.5         0            1
8      3          0            2
9      1          1            0 
10    1.6         0            1

Could anyone help me manipulate the data in this way? Thanks!

Upvotes: 1

Views: 275

Answers (2)

BENY
BENY

Reputation: 323386

Just want to provide a new way

import pandas as pd 
import numpy as np 
import functools

idx=df.index[df['value_at_1'].eq(1)].values.tolist()+[len(df)]
idx=list(np.diff(idx))
df['count_since_1']=functools.reduce(lambda x,y: x+y,[list(range(y)) for y in idx])
df
Out[945]: 
   id  value  value_at_1  count_since_1
0   1    1.0           1              0
1   2    2.5           0              1
2   3    1.1           0              2
3   4    1.4           0              3
4   5    1.5           0              4
5   6    1.0           1              0
6   7    1.5           0              1
7   8    3.0           0              2
8   9    1.0           1              0
9  10    1.6           0              1

Upvotes: 2

piRSquared
piRSquared

Reputation: 294526

Use cumsum in a groupby to get cumcount

df.assign(
    count_since_1=df.value_at_1.groupby(df.value_at_1.cumsum()).cumcount())

   id  value  value_at_1  count_since_1
0   1    1.0           1              0
1   2    2.5           0              1
2   3    1.1           0              2
3   4    1.4           0              3
4   5    1.5           0              4
5   6    1.0           1              0
6   7    1.5           0              1
7   8    3.0           0              2
8   9    1.0           1              0
9  10    1.6           0              1

Upvotes: 5

Related Questions