Reputation: 111
Assume, I have a df with some integers:
mdf = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=list('A'))
And condition for them:
mdf['CM'] = mdf['A'] > mdf['A'].shift(5)
What it the most efficient way to count the number of rows above current since the condition was true?
It should be like so:
def some_function(df):
"""
Here is calculated nearest row index from current where the condition was met
"""
return integer_value of None
mdf['SINCE'] = some_function(mdf)
UPD. The df should look so:
Idx A CM SINCE Formula
0 23 False None
1 55 False None
2 48 False None
3 17 False None
4 24 True 0 4-4
5 30 False 1 5-4
6 99 True 0 6-6
7 11 False 1 7-6
8 47 True 0 8-8
9 25 False 1 9-8
10 78 False 2 10-8
11 40 True 0 11-11
12 99 True 0 12-12
13 7 False 1 13-12
14 6 False 2 14-12
15 64 True 0 15-15
16 62 False 1 16-15
17 39 True 0 17-17
18 41 True 0 18-18
19 28 False 1 19-18
Upvotes: 1
Views: 478
Reputation: 862611
Use:
s = mdf['CM'].cumsum()
mdf["SINCE"] = mdf[s.gt(0)].groupby(s).cumcount()
print (mdf)
A CM SINCE Formula
Idx
0 23 False NaN NaN
1 55 False NaN NaN
2 48 False NaN NaN
3 17 False NaN NaN
4 24 True 0.0 4-4
5 30 False 1.0 5-4
6 99 True 0.0 6-6
7 11 False 1.0 7-6
8 47 True 0.0 8-8
9 25 False 1.0 9-8
10 78 False 2.0 10-8
11 40 True 0.0 11-11
12 99 True 0.0 12-12
13 7 False 1.0 13-12
14 6 False 2.0 14-12
15 64 True 0.0 15-15
16 62 False 1.0 16-15
17 39 True 0.0 17-17
18 41 True 0.0 18-18
19 28 False 1.0 19-18
Upvotes: 1
Reputation: 18306
You can try with cumulative sum:
cs = mdf.CM.cumsum()
mdf["SINCE"] = mdf.groupby(cs).cumcount()
# Nonify the values up until first True
mdf.loc[cs == 0, "SINCE"] = np.nan
to get (for 30 sample data):
A CM SINCE
0 79 False NaN
1 46 False NaN
2 59 False NaN
3 31 False NaN
4 72 False NaN
5 34 False NaN
6 42 False NaN
7 40 False NaN
8 82 True 0.0
9 16 False 1.0
10 9 False 2.0
11 45 True 0.0
12 68 True 0.0
13 48 False 1.0
14 20 True 0.0
15 9 False 1.0
16 14 False 2.0
17 91 True 0.0
18 92 True 0.0
19 0 False 1.0
20 93 True 0.0
21 1 False 1.0
22 60 False 2.0
23 0 False 3.0
24 93 True 0.0
25 85 False 1.0
26 52 True 0.0
27 32 False 1.0
28 65 True 0.0
29 85 False 1.0
Upvotes: 1