Reputation: 5107
I have a dataframe df which looks like:
date perf cumulative_perf
29/11/2005 36528.11368 36528.11368
30/11/2005 29034.77194 65562.88563
01/12/2005 47923.50416 113486.3898
02/12/2005 52740.69331 166227.0831
05/12/2005 -3185.762137 163041.0321
06/12/2005 -25084.55935 137956.7616
07/12/2005 3551.701267 141508.4629
08/12/2005 22039.83875 163548.3016
09/12/2005 58217.58428 221765.8859
12/12/2005 -2906.995835 218858.8901
13/12/2005 -31979.02878 186879.8613
I am trying to add a column called peak that will look at the cumulative_perf for the date and compare it to the peak value from yesterday and return to the peak column the max of the two. The result output would hopefully look like:
date perf cumulative_perf peak
29/11/2005 36528.11368 36528.11368 36528.11368
30/11/2005 29034.77194 65562.88563 65562.88563
01/12/2005 47923.50416 113486.3898 113486.3898
02/12/2005 52740.69331 166227.0831 166227.0831
05/12/2005 -3185.762137 163041.0321 166227.0831
06/12/2005 -25084.55935 137956.7616 166227.0831
07/12/2005 3551.701267 141508.4629 166227.0831
08/12/2005 22039.83875 163548.3016 166227.0831
09/12/2005 58217.58428 221765.8859 221765.8859
12/12/2005 -2906.995835 218858.8901 221765.8859
13/12/2005 -31979.02878 186879.8613 221765.8859
How can I reference the the cumulative_perf column and the peak value for the day before to make a decision of which value to return to the peak column for the current day?
Upvotes: 2
Views: 289
Reputation: 323326
Ok, we are using rolling
and max
df['cumulative_perf'].rolling(window=len(df), min_periods=1).max()
Out[487]:
0 36528.11368
1 65562.88563
2 113486.38980
3 166227.08310
4 166227.08310
5 166227.08310
6 166227.08310
7 166227.08310
8 221765.88590
9 221765.88590
10 221765.88590
Name: cumulative_perf, dtype: float64
Upvotes: 1
Reputation: 863216
I think you need Series.cummax
:
df['peak'] = df['cumulative_perf'].cummax()
print (df)
date perf cumulative_perf peak
0 29/11/2005 36528.113680 36528.11368 36528.11368
1 30/11/2005 29034.771940 65562.88563 65562.88563
2 01/12/2005 47923.504160 113486.38980 113486.38980
3 02/12/2005 52740.693310 166227.08310 166227.08310
4 05/12/2005 -3185.762137 163041.03210 166227.08310
5 06/12/2005 -25084.559350 137956.76160 166227.08310
6 07/12/2005 3551.701267 141508.46290 166227.08310
7 08/12/2005 22039.838750 163548.30160 166227.08310
8 09/12/2005 58217.584280 221765.88590 221765.88590
9 12/12/2005 -2906.995835 218858.89010 221765.88590
10 13/12/2005 -31979.028780 186879.86130 221765.88590
Or for numpy solution:
df['peak'] = np.maximum.accumulate(df['cumulative_perf'])
print (df)
date perf cumulative_perf peak
0 29/11/2005 36528.113680 36528.11368 36528.11368
1 30/11/2005 29034.771940 65562.88563 65562.88563
2 01/12/2005 47923.504160 113486.38980 113486.38980
3 02/12/2005 52740.693310 166227.08310 166227.08310
4 05/12/2005 -3185.762137 163041.03210 166227.08310
5 06/12/2005 -25084.559350 137956.76160 166227.08310
6 07/12/2005 3551.701267 141508.46290 166227.08310
7 08/12/2005 22039.838750 163548.30160 166227.08310
8 09/12/2005 58217.584280 221765.88590 221765.88590
9 12/12/2005 -2906.995835 218858.89010 221765.88590
10 13/12/2005 -31979.028780 186879.86130 221765.88590
Upvotes: 3