Stacey
Stacey

Reputation: 5107

Create a peak column in dataframe

I have a dataframe df which looks like:

date               perf   cumulative_perf         
29/11/2005  36528.11368       36528.11368      
30/11/2005  29034.77194       65562.88563      
01/12/2005  47923.50416       113486.3898      
02/12/2005  52740.69331       166227.0831      
05/12/2005  -3185.762137      163041.0321      
06/12/2005  -25084.55935      137956.7616      
07/12/2005  3551.701267       141508.4629      
08/12/2005  22039.83875       163548.3016      
09/12/2005  58217.58428       221765.8859      
12/12/2005  -2906.995835      218858.8901      
13/12/2005  -31979.02878      186879.8613

I am trying to add a column called peak that will look at the cumulative_perf for the date and compare it to the peak value from yesterday and return to the peak column the max of the two. The result output would hopefully look like:

date               perf    cumulative_perf        peak
29/11/2005  36528.11368   36528.11368      36528.11368
30/11/2005  29034.77194   65562.88563      65562.88563
01/12/2005  47923.50416   113486.3898      113486.3898
02/12/2005  52740.69331   166227.0831      166227.0831
05/12/2005  -3185.762137  163041.0321      166227.0831
06/12/2005  -25084.55935  137956.7616      166227.0831
07/12/2005  3551.701267   141508.4629      166227.0831
08/12/2005  22039.83875   163548.3016      166227.0831
09/12/2005  58217.58428   221765.8859      221765.8859
12/12/2005  -2906.995835  218858.8901      221765.8859
13/12/2005  -31979.02878  186879.8613      221765.8859

How can I reference the the cumulative_perf column and the peak value for the day before to make a decision of which value to return to the peak column for the current day?

Upvotes: 2

Views: 289

Answers (2)

BENY
BENY

Reputation: 323326

Ok, we are using rolling and max

df['cumulative_perf'].rolling(window=len(df), min_periods=1).max()
Out[487]: 
0      36528.11368
1      65562.88563
2     113486.38980
3     166227.08310
4     166227.08310
5     166227.08310
6     166227.08310
7     166227.08310
8     221765.88590
9     221765.88590
10    221765.88590
Name: cumulative_perf, dtype: float64

Upvotes: 1

jezrael
jezrael

Reputation: 863216

I think you need Series.cummax:

df['peak'] = df['cumulative_perf'].cummax()
print (df)
          date          perf  cumulative_perf          peak
0   29/11/2005  36528.113680      36528.11368   36528.11368
1   30/11/2005  29034.771940      65562.88563   65562.88563
2   01/12/2005  47923.504160     113486.38980  113486.38980
3   02/12/2005  52740.693310     166227.08310  166227.08310
4   05/12/2005  -3185.762137     163041.03210  166227.08310
5   06/12/2005 -25084.559350     137956.76160  166227.08310
6   07/12/2005   3551.701267     141508.46290  166227.08310
7   08/12/2005  22039.838750     163548.30160  166227.08310
8   09/12/2005  58217.584280     221765.88590  221765.88590
9   12/12/2005  -2906.995835     218858.89010  221765.88590
10  13/12/2005 -31979.028780     186879.86130  221765.88590

Or for numpy solution:

df['peak'] = np.maximum.accumulate(df['cumulative_perf'])
print (df)
          date          perf  cumulative_perf          peak
0   29/11/2005  36528.113680      36528.11368   36528.11368
1   30/11/2005  29034.771940      65562.88563   65562.88563
2   01/12/2005  47923.504160     113486.38980  113486.38980
3   02/12/2005  52740.693310     166227.08310  166227.08310
4   05/12/2005  -3185.762137     163041.03210  166227.08310
5   06/12/2005 -25084.559350     137956.76160  166227.08310
6   07/12/2005   3551.701267     141508.46290  166227.08310
7   08/12/2005  22039.838750     163548.30160  166227.08310
8   09/12/2005  58217.584280     221765.88590  221765.88590
9   12/12/2005  -2906.995835     218858.89010  221765.88590
10  13/12/2005 -31979.028780     186879.86130  221765.88590

Upvotes: 3

Related Questions