Reputation: 13
I'm trying to get the running maximum of a column in pandas. For example, I want to take the column [ask] and create a new column [high_of_day] that shows the maximum value of the ask column up to that point, repeating that maximum in the [high_of_day] column until a greater value appears in the ask column.
Data Input
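import pandas as pd
data = [['9:00',1,0],['10:00',2,0],['11:00',3,0],['12:00',4,0],['13:00',2,0],['14:00',5,0]]
df3 = pd.DataFrame(data, columns=['DateTime','Ask','High_of_Day'])
df3[['Ask','High_of_Day']] = df3[['Ask','High_of_Day']].astype(float)  # cast numeric columns only; DateTime stays a string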
Output
DateTime Ask High_of_Day
0 9:00 1.0 0.0
1 10:00 2.0 0.0
2 11:00 3.0 0.0
3 12:00 4.0 0.0
4 13:00 2.0 0.0
5 14:00 5.0 0.0
I have tried using a wide range of loops but can't seem to get it right.
The desired outcome I am trying to get is:
DateTime Ask High_of_Day
0 9:00 1.0 1.0
1 10:00 2.0 2.0
2 11:00 3.0 3.0
3 12:00 4.0 4.0
4 13:00 2.0 4.0
5 14:00 5.0 5.0
Any help on getting the right algorithm would be extremely appreciated, thanks!
Upvotes: 1
Views: 715
Reputation: 402333
Option 1
pd.Series.cummax
s = df3.Ask.cummax()
print(s)
0 1.0
1 2.0
2 3.0
3 4.0
4 4.0
5 5.0
Name: Ask, dtype: float64
df3['High_of_Day'] = s
print(df3)
DateTime Ask High_of_Day
0 9:00 1.0 1.0
1 10:00 2.0 2.0
2 11:00 3.0 3.0
3 12:00 4.0 4.0
4 13:00 2.0 4.0
5 14:00 5.0 5.0
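For comparison, here is a minimal self-contained sketch that sets cummax next to the kind of explicit loop the question mentions. The helper name running_max_loop is hypothetical and exists only for illustration. The sample values are the ones from the question.

```python
import pandas as pd

# Same sample data as in the question
df3 = pd.DataFrame({
    'DateTime': ['9:00', '10:00', '11:00', '12:00', '13:00', '14:00'],
    'Ask': [1.0, 2.0, 3.0, 4.0, 2.0, 5.0],
})

def running_max_loop(values):
    """Hypothetical helper: running maximum written as an explicit loop."""
    out = []
    current = float('-inf')
    for v in values:
        # Keep the largest value seen so far
        current = max(current, v)
        out.append(current)
    return out

# Vectorised version: one call, no Python-level loop
df3['High_of_Day'] = df3['Ask'].cummax()

# Loop version, for comparison only
loop_result = running_max_loop(df3['Ask'])
```

Both approaches should give the same column for this data. The loop is shown only to make the logic explicit, and cummax is usually the more idiomatic choice.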
Option 2
np.maximum.accumulate
import numpy as np
df3['High_of_Day'] = np.maximum.accumulate(df3.Ask)
print(df3)
DateTime Ask High_of_Day
0 9:00 1.0 1.0
1 10:00 2.0 2.0
2 11:00 3.0 3.0
3 12:00 4.0 4.0
4 13:00 2.0 4.0
5 14:00 5.0 5.0
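The two options agree on the sample data, but they can differ when the column contains missing values. The sketch below assumes a made-up series with one NaN (not part of the question's data) and calls np.maximum.accumulate on the raw NumPy array so that the comparison does not depend on how pandas dispatches ufuncs.

```python
import numpy as np
import pandas as pd

# Assumed example data: the question's Ask values with one missing quote
ask = pd.Series([1.0, 2.0, np.nan, 4.0, 2.0, 5.0], name='Ask')

# pandas skips NaN by default: the NaN row stays NaN,
# but the running maximum carries on afterwards
via_pandas = ask.cummax()

# NumPy propagates NaN: once a NaN is seen, every later entry is NaN
via_numpy = np.maximum.accumulate(ask.to_numpy())
```

If the ask column can have gaps, cummax is probably the safer of the two. Alternatively, fill or drop the missing values before accumulating.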
Upvotes: 4