Reputation: 89
I have a dataframe similar to the below:
import pandas as pd

df = pd.DataFrame({'ticker':['A', 'A', 'A', 'A', 'A',
                             'B', 'B', 'B', 'B', 'B'],
                   'date':['2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00',
                           '2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00'],
                   'datetime':['2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00',
                               '2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00'],
                   'low':[10.00, 11.00, 12.00, 11.00, 11.50,
                          15.00, 14.00, 13.00, 12.00, 12.50],
                   'high':[10.25, 11.25, 12.25, 11.25, 11.75,
                           15.25, 14.25, 13.25, 12.25, 12.75],
                   'close':[10.20, 11.20, 12.20, 11.20, 11.70,
                            15.20, 14.20, 13.20, 12.20, 12.75]})
# The strings include seconds, so the format needs %S as well
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
df['datetime'] = pd.to_datetime(df['datetime'], format='%Y-%m-%d %H:%M:%S')
df = df.set_index(['ticker', 'date'])
I am trying to figure out how to create columns in my dataframe with the minimum and maximum prices of each ticker at a given point in time. That is, between the first time that day and the current time of the given row.
I've tried various different .groupby and .loc statements, but can't seem to get the right one.
Any help would be greatly appreciated!
Upvotes: 1
Views: 99
Reputation: 5741
You are looking for .expanding() in combination with .min() and .max(), e.g.:
df.groupby('ticker')[['low', 'high', 'close']].expanding().min()
                                    low   high  close
ticker ticker date
A      A      2020-05-01 09:30:00  10.0  10.25   10.2
              2020-05-01 09:31:00  10.0  10.25   10.2
              2020-05-01 09:32:00  10.0  10.25   10.2
              2020-05-01 09:33:00  10.0  10.25   10.2
              2020-05-01 09:34:00  10.0  10.25   10.2
B      B      2020-05-01 09:30:00  15.0  15.25   15.2
              2020-05-01 09:31:00  14.0  14.25   14.2
              2020-05-01 09:32:00  13.0  13.25   13.2
              2020-05-01 09:33:00  12.0  12.25   12.2
              2020-05-01 09:34:00  12.0  12.25   12.2
It is basically like .rolling(), but the window keeps growing from the first row of each group instead of having a fixed size.
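If you want the running values as new columns on the original frame, something along these lines might work. This is only a sketch: the shortened sample data and the column names running_low and running_high are my own placeholders, not anything from the question. The one wrinkle is that the grouped expanding result carries the group key as an extra index level (as in the output above), so it has to be dropped before assigning back.

```python
import pandas as pd

# Shortened sample data (assumed for illustration), indexed like the question's frame
df = pd.DataFrame({
    'ticker': ['A', 'A', 'A', 'B', 'B', 'B'],
    'date': pd.to_datetime(['2020-05-01 09:30:00',
                            '2020-05-01 09:31:00',
                            '2020-05-01 09:32:00'] * 2),
    'low': [10.00, 11.00, 9.50, 15.00, 14.00, 14.50],
    'high': [10.25, 11.25, 9.75, 15.25, 14.25, 14.75],
}).set_index(['ticker', 'date'])

grouped = df.groupby(level='ticker')

# The expanding result is indexed by (ticker, ticker, date);
# dropping the first level realigns it with df's (ticker, date) index.
df['running_low'] = grouped['low'].expanding().min().droplevel(0)
df['running_high'] = grouped['high'].expanding().max().droplevel(0)

print(df)
```

For plain running minima and maxima, grouped['low'].cummin() and grouped['high'].cummax() should give the same values and keep the original index, so no droplevel is needed.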
Upvotes: 2
Reputation: 1085
I am not sure what exactly you want, but I think the following code will work:
df[['date','low']].loc[df['ticker']=='A'].min()
Output:
date 2020-05-01 09:30:00
low 10
You don't need to set the index to ticker and date; the code above assumes both are still regular columns.
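With ticker and date kept as regular columns, a running minimum that restarts for each ticker on each day could be sketched as below. The two-day sample data and the names day and running_low are made up for illustration, and I am assuming "that day" means the calendar date of the timestamp.

```python
import pandas as pd

# Made-up two-day sample; ticker and date stay as ordinary columns
df = pd.DataFrame({
    'ticker': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'date': pd.to_datetime(['2020-05-01 09:30:00', '2020-05-01 09:31:00',
                            '2020-05-04 09:30:00', '2020-05-04 09:31:00'] * 2),
    'low': [10.0, 9.0, 12.0, 13.0, 5.0, 6.0, 4.0, 3.0],
})

# Calendar day of each timestamp (time set to midnight), used as a second group key
day = df['date'].dt.normalize().rename('day')

# cummin() returns one value per row, so it can be assigned straight back
df['running_low'] = df.groupby(['ticker', day])['low'].cummin()

print(df)
```

The same pattern with cummax() on the high column would give the running maximum. Rows are assumed to be sorted by time within each ticker already.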
Upvotes: 0