Reputation: 513
I have some stock data in a pandas dataframe that looks like this (index=datetime)
ask bid
datetime
2017-03-01 00:00:07.346 1.33145 1.33135
2017-03-01 00:00:07.498 1.33147 1.33134
2017-03-01 00:00:08.124 1.33147 1.33133
2017-03-01 00:00:08.659 1.33148 1.33134
2017-03-01 00:00:12.104 1.33146 1.33134
2017-03-01 00:00:12.858 1.33147 1.33134
2017-03-01 00:00:17.583 1.33148 1.33135
I would like to group this data into two hour chunks and add some new columns.
Open Ask, Close Ask, Max Ask, Open Bid, Max Bid, Close Bid
datetime
2017-03-01 00:00
2017-03-01 02:00
...
Where OpenAsk is the first Ask price in the 2 hour period, Close Ask is the last Ask price in the two hour period, etc.
How would I go about this? I think I can use resample, but im not sure how to find Open Ask, Close Ask, etc.
Thanks.
Upvotes: 3
Views: 419
Reputation: 294228
quick answer
df.groupby(pd.TimeGrouper('2H')).agg(['first', 'last', 'max'])
ask bid
first last max first last max
datetime
2017-03-01 1.33145 1.33148 1.33148 1.33135 1.33135 1.33135
With the formatting to produce your desired results
fs = dict(Open='first', Close='last', Max='max')
ag = dict(Ask=fs, Bid=fs)
gp = pd.TimeGrouper('2H')
d1 = df.rename(columns=str.capitalize).groupby(gp).agg(ag)
d1.sort_index(axis=1, ascending=False, inplace=True)
d1.columns = d1.columns.map('{0[1]} {0[0]}'.format)
print(d1)
Open Ask Max Ask Close Ask Open Bid Max Bid Close Bid
datetime
2017-03-01 1.33145 1.33148 1.33148 1.33135 1.33135 1.33135
Upvotes: 2