Reputation: 12107
There are a lot of similar questions, each with its own specific issues and answers, but I haven't found a solution that fits, nor an understanding of how to do it.
I have typical data:
date open high low close volume spot
1507842000 5313.3 5345.6 5272 5295.1 22612561 5301.462201
1507845600 5295.1 5326.7 5286.1 5301.1 12127159 5308.487754
1507849200 5301.1 5467.5 5301.1 5464.5 54568881 5401.331605
1507852800 5464.7 5497 5394.9 5402.5 58411322 5446.552171
1507856400 5402.1 5542 5402.1 5541.2 50272286 5466.652636
1507860000 5540.4 5980 5440.1 5694.5 182746217 5717.856124
1507863600 5689.8 5800 5604.5 5739.6 78341266 5709.488508
1507867200 5742 5897 5713.1 5753.2 79738461 5794.402674
1507870800 5753.1 5798.9 5520.3 5574.5 87621428 5640.727381
1507874400 5574.6 5672.6 5503.2 5608.4 56964404 5591.237093
1507878000 5607.5 5689.1 5570 5660 46132190 5640.761482
1507881600 5660 5743 5634.8 5652 50173714 5690.219952
It is not just OHLC; it also includes volume and spot price.
I am trying to resample hours to days.
So, I load the CSV:
data_hourly = pd.read_csv('../data/hourly.csv', parse_dates=True, date_parser=date_parse, index_col=0, header=0)
(The date_parse function removes the minutes and seconds.)
I tried:
data_daily = data_hourly.resample('1D').ohlc()
This clearly doesn't work at all: it gives me rows with a large number of columns.
Then I tried:
columns_dict = {'open': 'first', 'high': 'max', 'low': 'min', 'close': 'last', 'volume': 'sum', 'spot': 'average'}
data_daily = data_hourly.resample('1D', how=columns_dict)
but this crashes with an error:
"%r object has no attribute %r" % (type(self).__name__, attr)
AttributeError: 'SeriesGroupBy' object has no attribute 'average'
Besides, it tells me the 'how' argument is deprecated anyway, but I didn't see an example of how to do it the 'new' way.
Upvotes: 2
Views: 2389
Reputation: 862801
You are close: you need mean instead of average, and you need to pass the dictionary to Resampler.agg:
columns_dict = {'open': 'first', 'high': 'max', 'low': 'min',
                'close': 'last', 'volume': 'sum', 'spot': 'mean'}
data_daily = data_hourly.resample('1D').agg(columns_dict)
print(data_daily)
               open    high     low   close     volume         spot
date
2017-10-12   5313.3  5467.5  5272.0  5464.5   89308601  5337.093853
2017-10-13   5464.7  5980.0  5394.9  5652.0  690401288  5633.099780
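For anyone who wants to reproduce this without the CSV file, here is a minimal self-contained sketch using the sample rows from the question. It assumes the date column holds Unix timestamps in seconds (UTC) and parses them with pd.to_datetime(..., unit="s") rather than the question's own date_parse function, which isn't shown; the inline string and the whitespace separator are also just stand-ins for the real file.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the example runs without hourly.csv.
raw = """date open high low close volume spot
1507842000 5313.3 5345.6 5272 5295.1 22612561 5301.462201
1507845600 5295.1 5326.7 5286.1 5301.1 12127159 5308.487754
1507849200 5301.1 5467.5 5301.1 5464.5 54568881 5401.331605
1507852800 5464.7 5497 5394.9 5402.5 58411322 5446.552171
1507856400 5402.1 5542 5402.1 5541.2 50272286 5466.652636
1507860000 5540.4 5980 5440.1 5694.5 182746217 5717.856124
1507863600 5689.8 5800 5604.5 5739.6 78341266 5709.488508
1507867200 5742 5897 5713.1 5753.2 79738461 5794.402674
1507870800 5753.1 5798.9 5520.3 5574.5 87621428 5640.727381
1507874400 5574.6 5672.6 5503.2 5608.4 56964404 5591.237093
1507878000 5607.5 5689.1 5570 5660 46132190 5640.761482
1507881600 5660 5743 5634.8 5652 50173714 5690.219952
"""

# Assumption: timestamps are epoch seconds; convert them to a DatetimeIndex,
# which resample() requires.
data_hourly = pd.read_csv(io.StringIO(raw), sep=r"\s+")
data_hourly["date"] = pd.to_datetime(data_hourly["date"], unit="s")
data_hourly = data_hourly.set_index("date")

# One aggregation per column; each string names an existing aggregation method.
columns_dict = {"open": "first", "high": "max", "low": "min",
                "close": "last", "volume": "sum", "spot": "mean"}
data_daily = data_hourly.resample("1D").agg(columns_dict)
print(data_daily)
```

The first three sample rows fall on 2017-10-12 and the remaining nine on 2017-10-13, which is why the result has two daily rows.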
Upvotes: 4