Reputation: 4322
I have the following DataFrame:
dates = ['2018-01-03 23:26:00', '2018-01-04 00:14:00', '2018-01-04 03:10:00', '2018-01-05 03:47:00',
'2018-01-05 04:47:00', '2018-01-06 05:44:00', '2018-01-06 19:00:00', '2018-01-06 20:36:00',
'2018-01-07 21:34:00']
vals = [59.95, 60.11, 62.05, 59.98, 60.01, 61.15, 60.35, 60.61, 59.99]
temp = pd.DataFrame({'date':dates, 'values':vals})
I need to compute rolling averages over the past 24 hours. I tried pandas' rolling() function, but as far as I can tell its window is specified as a number of data points. Each 24-hour period can contain a different number of data points, so a fixed-size window doesn't work for me.
I thought about resampling the DataFrame by date, but that wouldn't work either.
Not sure how to approach this. Any suggestions would be very welcome.
Upvotes: 1
Views: 104
Reputation: 19545
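You can set the dates as the index, then call the pandas rolling function with a time offset such as '24h' as the window instead of a fixed number of rows. Each window then covers the 24 hours up to and including the current row's timestamp, however many rows fall inside it.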
import pandas as pd
dates = ['2018-01-03 23:26:00', '2018-01-04 00:14:00', '2018-01-04 03:10:00', '2018-01-05 03:47:00',
'2018-01-05 04:47:00', '2018-01-06 05:44:00', '2018-01-06 19:00:00', '2018-01-06 20:36:00',
'2018-01-07 21:34:00']
vals = [59.95, 60.11, 62.05, 59.98, 60.01, 61.15, 60.35, 60.61, 59.99]
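temp = pd.DataFrame({'values': vals})
# time-based windows need a DatetimeIndex, so parse the strings into timestamps
temp.index = pd.to_datetime(dates)
# create a new column with the rolling 24-hour average of the 'values' column
temp['rolling_avg'] = temp['values'].rolling('24h', min_periods=1).mean()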
Output:
>>> temp
                     values  rolling_avg
2018-01-03 23:26:00   59.95    59.950000
2018-01-04 00:14:00   60.11    60.030000
2018-01-04 03:10:00   62.05    60.703333
2018-01-05 03:47:00   59.98    59.980000
2018-01-05 04:47:00   60.01    59.995000
2018-01-06 05:44:00   61.15    61.150000
2018-01-06 19:00:00   60.35    60.750000
2018-01-06 20:36:00   60.61    60.703333
2018-01-07 21:34:00   59.99    59.990000
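If you would rather keep the dates as a regular column, as in the question's original DataFrame, a possible variant is to pass the column name through the on parameter of rolling(). This is only a sketch: it assumes the 'date' column has been converted to datetime and is sorted in ascending order, which rolling() needs for time-based windows.

```python
import pandas as pd

dates = ['2018-01-03 23:26:00', '2018-01-04 00:14:00', '2018-01-04 03:10:00',
         '2018-01-05 03:47:00', '2018-01-05 04:47:00', '2018-01-06 05:44:00',
         '2018-01-06 19:00:00', '2018-01-06 20:36:00', '2018-01-07 21:34:00']
vals = [59.95, 60.11, 62.05, 59.98, 60.01, 61.15, 60.35, 60.61, 59.99]

# Keep 'date' as an ordinary column, but convert the strings to datetime64
temp = pd.DataFrame({'date': pd.to_datetime(dates), 'values': vals})

# on='date' tells rolling() which column supplies the timestamps;
# selecting 'values' afterwards restricts the average to that column
temp['rolling_avg'] = temp.rolling('24h', on='date')['values'].mean()

print(temp)
```

The averages should match the index-based version above. The only difference is that the DataFrame keeps its default integer index.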
Upvotes: 1