Reputation: 5496
I have the following time series:
Date Value
0 2006-01-03 18
1 2006-01-04 12
2 2006-01-05 11
3 2006-01-06 10
4 2006-01-09 22
... ... ...
3510 2019-12-23 47
3511 2019-12-24 46
3512 2019-12-26 35
3513 2019-12-27 35
3514 2019-12-30 28
I want to calculate the average values per month. So the pseudocode for each month is as follows:
The desired output would be something similar to:
Date Value
0 2006-01 17.45
1 2006-02 18.23
2 2006-04 16.79
3 2006-05 17.98
... ... ...
166 2019-11 37.89
167 2019-12 36.34
I have tried this without success:
data = data.set_index('Date')
data.resample('M')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-28-435afe449f1f> in <module>
47 data = pd.DataFrame(dataList, columns=('Date', 'Value'))
48 data = data.set_index('Date')
---> 49 data.resample('M')
Upvotes: 9
Views: 33457
Reputation: 402333
We can convert your datetime column into a PeriodIndex
on monthly frequency, then take the mean using GroupBy.mean
:
df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean()
Date
2006-01 14.6
2019-12 38.2
Freq: M, Name: Value, dtype: float64
df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean().reset_index()
Date Value
0 2006-01 14.6
1 2019-12 38.2
One caveat of this approach is that missing months are not shown. If that's important, use set_index
and resample.mean
in the same way.
Upvotes: 14
Reputation: 2269
You could try something like this, which even does not require the change on the index:
data_month = data.resample('M', on='Date').mean()
Please, note that the resample itself does not do the trick by itself. The .mean()
is required.
More on the documentation :)
Upvotes: 10