M.E.
M.E.

Reputation: 5496

Get monthly average in pandas

I have the following time series:

        Date        Value
0       2006-01-03  18
1       2006-01-04  12
2       2006-01-05  11
3       2006-01-06  10
4       2006-01-09  22
...     ...     ...
3510    2019-12-23  47
3511    2019-12-24  46
3512    2019-12-26  35
3513    2019-12-27  35
3514    2019-12-30  28

I want to calculate the average values per month. So the pseudocode for each month is as follows:

  1. Sum all the values for each day present in that month
  2. Divide by the number of days with data for that month.

The desired output would be something similar to:

        Date        Value
0       2006-01     17.45
1       2006-02     18.23
2       2006-04     16.79
3       2006-05     17.98
...     ...     ...
166     2019-11     37.89
167     2019-12     36.34

I have tried this without success:

data = data.set_index('Date')
data.resample('M')

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-435afe449f1f> in <module>
     47 data = pd.DataFrame(dataList, columns=('Date', 'Value'))
     48 data = data.set_index('Date')
---> 49 data.resample('M')

Upvotes: 9

Views: 33457

Answers (2)

cs95
cs95

Reputation: 402333

We can convert your datetime column into a PeriodIndex on monthly frequency, then take the mean using GroupBy.mean:

df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean()
    
Date
2006-01    14.6
2019-12    38.2
Freq: M, Name: Value, dtype: float64

df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean().reset_index()

      Date  Value
0  2006-01   14.6
1  2019-12   38.2

One caveat of this approach is that missing months are not shown. If that's important, use set_index and resample.mean in the same way.

Upvotes: 14

nunohpinheiro
nunohpinheiro

Reputation: 2269

You could try something like this, which even does not require the change on the index:

data_month = data.resample('M', on='Date').mean()

Please, note that the resample itself does not do the trick by itself. The .mean() is required.

More on the documentation :)

Upvotes: 10

Related Questions