user308827

Reputation: 21961

Find annual average of pandas dataframe with date column

        id        vi       dates   f_id
0  5532714  0.549501  2015-07-07  ff_22
1  5532715  0.540969  2015-07-08  ff_22
2  5532716  0.531477  2015-07-09  ff_22
3  5532717  0.521029  2015-07-10  ff_22
4  5532718  0.509694  2015-07-11  ff_22

In the dataframe above, I want to find the average value of vi for each year. This does not work:

df.groupby(df.dates.year)['vi'].transform(mean)

I get this error: AttributeError: 'Series' object has no attribute 'year'. How can I fix this?

Upvotes: 8

Views: 17505

Answers (3)

tags

Reputation: 4059

This updates and completes @piRSquared's example below for recent versions of pandas (e.g. v1.1.0), using pd.Grouper instead of the deprecated pd.TimeGrouper:

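import pandas as pd
import numpy as np

# Four years of daily data, indexed by date
tidx = pd.date_range('2010-01-01', '2013-12-31', name='dates')
np.random.seed([3,1415])
df = pd.DataFrame(dict(vi=np.random.rand(tidx.size)), tidx)

# One mean per calendar year, labelled by year end
# (newer pandas releases, 2.2 and later, prefer the alias 'YE' over 'Y')
df.groupby(pd.Grouper(freq='1Y')).mean()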
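Since the spelling of the year-end alias depends on the pandas version, one way around it is to group on the year of the index directly. This is only a sketch of that alternative, not part of the original example: the seed and the use of np.random.default_rng are arbitrary choices, and the result is labelled by integer year rather than by a year-end timestamp.

```python
import numpy as np
import pandas as pd

# Four years of daily data, indexed by date (same shape as the example above)
tidx = pd.date_range("2010-01-01", "2013-12-31", name="dates")
rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
df = pd.DataFrame({"vi": rng.random(tidx.size)}, index=tidx)

# DatetimeIndex exposes .year directly (no .dt needed on an index),
# so no frequency alias is involved in the grouping
yearly = df.groupby(df.index.year)["vi"].mean()
print(yearly)
```

The result has one row per year (2010 through 2013), indexed by the integer year.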

Upvotes: 1

piRSquared

Reputation: 294218

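You can also use pd.TimeGrouper with the frequency A (annual, year end). Note that pd.TimeGrouper has since been removed from pandas; pd.Grouper(freq=...) is its replacement.

Consider the dataframe df, consisting of four years of daily data: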

tidx = pd.date_range('2010-01-01', '2013-12-31', name='dates')

np.random.seed([3,1415])
df = pd.DataFrame(dict(vi=np.random.rand(tidx.size)), tidx)

df.groupby(pd.TimeGrouper('A')).mean()

                  vi
dates               
2010-12-31  0.465121
2011-12-31  0.511640
2012-12-31  0.491363
2013-12-31  0.516614

Upvotes: 0

Scott Boston

Reputation: 153460

A Series only exposes datetime attributes such as year through its .dt accessor, which is why df.dates.year fails. First make sure that dates has a datetime dtype, then use .dt.year:

df['dates'] = pd.to_datetime(df.dates)
df.groupby(df.dates.dt.year)['vi'].transform('mean')

Output:

0    0.530534
1    0.530534
2    0.530534
3    0.530534
4    0.530534
Name: vi, dtype: float64
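Note that transform('mean') repeats the yearly mean on every row, whereas calling .mean() on the same groupby returns one value per year. The sketch below rebuilds the five sample rows from the question to show both; the variable names years, per_row, and per_year are only illustrative.

```python
import pandas as pd

# The five sample rows from the question; dates start out as plain strings
df = pd.DataFrame({
    "id": [5532714, 5532715, 5532716, 5532717, 5532718],
    "vi": [0.549501, 0.540969, 0.531477, 0.521029, 0.509694],
    "dates": ["2015-07-07", "2015-07-08", "2015-07-09", "2015-07-10", "2015-07-11"],
    "f_id": ["ff_22"] * 5,
})

# Convert to datetime so that the .dt accessor is available
df["dates"] = pd.to_datetime(df["dates"])
years = df["dates"].dt.year

# Same length as df: each row carries the mean of its own year
per_row = df.groupby(years)["vi"].transform("mean")

# One entry per year: the aggregated yearly average
per_year = df.groupby(years)["vi"].mean()

print(per_row)
print(per_year)
```

Use the transform form when the yearly mean is needed as a new column alongside the original rows, and the aggregated form when a per-year summary is the goal.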

Upvotes: 11
