Reputation: 21961
        id        vi       dates   f_id
0  5532714  0.549501  2015-07-07  ff_22
1  5532715  0.540969  2015-07-08  ff_22
2  5532716  0.531477  2015-07-09  ff_22
3  5532717  0.521029  2015-07-10  ff_22
4  5532718  0.509694  2015-07-11  ff_22
In the dataframe above, I want to find the average value of vi for each year. This does not work:
df.groupby(df.dates.year)['vi'].transform(mean)
I get this error: *** AttributeError: 'Series' object has no attribute 'year'
How can I fix this?
Upvotes: 8
Views: 17505
Reputation: 4059
Updating and completing @piRsquared's example below for recent versions of pandas (e.g. v1.1.0), using pd.Grouper instead of pd.TimeGrouper, which was deprecated. Note that from pandas 2.2 onward the yearly alias 'Y' is itself deprecated in favor of 'YE':
import pandas as pd
import numpy as np
tidx = pd.date_range('2010-01-01', '2013-12-31', name='dates')
np.random.seed([3,1415])
df = pd.DataFrame(dict(vi=np.random.rand(tidx.size)), tidx)
df.groupby(pd.Grouper(freq='1Y')).mean()
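If dates is an ordinary column, as in the question, rather than the index, pd.Grouper can be pointed at it with the key argument. The following is a minimal sketch that rebuilds the question's five rows by hand. It uses the year-start alias 'YS', chosen here only as an assumption to sidestep the year-end alias rename between pandas versions, so each group is labelled with January 1 of its year.

```python
import pandas as pd

# The question's sample rows, rebuilt by hand for illustration.
df = pd.DataFrame({
    "vi": [0.549501, 0.540969, 0.531477, 0.521029, 0.509694],
    "dates": pd.to_datetime([
        "2015-07-07", "2015-07-08", "2015-07-09", "2015-07-10", "2015-07-11",
    ]),
})

# key= tells Grouper which column holds the timestamps;
# 'YS' bins by calendar year and labels each bin with its first day.
yearly = df.groupby(pd.Grouper(key="dates", freq="YS"))["vi"].mean()
print(yearly)
```

This returns one row per year rather than one row per input row.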
Upvotes: 1
Reputation: 294218
You can also use pd.TimeGrouper with the frequency 'A' (year end). Consider the dataframe df, consisting of four years of daily data:
import pandas as pd
import numpy as np
tidx = pd.date_range('2010-01-01', '2013-12-31', name='dates')
np.random.seed([3,1415])
df = pd.DataFrame(dict(vi=np.random.rand(tidx.size)), tidx)
df.groupby(pd.TimeGrouper('A')).mean()
                  vi
dates
2010-12-31  0.465121
2011-12-31  0.511640
2012-12-31  0.491363
2013-12-31  0.516614
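When the dates are already the index, as in this example, another option that does not depend on any frequency alias is to group on the index's year attribute. A DatetimeIndex exposes .year directly, unlike a Series, which needs the .dt accessor. This is a rough sketch with different random data (a seeded default_rng generator is an assumption made here for reproducibility), so the numbers will not match the output above.

```python
import numpy as np
import pandas as pd

# Four years of daily data, indexed by date.
tidx = pd.date_range("2010-01-01", "2013-12-31", name="dates")
rng = np.random.default_rng(0)
df = pd.DataFrame({"vi": rng.random(tidx.size)}, index=tidx)

# DatetimeIndex has .year directly, so it can be used as the group key.
yearly = df.groupby(df.index.year).mean()
print(yearly)
```

The result is indexed by the integer year (2010 through 2013) instead of by a year-end timestamp.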
Upvotes: 0
Reputation: 153460
Let's make sure that dates has a datetime dtype, then use the .dt accessor as .dt.year:
df['dates'] = pd.to_datetime(df.dates)
df.groupby(df.dates.dt.year)['vi'].transform('mean')
Output:
0 0.530534
1 0.530534
2 0.530534
3 0.530534
4 0.530534
Name: vi, dtype: float64
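Note that transform('mean') broadcasts the yearly mean back onto every row, which is why the same value repeats five times. If a single value per year is wanted instead, calling .mean() on the same groupby should do it. Here is a small sketch comparing the two on the question's data (the dataframe is rebuilt by hand for illustration):

```python
import pandas as pd

# The question's sample rows, with dates initially stored as strings.
df = pd.DataFrame({
    "id": [5532714, 5532715, 5532716, 5532717, 5532718],
    "vi": [0.549501, 0.540969, 0.531477, 0.521029, 0.509694],
    "dates": ["2015-07-07", "2015-07-08", "2015-07-09", "2015-07-10", "2015-07-11"],
    "f_id": ["ff_22"] * 5,
})
df["dates"] = pd.to_datetime(df["dates"])
years = df["dates"].dt.year

# One value per input row: the mean of that row's year.
per_row = df.groupby(years)["vi"].transform("mean")

# One value per year, indexed by the year.
per_year = df.groupby(years)["vi"].mean()
print(per_year)
```

per_row is convenient for adding the yearly mean as a new column, while per_year is the compact summary.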
Upvotes: 11