Reputation: 2093
I have a pandas time-series DataFrame with a DatetimeIndex. I am trying to replace each daily value with the long-term average for its calendar month. For example,
if my two-year time-series DataFrame looks something like this:
df = pd.DataFrame({'data':np.random.rand(731)},index=pd.date_range('2000',periods=731))
Monthly mean:
mon_mean = df.groupby(df.index.month).mean()
The long-term averages look like this:
1 0.497286
2 0.536500
3 0.468002
4 0.477769
5 0.543201
6 0.520326
7 0.460261
8 0.524335
9 0.521869
10 0.516423
11 0.458476
12 0.494853
What I want is to replace every daily value in January with the long-term January average (i.e. 0.497286), and likewise for the other months, but I have not been able to do that.
Upvotes: 1
Views: 435
Reputation: 862511
Use GroupBy.transform
to create a new column filled with the aggregated values:
import numpy as np
import pandas as pd
np.random.seed(2019)
df = pd.DataFrame({'data':np.random.rand(731)},index=pd.date_range('2000',periods=731))
df['mon'] = df.groupby(df.index.month)['data'].transform('mean')
print(df)
data mon
2000-01-01 0.903482 0.482155
2000-01-02 0.393081 0.482155
2000-01-03 0.623970 0.482155
2000-01-04 0.637877 0.482155
2000-01-05 0.880499 0.482155
... ...
2001-12-27 0.755412 0.519518
2001-12-28 0.858582 0.519518
2001-12-29 0.884738 0.519518
2001-12-30 0.265324 0.519518
2001-12-31 0.948137 0.519518
[731 rows x 2 columns]
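If the goal is to overwrite the daily values rather than add a column, the same transform result can be assigned back to 'data'. The following is a minimal sketch under that assumption; the names `replaced` and `mapped` are illustrative and not from the original post. It also shows an alternative that looks up each row's month in the `mon_mean` table from the question, which should give the same values.

```python
import numpy as np
import pandas as pd

np.random.seed(2019)
df = pd.DataFrame({'data': np.random.rand(731)},
                  index=pd.date_range('2000', periods=731))

# Long-term mean per calendar month (index 1..12), as in the question
mon_mean = df.groupby(df.index.month)['data'].mean()

# Option 1: overwrite the daily values with the per-month mean.
# transform('mean') returns a Series aligned with the original index.
replaced = df.copy()
replaced['data'] = df.groupby(df.index.month)['data'].transform('mean')

# Option 2: map each row's month number onto the precomputed means
mapped = df.index.month.map(mon_mean).to_numpy()

print(replaced.head())
print(np.allclose(replaced['data'].to_numpy(), mapped))
```

Working on a copy keeps the original daily values available; assign directly to df['data'] if they are no longer needed.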
Upvotes: 2