Reputation: 215
I have a dataframe with days and downloads per user:
dates downloadsperuser
2004-01-02 12.51118760757315
2004-01-03 6.990049751243781
2004-01-04 6.8099547511312215
2004-01-05 22.513349514563107
2004-01-06 22.348538011695908
2004-01-07 23.895180722891567
2004-01-08 21.765680473372782
2004-01-09 20.34256926952141
2004-01-10 9.455938697318008
...
2004-02-01 9.196078431372548
2004-02-02 21.558398220244715
2004-02-03 22.293007769145394
2004-02-04 22.324115044247787
2004-02-05 21.88482834994463
2004-02-06 20.236781609195404
2004-02-07 10.708823529411765
2004-02-08 10.835329341317365
2004-02-09 24.87350054525627
2004-02-10 24.167035398230087
2004-02-11 22.676117775354417
2004-02-12 23.384444444444444
2004-02-13 20.674285714285713
2004-02-14 10.74914089347079
2004-02-15 11.64873417721519
...
2004-03-01 23.36965811965812
2004-03-02 23.127545551982852
2004-03-03 23.60235798499464
2004-03-04 23.634015069967706
2004-03-05 20.468996617812852
2004-03-06 6.608208955223881
2004-03-07 5.570446735395189
2004-03-08 23.48093220338983
2004-03-09 25.734190782422292
2004-03-10 24.919652551574377
...
I want to calculate the mean per month. So far I have tried:
df = pd.read_csv('downloadsperuser.csv', parse_dates=True)
df['dates']=pd.to_datetime(df['dates'])
df['month'] = pd.PeriodIndex(df.dates, freq='M')
df['month'].value_counts().sort_index()
This gives me the month for each day, but I have no idea how to aggregate all the values in the column downloadsperuser per month.
Upvotes: 0
Views: 9391
Reputation: 3739
First extract the month and year, then group by both to find the mean:
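df['dates'] = pd.to_datetime(df['dates'])
df['month'] = df['dates'].dt.month
df['year'] = df['dates'].dt.year
# Average only the downloadsperuser column within each (year, month) group
df.groupby(['year', 'month'], as_index=False)['downloadsperuser'].mean()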
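As a rough, self-contained sketch of the year/month grouping, the snippet below builds a tiny frame from five rows copied out of the question (the full CSV is not reproduced, so the printed means are for this sample only). The variable name monthly_mean is just illustrative.

```python
import pandas as pd

# Five rows taken from the question's sample; not the full dataset.
df = pd.DataFrame({
    "dates": ["2004-01-02", "2004-01-03", "2004-02-01", "2004-02-02", "2004-03-01"],
    "downloadsperuser": [
        12.51118760757315, 6.990049751243781,
        9.196078431372548, 21.558398220244715,
        23.36965811965812,
    ],
})

# Convert once, then derive the grouping keys from the datetime column.
df["dates"] = pd.to_datetime(df["dates"])
df["year"] = df["dates"].dt.year
df["month"] = df["dates"].dt.month

# One row per (year, month), holding the mean of downloadsperuser.
monthly_mean = df.groupby(["year", "month"], as_index=False)["downloadsperuser"].mean()
print(monthly_mean)
```

Selecting the downloadsperuser column before calling mean() keeps the result limited to that one value column, regardless of what other columns the frame has.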
Upvotes: 3
Reputation: 3528
You can try:
df['dates'] = pd.to_datetime(df['dates'])
# Group by a 'YYYY-MM' string key and take the mean of each month
df_2 = df.groupby(df['dates'].dt.strftime('%Y-%m'))['downloadsperuser'].agg(['mean'])
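Since the question mentions both a mean and a sum, here is a possible variant that computes both in one pass. It is only a sketch on five rows copied from the question, and it groups by a monthly Period (as the question's own PeriodIndex attempt does) rather than by a formatted string; the name monthly is illustrative.

```python
import pandas as pd

# Five rows taken from the question's sample; not the full dataset.
df = pd.DataFrame({
    "dates": ["2004-01-02", "2004-01-03", "2004-02-01", "2004-02-02", "2004-03-01"],
    "downloadsperuser": [
        12.51118760757315, 6.990049751243781,
        9.196078431372548, 21.558398220244715,
        23.36965811965812,
    ],
})
df["dates"] = pd.to_datetime(df["dates"])

# Group by calendar month (as a Period) and compute mean and sum together.
monthly = (
    df.groupby(df["dates"].dt.to_period("M"))["downloadsperuser"]
      .agg(["mean", "sum"])
)
print(monthly)
```

The result is indexed by month, with one column per aggregation, so adding another statistic should only require extending the list passed to agg.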
Upvotes: 2