Reputation: 301
I have a dataset spanning 2002 to 2018 that contains one value per month, 198 rows in total.
I want to know how I can average all the values that fall in the same calendar month across years (e.g. January/2003 + ... + January/2018).
import pandas as pd
# parse_dates handles ISO dates (YYYY-MM-DD) directly; pd.datetime was removed in pandas 2.0 and date_parser is deprecated
df = pd.read_csv('turbidez.csv', parse_dates=['date'], index_col='date')
data = df['x']
data.head()
date
2002-07-31 8.466111
2002-08-31 6.234259
2002-09-30 8.160763
2002-10-31 4.927685
2002-11-30 8.125012
After searching a bit I found this solution, but I couldn't apply it properly to my data.
Thank you in advance for any assistance.
Upvotes: 0
Views: 555
Reputation: 29742
Use pandas.to_datetime and pandas.Series.dt.month:
# Sample data
date x
0 2002-07-31 8.466111
1 2003-07-31 6.234259
2 2002-09-30 8.160763
3 2003-09-30 4.927685
4 2002-11-30 8.125012
df["date"] = pd.to_datetime(df["date"])
# Group by calendar month and take the mean, since the question asks for an average
new_df = df.groupby(df["date"].dt.month)[["x"]].mean()
print(new_df)
Output:
x
date
7 7.350185
9 6.544224
11 8.125012
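Since the question asks for the average per calendar month and keeps the dates in the index rather than in a column, the month can also be read straight from the index. A minimal sketch, assuming a DatetimeIndex and a column named x as in the question; the small Series below is a hypothetical stand-in for turbidez.csv:

```python
import pandas as pd

# Hypothetical sample mirroring the question's layout: month-end dates as the index.
data = pd.Series(
    [8.466111, 6.234259, 8.160763, 4.927685, 8.125012],
    index=pd.to_datetime(
        ["2002-07-31", "2003-07-31", "2002-09-30", "2003-09-30", "2002-11-30"]
    ),
    name="x",
)

# Group by the calendar month (1-12) taken from the DatetimeIndex, then average
# all values that share a month across years.
monthly_mean = data.groupby(data.index.month).mean()
print(monthly_mean)
```

Swapping .mean() for .sum() or .median() gives other per-month aggregates with the same grouping.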
Upvotes: 1