Reputation: 75
I am working on the following DataFrame:
df
Out[1]:
temp_C
date
2013-01-01 12
2013-01-02 11
2013-01-03 10
2013-01-04 9
2013-01-05 10
2013-01-06 10
2013-01-07 11
2013-01-08 12
2013-01-09 14
2013-01-10 14
2013-01-11 12
2013-01-12 12
2013-01-13 11
2013-01-14 10
2013-01-15 10
2013-01-16 12
2013-01-17 13
...
2017-01-02 8
2017-01-03 8
2017-01-04 8
2017-01-05 9
2017-01-06 9
2017-01-07 10
2017-01-08 12
2017-01-09 14
2017-01-10 14
2017-01-11 10
2017-01-12 10
2017-01-13 11
2017-01-14 14
2017-01-15 13
2017-01-16 10
2017-01-17 9
[1770 rows x 1 columns]
I need to group the values by day of the year and compute the mean (or median) of each group, producing a new DataFrame in which each day's value is the mean/median/... of all the values recorded on that same calendar day across the years.
Here's an example:
df_grouped
Out[2]:
temp_C
date
2013-01-01 12
2014-01-01 10
2015-01-01 10
2016-01-01 12
2017-01-01 11
2013-01-02 11
2014-01-02 10
....
2016-12-31 8
2017-12-31 7
df_mean
Out[3]:
temp_C
date
1970-01-01 11 #the year is not meaningful anymore
1970-01-02 11.4
1970-01-03 12.5
....
1970-12-30 7.5
1970-12-31 7.5
Thank you.
Upvotes: 1
Views: 1389
Reputation: 294516
import pandas as pd

df = pd.DataFrame(
    {'temp_C': range(10)},
    pd.to_datetime([
        '2010-01-23', '2012-03-30',
        '2013-01-23', '2013-03-30',
        '2014-01-23', '2014-03-30',
        '2016-01-23', '2015-03-30',
        '2017-01-23', '2017-03-30',
    ])
)
groupby
df.groupby('{:%m-%d}'.format).mean()
temp_C
01-23 4
03-30 5
Strings have a format method that you can use as a callable: it takes arguments and interpolates them into a new string.
'{:%m-%d}'.format is a callable that takes a single positional argument, which is processed according to what is inside the {} of the string. In this case the format spec %m-%d is specific to dates: these are the standard strftime codes, where %m is the zero-padded month and %d is the zero-padded day. In other words, when given a date, format it as month-day.
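To see what the callable does on its own, here is a small sketch using only the standard library. The names fmt, d, label and same are made up for illustration; a pandas Timestamp should behave the same way, since it subclasses datetime.

```python
import datetime as dt

# Bound method: fmt(x) is the same as '{:%m-%d}'.format(x)
fmt = '{:%m-%d}'.format

d = dt.date(2013, 1, 23)
label = fmt(d)
print(label)  # 01-23

# date.__format__ hands the spec to strftime, so this gives the same string
same = d.strftime('%m-%d')
print(label == same)  # True
```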
When you pass a callable to groupby, it is applied to each element of the index. Since our index is a DatetimeIndex, each element is mapped to its month-and-day string. That is precisely the grouping key we need in order to take the mean.
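As a rough sketch of how this maps onto the question, the snippet below groups a tiny made-up frame (sample is hypothetical data, not the asker's) by month-day, once with the callable and once with index.strftime, which should produce the same keys. It also shows that median works the same way as mean.

```python
import pandas as pd

# Hypothetical data: two calendar days observed in three different years
idx = pd.to_datetime([
    '2013-01-01', '2013-01-02',
    '2014-01-01', '2014-01-02',
    '2015-01-01', '2015-01-02',
])
sample = pd.DataFrame({'temp_C': [12, 11, 10, 10, 14, 12]}, index=idx)

# The callable is applied to each index label (a Timestamp)
by_callable = sample.groupby('{:%m-%d}'.format).mean()

# Same keys built in one step from the DatetimeIndex
keys = sample.index.strftime('%m-%d')
by_strftime = sample.groupby(keys).mean()

# Any other aggregation works the same way, e.g. the median
medians = sample.groupby(keys).median()

print(by_callable)
print(medians)
```

Grouping on a month-day string rather than on index.dayofyear keeps dates after February aligned across leap and non-leap years, which is probably what is wanted here.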
Upvotes: 2