Reputation: 3221
Given a series like this
Date
2005-01-01 128
2005-01-02 72
2005-01-03 67
2005-01-04 61
2005-01-05 33
Name: Data_Value, dtype: int64
for several years, how do I group all the January 1sts together, all the January 2nds, etc?
I'm actually trying to find the max for each day of the year across several years, so it does not have to be groupby. If there is an easier way to do this, that would be great.
Upvotes: 4
Views: 11457
Reputation: 369
Why to not just keep it simple!
max_temp = dfall.groupby([(dfall.Date.dt.month),(dfall.Date.dt.day)])['Data_Value'].max()
Upvotes: 2
Reputation: 1049
As an alternative, you can use a pivot table:
Reset index and format date columns
df=df.reset_index()
df['date']=pd.to_datetime(df['index'])
df['year']=df['date'].dt.year
df['month']=df['date'].dt.month
df['day']=df['date'].dt.day
Pivot over the month and day columns:
df_grouped=df.pivot_table(index=('month','day'),values='Date',aggfunc='max')
Upvotes: 2
Reputation: 402293
You can convert your index to datetime, then use strftime
to get a date formatted string to group on:
df.groupby(pd.to_datetime(df.index).strftime('%b-%d'))['Date_Value'].max()
If there are no NaNs in your date string, you can slice as well. This returns strings of the format "MM-DD":
df.groupby(df.index.astype(str).str[5:])['Date_Value'].max()
Upvotes: 4