Rashida

Reputation: 491

How do I group by only the month and day part of 'Date' and get the maximum and minimum of each group?

I have a large DataFrame with data from 2005 to 2014. Here is the head() of my DataFrame:

             ID       Date Element  Data_Value
2   USC00087020 2005-12-06    TMAX         272
5   USC00084095 2006-07-25    TMAX         328
6   USC00084095 2011-07-26    TMAX         333
7   USC00088841 2008-10-26    TMAX         294
12  USC00085667 2015-10-07    TMAX         300
14  USC00087760 2013-04-02    TMAX         322
15  USR0000FCHE 2010-05-26    TMAX         311
16  USC00088841 2007-12-27    TMAX         256

I need to group by month and day, irrespective of the year. I used the following code:

df.groupby(pd.Grouper(key='Date',freq='M')).agg({'Data_Value':np.max})

and got the result like this:

            Data_Value
Date                  
2005-01-31         294
2005-02-28         300
2005-03-31         344
2005-04-30         322
2005-05-31         367
2005-06-30         383
2005-07-31         372
2005-08-31         361

But I need the following format, so that I can take the max and min of each month across all the years:

Date    Data_Value
Jan     217
Feb     240
Mar     228
Apr     190
May     250

How can I achieve it? Please help. Thanks in advance

Upvotes: 2

Views: 95

Answers (2)

Chris

Reputation: 29742

Using pandas.to_datetime:

import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
df.groupby(df['Date'].dt.month)['Data_Value'].max()

Date
4     322
5     311
7     333
10    300
12    272
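
The question asks for both the max and the min, with month labels like Jan and Feb. A minimal sketch of one way to get there, assuming the sample rows from the question's head() (the names `monthly` and the rebuilt `df` are only for illustration):

```python
import calendar
import pandas as pd

# Sample rows copied from the question's head() output (illustrative subset).
df = pd.DataFrame({
    "Date": ["2005-12-06", "2006-07-25", "2011-07-26", "2008-10-26",
             "2015-10-07", "2013-04-02", "2010-05-26", "2007-12-27"],
    "Data_Value": [272, 328, 333, 294, 300, 322, 311, 256],
})
df["Date"] = pd.to_datetime(df["Date"])

# Group on the month number so that rows from every year fall into the same group,
# then compute both aggregates at once.
monthly = df.groupby(df["Date"].dt.month)["Data_Value"].agg(["max", "min"])

# Relabel 1..12 as Jan..Dec. The groups were sorted by month number,
# so the rows stay in calendar order.
monthly.index = [calendar.month_abbr[int(m)] for m in monthly.index]
monthly.index.name = "Date"
print(monthly)
```

Grouping on the number and relabelling afterwards keeps the rows in calendar order, whereas grouping directly on a month-name string would sort them alphabetically.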

Upvotes: 1

BENY

Reputation: 323226

You can first convert your Date column to datetime with to_datetime, then group by only the month part of that column:

df.Date=pd.to_datetime(df.Date)

df.groupby(df.Date.dt.strftime('%B')).Data_Value.max()
Out[290]: 
Date
April       322
December    272
July        333
May         311
October     300
Name: Data_Value, dtype: int64
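
If the grouping should be by month and day together (as the question title says) rather than by month alone, the same idea extends to two keys. A rough sketch, again assuming the sample rows from the question; the names `month`, `day` and `by_day` are made up for the example:

```python
import pandas as pd

# Sample rows copied from the question's head() output (illustrative subset).
df = pd.DataFrame({
    "Date": ["2005-12-06", "2006-07-25", "2011-07-26", "2008-10-26",
             "2015-10-07", "2013-04-02", "2010-05-26", "2007-12-27"],
    "Data_Value": [272, 328, 333, 294, 300, 322, 311, 256],
})
df["Date"] = pd.to_datetime(df["Date"])

# Two grouping keys: month number and day of month. The year is never used,
# so e.g. every 26 July from 2005 onward lands in the same group.
month = df["Date"].dt.month.rename("month")
day = df["Date"].dt.day.rename("day")

by_day = df.groupby([month, day])["Data_Value"].agg(["max", "min"])
print(by_day)
```

The result has a (month, day) MultiIndex, so a single day can be looked up with something like by_day.loc[(7, 26)]. With only the eight sample rows every group holds one value; on the full data each group would span all the years.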

Upvotes: 1
