Reputation: 491
I have a large DataFrame with data from 2005 to 2014. Here is its head():
             ID        Date Element  Data_Value
2   USC00087020  2005-12-06    TMAX         272
5   USC00084095  2006-07-25    TMAX         328
6   USC00084095  2011-07-26    TMAX         333
7   USC00088841  2008-10-26    TMAX         294
12  USC00085667  2015-10-07    TMAX         300
14  USC00087760  2013-04-02    TMAX         322
15  USR0000FCHE  2010-05-26    TMAX         311
16  USC00088841  2007-12-27    TMAX         256
I need to group by month and day irrespective of the year. I used the following code:
df.groupby(pd.Grouper(key='Date',freq='M')).agg({'Data_Value':np.max})
and got the result like this:
Data_Value
Date
2005-01-31 294
2005-02-28 300
2005-03-31 344
2005-04-30 322
2005-05-31 367
2005-06-30 383
2005-07-31 372
2005-08-31 361
But I need the following format, so that I can take the max and min of each month across all the years:
Date Data_Value
Jan 217
Feb 240
Mar 228
Apr 190
May 250
How can I achieve this? Please help. Thanks in advance.
Upvotes: 2
Views: 95
Reputation: 29742
Using pandas.to_datetime:
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
df.groupby(df['Date'].dt.month)['Data_Value'].max()
Date
4 322
5 311
7 333
10 300
12 272
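The question also asks for the min alongside the max, and mentions grouping by month and day. A minimal sketch of both, assuming the eight sample rows from the question's head() output; the names `monthly` and `daily` are illustrative only:

```python
import pandas as pd

# Sample rows copied from the question's head() output.
df = pd.DataFrame({
    "ID": ["USC00087020", "USC00084095", "USC00084095", "USC00088841",
           "USC00085667", "USC00087760", "USR0000FCHE", "USC00088841"],
    "Date": ["2005-12-06", "2006-07-25", "2011-07-26", "2008-10-26",
             "2015-10-07", "2013-04-02", "2010-05-26", "2007-12-27"],
    "Element": ["TMAX"] * 8,
    "Data_Value": [272, 328, 333, 294, 300, 322, 311, 256],
})
df["Date"] = pd.to_datetime(df["Date"])

# Max and min per calendar month, pooled over all years.
monthly = df.groupby(df["Date"].dt.month)["Data_Value"].agg(["max", "min"])

# Max and min per (month, day) pair, again ignoring the year.
daily = df.groupby(
    [df["Date"].dt.month.rename("month"), df["Date"].dt.day.rename("day")]
)["Data_Value"].agg(["max", "min"])

print(monthly)
print(daily)
```

Grouping on `dt.month` (and `dt.day`) rather than on a `pd.Grouper` with `freq='M'` is what drops the year: the Grouper bins by month-end timestamps, so each year stays separate.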
Upvotes: 1
Reputation: 323226
You can first convert your Date column to datetime using to_datetime, then group by only the month part of that column:
df.Date=pd.to_datetime(df.Date)
df.groupby(df.Date.dt.strftime('%B')).Data_Value.max()
Out[290]:
Date
April 322
December 272
July 333
May 311
October 300
Name: Data_Value, dtype: int64
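Note that grouping on the month name sorts the result alphabetically (April, December, July, ...). If calendar order and short labels like Jan, Feb are wanted, one possible sketch is to group on the month number and relabel afterwards. This assumes the sample rows from the question and the default English locale for `calendar.month_abbr`; `by_month` is an illustrative name:

```python
import calendar
import pandas as pd

# Sample rows copied from the question's head() output.
df = pd.DataFrame({
    "Date": ["2005-12-06", "2006-07-25", "2011-07-26", "2008-10-26",
             "2015-10-07", "2013-04-02", "2010-05-26", "2007-12-27"],
    "Data_Value": [272, 328, 333, 294, 300, 322, 311, 256],
})
df["Date"] = pd.to_datetime(df["Date"])

# Group on the numeric month so the result is sorted in calendar order.
by_month = df.groupby(df["Date"].dt.month)["Data_Value"].max()

# Replace the month numbers with abbreviated names (locale-dependent).
by_month.index = [calendar.month_abbr[m] for m in by_month.index]

print(by_month)
```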
Upvotes: 1