rajeev

Reputation: 323

Group by month and date - python

I have a set of data points as shown below, with similar data points for each day of the year, for every year.

data1 = {'Date': {1: '01-01-2001',
  2: '01-01-2002',
  3: '01-01-2003',
  4: '01-01-2004',
  5: '01-01-2005',
  6: '01-01-2006',
  7: '01-01-2007',
  8: '01-01-2008',
  9: '01-01-2009',
 10: '01-01-2010'
   },
 'milesrun': {1: '15',
  2: '21',
  3: '19',
  4: '22',
  5: '16',
  6: '13',
  7: '22',
  8: '24',
  9: '17',
 10: '18'}}

How can I group by just the month and the day (ignoring the year) and get the minimum and the maximum values? I need the results in two separate dataframes, one for the minimum and one for the maximum. The resulting dataframe should look like this:

mth-date  value
 01-01     13
 01-02     14
 01-03     13
 ............
 ............
 12-29     18
 12-30     16
 12-31     17

I was able to extract the month and the day as separate columns, but I am unable to group by them correctly. I would like to avoid using numpy if there is an alternative.

Upvotes: 1

Views: 721

Answers (1)

jezrael

Reputation: 862641

Use dt.strftime to extract the month and day as a single key, then use groupby and aggregate with min and max:

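#if necessary, convert to datetimes first (assuming MM-DD-YYYY strings)
#df['Date'] = pd.to_datetime(df['Date'], format='%m-%d-%Y')
#if milesrun holds strings, as in the question's dict, convert to numbers
#df['milesrun'] = pd.to_numeric(df['milesrun'])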

dfboth = df['milesrun'].groupby(df['Date'].dt.strftime('%m-%d')).agg(['min','max'])
print(dfboth)
      min max
Date         
01-01  13  24

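Or, for two separate results (each one is a Series; call .reset_index() or .to_frame() on it if a DataFrame is needed):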

dfmin = df['milesrun'].groupby(df['Date'].dt.strftime('%m-%d')).min()
dfmax = df['milesrun'].groupby(df['Date'].dt.strftime('%m-%d')).max()
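
For completeness, here is a minimal end-to-end sketch that starts from the dict in the question and ends with two DataFrames in the requested mth-date / value layout. It assumes the dates are MM-DD-YYYY strings (the sample data cannot distinguish that from DD-MM-YYYY) and that milesrun should be compared as numbers rather than as strings. The column names mth-date and value are taken from the desired output in the question.

```python
import pandas as pd

# Sample data copied from the question (note that milesrun values are strings).
data1 = {
    'Date': {1: '01-01-2001', 2: '01-01-2002', 3: '01-01-2003',
             4: '01-01-2004', 5: '01-01-2005', 6: '01-01-2006',
             7: '01-01-2007', 8: '01-01-2008', 9: '01-01-2009',
             10: '01-01-2010'},
    'milesrun': {1: '15', 2: '21', 3: '19', 4: '22', 5: '16',
                 6: '13', 7: '22', 8: '24', 9: '17', 10: '18'},
}
df = pd.DataFrame(data1)

# Assumption: dates are MM-DD-YYYY; change the format string if they are not.
df['Date'] = pd.to_datetime(df['Date'], format='%m-%d-%Y')
# Convert to numbers so min/max compare numerically, not lexicographically.
df['milesrun'] = pd.to_numeric(df['milesrun'])

# Build the month-day key once and reuse it for both aggregations.
key = df['Date'].dt.strftime('%m-%d').rename('mth-date')
grouped = df.groupby(key)['milesrun']

# reset_index turns each aggregated Series into a two-column DataFrame.
dfmin = grouped.min().reset_index(name='value')
dfmax = grouped.max().reset_index(name='value')

print(dfmin)
print(dfmax)
```

Because the key is a zero-padded 'MM-DD' string, the default sorting done by groupby also gives calendar order (01-01 through 12-31). No numpy import is needed.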

Upvotes: 1
