noob

Reputation: 3811

Use key of groupby to create another column pandas python

df

order_date    month      year   Days  Data
2015-12-20     12        2014    1     3
2016-1-21      1         2014    2     3
2015-08-20     8         2015    1     1 
2016-04-12     4         2016    4     1

and so on

Code (finding the min, mean, and median of the days column, and the number of order_dates, month-wise for each respective year):

df1 = (df.groupby(["year", "month"])
     .agg(Min_days=("days", 'min'),
          Avg_days=("days", 'mean'),
          Median_days=('days','median'),
          Count = ('order_date', 'count'))
     .reset_index())

df1

    year  month  Min_days  Avg_days     Median_days  Count
    2015  1      9         12.56666666  10           4
    2015  2      10        13.67678788  9            3
    ........................................................
    2016  12     12        15.7889990   19           2
    and so on...

Issue at hand:

I want to add another column, Month Name, to the table, using the key month from df1.

Output I want:

    year  month  Min_days  Avg_days     Median_days  Count  Month Name
    2015  1      9         12.56666666  10           4      Jan
    2015  2      10        13.67678788  9            3      Feb
    ........................................................
    2016  12     12        15.7889990   19           2      Dec
    and so on...

This is what I am doing:

import calendar
df1['Month Name']=df1['month'].apply(lambda x:calendar.month_abbr[x])

But I am getting KeyError: 'month'. I am unable to use the key month to create the Month Name column. Please help.

Upvotes: 1

Views: 65

Answers (2)

anky

Reputation: 75080

You can try map with get_level_values for multiindex mapping:

import calendar
import pandas as pd
s = df1.index.get_level_values(1).map({i: e for i, e in enumerate(calendar.month_abbr)})
df1 = df1.assign(Month=pd.Series(s, index=df1.index))

Or, even simpler and without apply, if the dataframe does not have a MultiIndex and has already been reset, just use:

import numpy as np
df1 = df1.assign(Month=np.array(list(calendar.month_abbr))[df1['month'].to_numpy()])
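
For reference, here is a minimal sketch of both variants side by side. The small frame `agg` and its values are made up for illustration (they are not the asker's data), and the abbreviations assume the default English locale, since `calendar.month_abbr` is locale-dependent.

```python
import calendar

import numpy as np
import pandas as pd

# Hypothetical aggregated frame, standing in for df1 before reset_index().
idx = pd.MultiIndex.from_tuples(
    [(2015, 1), (2015, 2), (2016, 12)], names=["year", "month"]
)
agg = pd.DataFrame({"Count": [4, 3, 2]}, index=idx)

# Lookup table: 1 -> "Jan", ..., 12 -> "Dec" (index 0 of month_abbr is "").
abbr = {i: calendar.month_abbr[i] for i in range(1, 13)}

# Variant 1: month is still an index level, so map over that level.
month_level = agg.index.get_level_values("month")
with_level = agg.assign(Month=month_level.map(abbr).to_numpy())

# Variant 2: month is an ordinary column, so index a NumPy array with it.
flat = agg.reset_index()
names = np.array(list(calendar.month_abbr))
with_column = flat.assign(Month=names[flat["month"].to_numpy()])

print(with_level)
print(with_column)
```

Both variants should give the same Month values; which one applies depends only on whether month currently sits in the index or in the columns.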

Upvotes: 1

jezrael

Reputation: 862406

It seems there is no column named month; the likely reason is that month is a level of the MultiIndex.

Check it:

print (df1.index.names)
print (df1.columns.tolist())

So you need:

df1 = df1.reset_index()
df1['Month Name']=df1['month'].apply(lambda x:calendar.month_abbr[x])
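
A self-contained sketch of the diagnosis and the fix, in case it helps to reproduce it. The input frame below is invented to resemble the question's df (lowercase days column, made-up values), and the month names assume the default English locale.

```python
import calendar

import pandas as pd

# Hypothetical input resembling the question's df.
df = pd.DataFrame(
    {
        "order_date": ["2015-01-05", "2015-01-20", "2015-02-11", "2016-12-03"],
        "year": [2015, 2015, 2015, 2016],
        "month": [1, 1, 2, 12],
        "days": [9, 16, 10, 12],
    }
)

grouped = df.groupby(["year", "month"]).agg(
    Min_days=("days", "min"),
    Count=("order_date", "count"),
)

# Without reset_index(), the keys live in the index, not in the columns.
print(grouped.index.names)       # year and month are index levels
print(grouped.columns.tolist())  # only the aggregated columns

try:
    grouped["month"]
except KeyError as exc:
    print("KeyError:", exc)

# After reset_index(), month is a regular column and the lookup works.
flat = grouped.reset_index()
flat["Month Name"] = flat["month"].map(lambda m: calendar.month_abbr[m])
print(flat)
```

Passing as_index=False to groupby is another way to keep year and month as ordinary columns, which makes the separate reset_index() step unnecessary.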

Upvotes: 1
