Alex Wang
Alex Wang

Reputation: 481

Rename columns after groupby two columns

I'm trying to rename two columns after using a groupby of two columns.

fun = {'Age':{'mean_age':'mean', 'median_age':'median'}}
groupbyClass2 = mc_response.groupby(['Country','GenderSelect']).agg(fun).reset_index()
groupbyClass2.columns = groupbyClass2.columns.droplevel(0) 

The dataframe looks like the following:

                        mean_age    median_age
0   Argentina   Female  33.000000   33.0
1   Argentina   Male    33.294118   32.0
2   Australia   Female  35.000000   34.0
3   Australia   Male    37.158416   36.0

Now I want to rename the first column to 'Country' and the second column to 'Gender'. I tried the following code however the two columns will both be renamed to 'Gender'. How can I fix this?

groupbyClass2.rename(columns = {groupbyClass2.columns[0]:'Country', groupbyClass2.columns[1]:'Gender'},inplace = True)

Upvotes: 1

Views: 1234

Answers (2)

BhishanPoudel
BhishanPoudel

Reputation: 17164

This might be easier:

df.groupby(['Country','Gender'])['Age'].agg([np.mean,np.median]).add_suffix('_age').reset_index()

Output:

     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5

Upvotes: 0

jezrael
jezrael

Reputation: 863751

You can specify column for aggregate after groupby, so possible pass list of tuples for new columns names with aggregate functions, add DataFrame.rename_axis for MultiIndex names for columns names after reset_index:

print (mc_response)
     Country GenderSelect  Age
0  Argentina       Female   10
1  Australia         Male   20
2  Australia       Female   30
3  Australia         Male   43

fun = [('mean_age', 'mean'), ('median_age','median')]
groupbyClass2 = (mc_response.groupby(['Country','GenderSelect'])['Age']
                            .agg(fun)
                            .rename_axis(['Country','Gender'])
                            .reset_index())
print (groupbyClass2)
     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5

Your solution should working with assign new list - first values in list and then convert all columns with indexing:

df.columns = ['Country','Gender'] + df.columns[2:].tolist()

Upvotes: 3

Related Questions