Bahlsen
Bahlsen

Reputation: 177

Pandas groupby sum, keep specific column in the resulting data frame

Let's assume I have the following data set.

Name   Asset    Group   PN
Anton  10       Admin   124223
Brad   15       Planer  774555
Kevin  20       Admin   234987
Brad   10       Planer  774555
Peter  25       Planer  861254
Anton  10       Admin   124223
Peter  5        Planer  861254
Kevin  35       Admin   234987
David  10       AT      561256 

Now I want to aggregate over the column name, so I do:

agg = df.groupby('Name', as_index=False)['Asset'].sum()

Works like a charm. My output looks like this:

Name   Asset    
Anton  20       
Brad   25       
Kevin  55       
Peter  30       
David  10      

However, I'd like to Keep column "Group" without changing anything of this column, so that my output is going to look like this:

Name   Asset   Group 
Anton  20      Admin 
Brad   25      Planer 
Kevin  55      Admin 
Peter  30      Planer 
David  10      AT

How can I achieve the latter result?

Thank you very much for your help!

Upvotes: 0

Views: 1007

Answers (1)

jezrael
jezrael

Reputation: 862911

If possible each Name has same group then add column to list and aggregate sum:

agg = df.groupby(['Name', 'Group'], as_index=False, sort=False)['Asset'].sum()
print (agg)
    Name   Group  Asset
0  Anton   Admin     20
1   Brad  Planer     25
2  Kevin   Admin     55
3  Peter  Planer     30
4  David      AT     10

Or if need to add anya another column, then need aggregate all columns, e.g. by first or last:

agg = (df.groupby('Name', as_index=False, sort=False)
         .agg({'Asset':'sum', 'Group':'first', 'PN':'last'}))
print (agg)
    Name  Asset   Group      PN
0  Anton     20   Admin  124223
1   Brad     25  Planer  774555
2  Kevin     55   Admin  234987
3  Peter     30  Planer  861254
4  David     10      AT  561256

Upvotes: 3

Related Questions