Dance Party2

Reputation: 7536

Pandas Groupby Max of Multiple Columns

Given this data frame:

import pandas as pd
df = pd.DataFrame({'group':['a','a','b','c','c'],'strings':['ab','   ','   ','12','  '],'floats':[7.0,8.0,9.0,10.0,11.0]})

  group strings  floats
0     a      ab     7.0
1     a             8.0
2     b             9.0
3     c      12    10.0
4     c            11.0

I want to group by "group" and get the max value of strings and floats.

Desired result:

      strings  floats
group                
a          ab     8.0
b                 9.0
c          12    11.0

I know I can just do this:

df.groupby(['group'], sort=False)[['strings','floats']].max()

But in reality, I have many columns so I want to refer to all columns (save for "group") in one go.

I wish I could just do this:

df.groupby(['group'], sort=False)[x for x in df.columns if x != 'group'].max()

But, alas, "invalid syntax".

Upvotes: 1

Views: 101

Answers (1)

jezrael

Reputation: 862406

If you need the max of every column except "group", you can simply use:

df = df.groupby('group', sort=False).max()
print(df)
      strings  floats
group                
a          ab     8.0
b                 9.0
c          12    11.0
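
As a quick check of why no column selection is needed here, the sketch below rebuilds the frame from the question and confirms that the grouping key moves to the index, so only the remaining columns are aggregated. The name `result` is just an illustrative choice so that the original `df` is not overwritten.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'c', 'c'],
    'strings': ['ab', '   ', '   ', '12', '  '],
    'floats': [7.0, 8.0, 9.0, 10.0, 11.0],
})

# The grouping key becomes the index, so max() is applied
# to every other column without listing them.
result = df.groupby('group', sort=False).max()

print(result.columns.tolist())
print(result.index.tolist())
```

String columns are compared lexicographically, which is why 'ab' wins over the whitespace-only value in group a.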

Your second attempt works if you wrap the list comprehension in another pair of []:

df = df.groupby(['group'], sort=False)[[x for x in df.columns if x != 'group']].max()
print(df)
      strings  floats
group                
a          ab     8.0
b                 9.0
c          12    11.0
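
The extra brackets matter because a subscript needs a complete expression, and a bare comprehension is not one; the inner [] turns it into a list of column names. If the comprehension feels noisy, one possible alternative is `df.columns.drop('group')`, which returns the column labels without "group" in their original order. A minimal sketch comparing the two (the names `value_cols`, `by_list` and `by_drop` are only for illustration):

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'c', 'c'],
    'strings': ['ab', '   ', '   ', '12', '  '],
    'floats': [7.0, 8.0, 9.0, 10.0, 11.0],
})

# Column labels without the grouping key, order preserved.
value_cols = df.columns.drop('group')

# Selection via the list comprehension from the answer.
by_list = df.groupby('group', sort=False)[[x for x in df.columns if x != 'group']].max()

# Selection via Index.drop, converted to a plain list of names.
by_drop = df.groupby('group', sort=False)[list(value_cols)].max()

# True when both selections produce the same frame.
print(by_list.equals(by_drop))
```

Either form is mainly useful when you want to exclude more than just the grouping column; otherwise the plain `.max()` above is enough.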

Upvotes: 2
