Rob Kulseth
Rob Kulseth

Reputation: 193

Merging a pandas groupby result back into DataFrame

I have a DataFrame that looks like this...

   idn value  
0  ID1    25
1  ID1    30
2  ID2    30
3  ID2    50

I want to add another column to this frame that is the max 'value' grouped by 'idn'

I want a result that looks like this.

   idn value  max_val
0  ID1    25       30
1  ID1    30       30
2  ID2    30       50
3  ID2    50       50

I can extract the max of 'value' using a group by like so...

df[['idn', 'value']].groupby('idn')['value'].max()

However, I am unable to merge that result back into the original DataFrame.

What is the best way to get the desired result?

Thank You

Upvotes: 18

Views: 12218

Answers (3)

ur_mom
ur_mom

Reputation: 1

For anyone else that is interested. The one-liner method for this is to use reset_index.

as such:

df['value_max'] = df[['idn', 'value']].groupby('idn')['value'].max().reset_index(['idn', 'value'], drop = True)

Upvotes: 0

Paul H
Paul H

Reputation: 68116

Use the transform method on a groupby object:

In [5]: df['maxval'] = df.groupby(by=['idn']).transform('max')

In [6]: df
Out[6]: 
   idn  value  maxval
0  ID1     25      30
1  ID1     30      30
2  ID2     30      50
3  ID2     50      50

Upvotes: 20

Haleemur Ali
Haleemur Ali

Reputation: 28233

set the index of df to idn, and then use df.merge. after the merge, reset the index and rename columns

dfmax = df.groupby('idn')['value'].max()

df.set_index('idn', inplace=True)

df = df.merge(dfmax, how='outer', left_index=True, right_index=True)

df.reset_index(inplace=True)

df.columns = ['idn', 'value', 'max_value']

Upvotes: 2

Related Questions