jovicbg

Reputation: 1553

Groupby one column and find max (absolute) value of difference of other two columns in pandas

I have a dataframe with about 20 columns, but I need to group by one column called ID and calculate the difference between two others; let's call them value1 and value2. Example df:

ID  value1   value2
1     3         2
1     6         2
2     6         1
3     5         8
4     7         2
4     3         2

Desired output:

ID  value1   value2  maxabs
1     3         2      4
1     6         2      4
2     6         1      5
3     5         8      3
4     7         2      5
4     3         2      5

I've tried simply with this:

df['maxabs'] = df.groupby(['ID'])[(df['value1'] - df['value2'])].abs().idxmax()

The error said that the columns were not found and printed a lot of 'nan'. The columns are definitely there. Maybe I need to print 'nan' when both values are 'nan', but I'm not sure I'm even heading in the right direction.

Upvotes: 2

Views: 3552

Answers (2)

BENY

Reputation: 323396

Try this. P.S. You can also use merge or join; I'm just used to map.

df['maxabs']=df.ID.map(df.groupby(['ID']).apply(lambda x: max(abs(x.value1-x.value2))))

   ID  value1  value2  maxabs
0   1       3       2       4
1   1       6       2       4
2   2       6       1       5
3   3       5       8       3
4   4       7       2       5
5   4       3       2       5
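For comparison, the merge route mentioned above might look like the sketch below. It rebuilds the example frame from the question, and `per_id` and `result` are illustrative names rather than anything from the original answer.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4, 4],
    "value1": [3, 6, 6, 5, 7, 3],
    "value2": [2, 2, 1, 8, 2, 2],
})

# Build a small lookup table: one row per ID holding max |value1 - value2|.
per_id = (
    (df["value1"] - df["value2"]).abs()
    .groupby(df["ID"])
    .max()
    .rename("maxabs")
    .reset_index()
)

# A left merge on ID copies each group's maximum back onto its rows.
result = df.merge(per_id, on="ID", how="left")
print(result)
```

The merge returns a new frame instead of adding a column in place, which can be handy when the per-ID table is also needed on its own.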

Upvotes: 2

akuiper

Reputation: 215137

Switch the order of your calculation: first calculate the absolute difference between value1 and value2, then group by ID and take the max with transform:

df['maxabs'] = df.value1.sub(df.value2).abs().groupby(df.ID).transform('max')

df
#  ID   value1  value2  maxabs
#0  1        3       2       4
#1  1        6       2       4
#2  2        6       1       5
#3  3        5       8       3
#4  4        7       2       5
#5  4        3       2       5
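Since the question also asks about 'nan' values, here is a small sketch of how the same transform approach behaves with missing data. The frame below is made up for illustration and is not the asker's data. With pandas defaults, the group max skips NaN differences, and a group whose differences are all NaN ends up with NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values (not from the question).
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3],
    "value1": [3.0, 6.0, np.nan, 5.0, np.nan],
    "value2": [2.0, 2.0, 1.0, 8.0, 2.0],
})

# Absolute difference first, then the per-ID max broadcast to every row.
diff = (df["value1"] - df["value2"]).abs()
df["maxabs"] = diff.groupby(df["ID"]).transform("max")
print(df)
```

Here ID 2 has only a NaN difference, so its maxabs stays NaN, while ID 3 ignores its NaN row and keeps the maximum of the remaining values.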

Upvotes: 4
