Reputation: 1553
I have a dataframe with about 20 columns. I need to group by one column called ID and, within each group, calculate the maximum absolute difference between two columns; let's call them value1 and value2. Example df:
ID value1 value2
1 3 2
1 6 2
2 6 1
3 5 8
4 7 2
4 3 2
Desired output:
ID value1 value2 maxabs
1 3 2 4
1 6 2 4
2 6 1 5
3 5 8 3
4 7 2 5
4 3 2 5
I've tried simply with this:
df['maxabs'] = df.groupby(['ID'])[(df['value1'] - df['value2'])].abs().idxmax()
The error said that the columns were not found and printed a lot of 'nan'. The columns are definitely there. Maybe I need to handle the case where both values are 'nan' and output 'nan', but I'm not sure I'm even heading in the right direction.
Upvotes: 2
Views: 3552
Reputation: 323396
Try this. P.S. You can also use merge or join; I'm just used to map.
df['maxabs']=df.ID.map(df.groupby(['ID']).apply(lambda x: max(abs(x.value1-x.value2))))
ID value1 value2 maxabs
0 1 3 2 4
1 1 6 2 4
2 2 6 1 5
3 3 5 8 3
4 4 7 2 5
5 4 3 2 5
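For comparison, here is a rough sketch of the merge and join alternatives mentioned above. It rebuilds the example frame from the question; the names per_id, merged and joined are only illustrative, and this is one possible way to write it rather than the only one.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4, 4],
    "value1": [3, 6, 6, 5, 7, 3],
    "value2": [2, 2, 1, 8, 2, 2],
})

# Per-ID maximum absolute difference, as a Series indexed by ID.
per_id = (
    (df["value1"] - df["value2"])
    .abs()
    .groupby(df["ID"])
    .max()
    .rename("maxabs")
)

# Variant 1: merge on the ID column after turning the index into a column.
merged = df.merge(per_id.reset_index(), on="ID", how="left")

# Variant 2: join, matching the ID column against the index of per_id.
joined = df.join(per_id, on="ID")

print(merged)
print(joined)
```

Both variants add the same maxabs column as the map version; map just avoids building an intermediate frame.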
Upvotes: 2
Reputation: 215137
Switch the order of your calculation: calculate the difference between value1 and value2 first, then group by ID and take the max with transform:
df['maxabs'] = df.value1.sub(df.value2).abs().groupby(df.ID).transform('max')
df
# ID value1 value2 maxabs
#0 1 3 2 4
#1 1 6 2 4
#2 2 6 1 5
#3 3 5 8 3
#4 4 7 2 5
#5 4 3 2 5
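Regarding the 'nan' case raised in the question, here is a small sketch of how the same transform approach behaves with missing values. The data below is made up for illustration and is not from the question. The max skips NaN within a group by default, so a group only ends up as NaN when every difference in it is NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values, for illustration only.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3],
    "value1": [3.0, np.nan, np.nan, 5.0, 1.0],
    "value2": [2.0, 2.0, np.nan, 8.0, np.nan],
})

# Row-wise absolute difference; any row with a missing operand becomes NaN.
diff = df["value1"].sub(df["value2"]).abs()

# Group-wise max broadcast back to every row; NaN differences are ignored
# unless the whole group is NaN.
df["maxabs"] = diff.groupby(df["ID"]).transform("max")

print(df)
```

In this sketch ID 2 has no usable difference, so its maxabs stays NaN, while the other groups use whichever differences are available.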
Upvotes: 4