In Pandas how can I use transform and use information from other columns?

Question

I want something like R's mutate function, where the computation can use information from other columns. For example, I want to create a new column by first grouping the rows and then interpolating one column against another column of the same data frame. Every row in a group gets the same value in the new column.

I tried apply with broadcast, but it only produces NaN values.

import pandas as pd
import numpy as np

d = {'Gain': [20, 20,19,18,17,21,21,20,19,18],
     'Power':[30,31,32,33,34,33,34,35,36,37],
     'GRP':  ['A','A','A','A','A','B','B','B','B','B'],
     }
df = pd.DataFrame(data=d)

# Subtract the value of Gain from the maximum value: THIS STEP WORKS
df['dGain']=df.groupby(['GRP'])['Gain'].transform(lambda x: max(x) - x)

# DOES NOT WORK!!!
df['Pcomp']=df.groupby(['GRP']).transform(lambda x: np.interp(3,x.dGain,x.Power))

# DOES NOT WORK
df['Pcomp']=df.groupby(['GRP']).apply(lambda x: np.interp(3,x.dGain,x.Power))

I expected:

   Gain  Power GRP  Pcomp  dGain
0    20     30   A     33      0
1    20     31   A     33      0
2    19     32   A     33      1
3    18     33   A     33      2
4    17     34   A     33      3
5    21     33   B     36      0
6    21     34   B     36      0
7    20     35   B     36      1
8    19     36   B     36      2
9    18     37   B     36      3

BENY · Accepted Answer

transform is roughly equivalent to mutate in R's dplyr, but the two differ slightly: on a groupby object, transform passes one column at a time to the function, whereas mutate can work with multiple columns at once.

A quick fix

df['Pcomp']=df.groupby('GRP').apply(lambda x: np.interp(3,x['dGain'],x['Power'])).reindex(df.GRP).values
df
Out[828]: 
   Gain  Power GRP  dGain  Pcomp
0    20     30   A      0   34.0
1    20     31   A      0   34.0
2    19     32   A      1   34.0
3    18     33   A      2   34.0
4    17     34   A      3   34.0
5    21     33   B      0   37.0
6    21     34   B      0   37.0
7    20     35   B      1   37.0
8    19     36   B      2   37.0
9    18     37   B      3   37.0
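
An alternative sketch, not part of the original answer: compute one scalar per group explicitly and then broadcast it back to the rows with Series.map. The name per_group is only illustrative, and the data is the same as in the question. This avoids relying on positional alignment through reindex(...).values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Gain':  [20, 20, 19, 18, 17, 21, 21, 20, 19, 18],
    'Power': [30, 31, 32, 33, 34, 33, 34, 35, 36, 37],
    'GRP':   ['A'] * 5 + ['B'] * 5,
})

# Per-group difference from the maximum Gain (same step as in the question)
df['dGain'] = df.groupby('GRP')['Gain'].transform(lambda x: x.max() - x)

# One interpolated value per group, using two columns of each sub-frame.
# per_group is a hypothetical helper name; its index holds the group labels.
per_group = pd.Series({
    name: np.interp(3, g['dGain'], g['Power'])
    for name, g in df.groupby('GRP')
})

# Broadcast the per-group scalar back to every row by looking up the group label
df['Pcomp'] = df['GRP'].map(per_group)
print(df)
```

Because map looks each row up by its GRP label, the result does not depend on the row order of df. With this data it gives 34.0 for group A and 37.0 for group B, matching the output above.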
