Reputation: 2273
I have a DataFrame like this, with many rows:
BB AA FF
2 5 0
3 7 A
6 5 A
9 6 A
8 3 0
And a function like this:
def test(a, b):
    # a = array from column AA
    # b = array from column BB
    return (a * b) + a
For the rows where the value in column FF is != 0, I would like to apply the function test to that slice (array) of columns BB and AA, and write the result to a new column ZZ, with 0 everywhere else:
BB AA FF ZZ
2 5 0 0
3 7 A 28
6 5 A 35
9 6 A 60
8 3 0 0
I was thinking of something like:
df['zz']= df.apply(lambda x: test(df.AA,df.BB) for the range of values among zero)
But my issue is that I am not sure how to select the rows based on column FF so that the function is applied only to them.
Upvotes: 0
Views: 324
Reputation: 30920
You can use DataFrame.apply + mask:
def test(x):
    # x is one row holding the labels 'AA' and 'BB'
    return (x['AA'] * x['BB']) + x['AA']
df['ZZ'] = df[['AA', 'BB']].apply(test, axis=1).mask(df['FF'].eq('0'), 0)
print(df)
BB AA FF ZZ
0 2 5 0 0
1 3 7 A 28
2 6 5 A 35
3 9 6 A 60
4 8 3 0 0
Or you can use a lambda function:
df['ZZ'] = df.apply(lambda x: x[['BB', 'AA']].prod() + x['AA'] if x['FF'] != '0' else 0, axis=1)
print(df)
BB AA FF ZZ
0 2 5 0 0
1 3 7 A 28
2 6 5 A 35
3 9 6 A 60
4 8 3 0 0
Upvotes: 1