Applying a defined function over different ranges of rows in pandas

Question

I have a DataFrame like this, with many more rows:

    BB      AA    FF        
     2      5      0    
     3      7      A
     6      5      A
     9      6      A
     8      3      0

And a function like this:

def test(a,b):
    # a=array col AA
    # b=array col BB
    return (a*b)+a

For the rows where column FF is not 0, I would like to apply the function test to the corresponding slices (arrays) of columns AA and BB, and store the result in a new column ZZ, like this:

    BB      AA    FF   ZZ      
     2      5      0   0 
     3      7      A   28
     6      5      A   35
     9      6      A   60
     8      3      0   0

I was thinking of something like:

df['zz']= df.apply(lambda x: test(df.AA,df.BB) for the range of values among zero)

But my issue is that I am not sure how to specify the ranges of rows defined by column FF that the function should be applied to.

ansev · Accepted Answer

You can use DataFrame.apply + mask:

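def test(x):
    # x is one row holding the AA and BB values; label access avoids
    # the deprecated positional x[0] / x[1] lookup on a labelled Series
    return (x['AA']*x['BB'])+x['AA']
# compute row by row, then set ZZ to 0 wherever FF is '0'
df['ZZ']=df[['AA','BB']].apply(test,axis=1).mask(df['FF'].eq('0'),0)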
print(df)

   BB  AA FF  ZZ
0   2   5  0   0
1   3   7  A  28
2   6   5  A  35
3   9   6  A  60
4   8   3  0   0
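
As a small illustration of what mask does in the line above, here is a sketch on standalone Series. The values are the row-wise results of (AA*BB)+AA for the sample data, and the assumption is that FF holds the strings '0' and 'A'. The names s and flag are only for this example:

```python
import pandas as pd

# (AA*BB)+AA for every row of the sample data, before any masking
s = pd.Series([15, 28, 35, 60, 27])
# stand-in for column FF (assumed to contain strings)
flag = pd.Series(['0', 'A', 'A', 'A', '0'])

# mask replaces the values where the condition is True
masked = s.mask(flag.eq('0'), 0)
# where is the complement: it keeps the values where the condition is True
kept = s.where(flag.ne('0'), 0)

print(masked.tolist())  # [0, 28, 35, 60, 0]
print(kept.tolist())    # same result as masked
```

So mask(cond, 0) and where(~cond, 0) are two ways of writing the same replacement.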

Or you can use a lambda function:

df['ZZ']=df.apply(lambda x: x[['BB','AA']].prod()+ x['AA'] if x['FF'] != '0' else 0,axis=1)
print(df)

   BB  AA FF  ZZ
0   2   5  0   0
1   3   7  A  28
2   6   5  A  35
3   9   6  A  60
4   8   3  0   0
