Reputation: 1955
I have a function that should take x, y and z as input and return r as output. For example, my_func(x, y, z) takes x = 10, y = 'apple' and z = 2 and returns the value for column r. Similarly, it takes x = 20, y = 'orange' and z = 4 and populates column r. Any suggestions for an efficient way to write this?
Before:
a x y z
5 10 'apple' 2
2 20 'orange' 4
0 4 'apple' 2
5 5 'pear' 6
After:
a x y z r
5 10 'apple' 2 x
2 20 'orange' 4 x
10 4 'apple' 2 x
5 5 'pear' 6 x
Upvotes: 2
Views: 1365
Reputation: 117540
It depends on how complex your function is. In general you can use pandas.DataFrame.apply:
>>> def my_func(x):
... return '{0} - {1} - {2}'.format(x['y'],x['a'],x['x'])
...
>>> df['r'] = df.apply(my_func, axis=1)
>>> df
a x y z r
0 5 10 'apple' 2 'apple' - 5 - 10
1 2 20 'orange' 4 'orange' - 2 - 20
2 0 4 'apple' 2 'apple' - 0 - 4
3 5 5 'pear' 6 'pear' - 5 - 5
axis=1 makes your function work 'for each row' instead of 'for each column':
Objects passed to functions are Series objects having index either the DataFrame’s index (axis=0) or the columns (axis=1)
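To make the quoted sentence concrete, here is a small sketch that inspects the index of the object handed to the function under each axis setting, and then uses the row labels to call a three-argument function like the one in the question. The body of my_func is purely hypothetical (the question does not say what it computes), and the y values are stored without the literal quote characters shown in the tables.

```python
import pandas as pd

# Sample data mirroring the question's "Before" table
df = pd.DataFrame({
    "a": [5, 2, 0, 5],
    "x": [10, 20, 4, 5],
    "y": ["apple", "orange", "apple", "pear"],
    "z": [2, 4, 2, 6],
})

# axis=1: each call receives one row as a Series indexed by the column names
row_view = df.apply(lambda row: ", ".join(row.index), axis=1)

# axis=0: each call receives one column as a Series indexed by the DataFrame's index
col_view = df.apply(lambda col: ", ".join(map(str, col.index)), axis=0)

# Hypothetical stand-in for the question's my_func; the real logic is unknown
def my_func(x, y, z):
    return f"{y}:{x * z}"

# Because each row is indexed by column name, the arguments can be picked by label
df["r"] = df.apply(lambda row: my_func(row["x"], row["y"], row["z"]), axis=1)
```

With axis=1 every entry of row_view lists the column names, while with axis=0 every entry of col_view lists the row labels 0 through 3.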
But if it's a really simple function, like the one above, you can probably do it without a function at all, using vectorized operations.
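As a rough sketch of what the vectorized route could look like for the formatting example above: the numeric columns are converted to strings with astype(str) and the pieces are joined with +, which operates on whole columns at once. This assumes the y column holds plain strings (without the literal quote characters shown in the output above).

```python
import pandas as pd

# Sample data mirroring the question's "Before" table
df = pd.DataFrame({
    "a": [5, 2, 0, 5],
    "x": [10, 20, 4, 5],
    "y": ["apple", "orange", "apple", "pear"],
    "z": [2, 4, 2, 6],
})

# Row-by-row version from the answer, kept here for comparison
def my_func(row):
    return "{0} - {1} - {2}".format(row["y"], row["a"], row["x"])

applied = df.apply(my_func, axis=1)

# Vectorized version: whole-column string concatenation, no per-row Python call
df["r"] = df["y"] + " - " + df["a"].astype(str) + " - " + df["x"].astype(str)
```

Both approaches produce the same strings here; the vectorized form usually scales better on large frames because it avoids calling a Python function once per row.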
Upvotes: 3