idpd15

Reputation: 456

How to apply a custom function to two columns of a pandas DataFrame?

I have a pandas DataFrame with two columns, say A and B.
All elements of columns A and B are strings.
For example:

        A      B  
0      str1   str2  
1      str3   str4  
2      str5   str6  
3      str7   str8  

I have a function f that takes two strings as input, does some non-trivial work, and returns an output.
For example:

def f(x, y):
    "do something to x and y to make z"
    return z

I want the output to look like this:

        A      B      C
0      str1   str2  f(str1, str2)
1      str3   str4  f(str3, str4)
2      str5   str6  f(str5, str6)
3      str7   str8  f(str7, str8)

I don't want to use loops because the DataFrame is very large.
How to apply the function f in a vectorized way to the columns A and B?

Upvotes: 1

Views: 56

Answers (2)

Mykola Zotko

Reputation: 17824

You can pass the columns to a function as arguments, but whether this works depends on the function you have. For example:

import numpy as np
df['C'] = np.add(df['A'], df['B'])

Result:

      A     B         C
0  str1  str2  str1str2
1  str3  str4  str3str4
2  str5  str6  str5str6
3  str7  str8  str7str8
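
The same idea can be sketched with a user-defined function, assuming its body uses only operations that pandas already applies element-wise. The function f below (joining with an underscore) is a hypothetical stand-in, not the function from the question:

```python
import pandas as pd

df = pd.DataFrame({"A": ["str1", "str3", "str5", "str7"],
                   "B": ["str2", "str4", "str6", "str8"]})

def f(x, y):
    # Hypothetical stand-in for the real f. It uses only +, which works
    # on two plain strings and on two Series alike.
    return x + "_" + y

# Pass whole columns; pandas performs the concatenation element-wise.
df["C"] = f(df["A"], df["B"])
print(df)
```

If f contains anything that only makes sense for a single string (an if on the value, a call to a scalar-only library), passing whole columns like this will not work as-is.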

Upvotes: 0

jezrael

Reputation: 862611

How to apply the function f in a vectorized way to the columns A and B?

This is possible with:

df['new'] = df.apply(lambda x: f(x['A'], x['B']), axis=1)

but it is not vectorized; it loops under the hood.
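
One way to see this is to count how often f is called. A small sketch, with a hypothetical f that simply concatenates its arguments and increments a counter:

```python
import pandas as pd

df = pd.DataFrame({"A": ["str1", "str3", "str5", "str7"],
                   "B": ["str2", "str4", "str6", "str8"]})

calls = 0

def f(x, y):
    # Hypothetical stand-in for the real f; the counter records each call.
    global calls
    calls += 1
    return x + y

df["new"] = df.apply(lambda x: f(x["A"], x["B"]), axis=1)

# f ran at least once per row, i.e. a Python-level loop over the rows.
print(calls, len(df))
```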

A truly vectorized solution requires changing your function to work with arrays rather than scalars, which is not trivial with strings. Another idea is to use Cython or Numba.
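
If f has to stay a scalar function, a list comprehension over zip is a common alternative to apply with axis=1. It is still a Python loop, but it avoids building a Series for every row and is often noticeably faster. A minimal sketch, where f (counting the characters two strings share) is a hypothetical example of a function that is awkward to vectorize:

```python
import pandas as pd

df = pd.DataFrame({"A": ["str1", "str3", "str5", "str7"],
                   "B": ["str2", "str4", "str6", "str8"]})

def f(x, y):
    # Hypothetical scalar function: number of distinct characters
    # that appear in both strings.
    return len(set(x) & set(y))

# Iterate over the two columns in lockstep and call f once per pair.
df["C"] = [f(a, b) for a, b in zip(df["A"], df["B"])]
print(df)
```

Whether this is fast enough depends on the cost of f itself; the loop overhead is usually small compared with a non-trivial function body.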

Upvotes: 2
