Reputation: 456
I have a pandas dataframe with two columns, say A and B.
All the elements of columns A and B are strings.
For example:
A B
0 str1 str2
1 str3 str4
2 str5 str6
3 str7 str8
I also have a function f that takes two strings as input, does some non-trivial work, and returns an output.
For example:
def f(x, y):
    """do something to x and y to make z"""
    return z
I want the output to look like this:
A B C
0 str1 str2 f(str1, str2)
1 str3 str4 f(str3, str4)
2 str5 str6 f(str5, str6)
3 str7 str8 f(str7, str8)
I don't want to use loops because the dataframe is very large.
How to apply the function f in a vectorized way to the columns A and B?
Upvotes: 1
Views: 56
Reputation: 17824
You can pass the columns to a function as arguments, but whether this works depends on the function you have. For example:
df['C'] = np.add(df['A'], df['B'])
Result:
A B C
0 str1 str2 str1str2
1 str3 str4 str3str4
2 str5 str6 str5str6
3 str7 str8 str7str8
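For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same idea. The sample data is taken from the question; everything else is just the `np.add` call above with its imports filled in. It only works because `+` is already defined for strings, so it is not a general answer for an arbitrary f.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["str1", "str3", "str5", "str7"],
    "B": ["str2", "str4", "str6", "str8"],
})

# Pass both columns to the function at once; np.add combines them
# element by element, which for strings means concatenation.
df["C"] = np.add(df["A"], df["B"])

print(df)
```

The same result can be written as `df['A'] + df['B']`.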
Upvotes: 0
Reputation: 862611
How to apply the function f in a vectorized way to the columns A and B?
It is possible with:
df['new'] = df.apply(lambda x: f(x['A'], x['B']), axis=1)
but this is not vectorized; it is a loop under the hood.
A truly vectorized solution requires rewriting your function to work with arrays rather than scalars, which is not trivial with strings. Another idea is to use Cython or Numba.
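To make the difference concrete, here is a rough sketch. The function `f` below is a hypothetical stand-in (uppercase the first string and join with an underscore), since the real f was not shown, and `f_columns` is an illustrative name for a rewrite that operates on whole columns through the pandas `.str` accessor. Whether such a rewrite is possible depends entirely on what your f does.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["str1", "str3", "str5", "str7"],
    "B": ["str2", "str4", "str6", "str8"],
})

# Hypothetical scalar function standing in for the real f.
def f(x, y):
    return x.upper() + "_" + y

# Row-wise apply: f is called once per row, so this is a Python-level loop.
df["C_apply"] = df.apply(lambda row: f(row["A"], row["B"]), axis=1)

# Assumed rewrite of the same logic that takes whole columns (Series)
# instead of scalars, using pandas string methods.
def f_columns(a, b):
    return a.str.upper() + "_" + b

df["C_cols"] = f_columns(df["A"], df["B"])

print(df)
```

Both columns hold the same values; only the second avoids calling a Python function per row.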
Upvotes: 2