Reputation: 87
The following Python code works fine:
import pandas as pd
df = pd.DataFrame(data = {'a': [1, 2, 3], 'b': [4, 5, 6]})
def myfun(a, b):
    return [a + b, a - b]
df[['x', 'y']] = df.apply(
    lambda row: myfun(row.a, row.b), axis=1)
The resulting pandas dataframe looks like:
print(df)
   a  b  x  y
0  1  4  5 -3
1  2  5  7 -3
2  3  6  9 -3
However, if I try to add two more columns,
df[['xx','yy']] = df.apply(lambda row: myfun(row.a, row.b), axis=1)
I get the error message,
KeyError: "['xx' 'yy'] not in index"
How come? And what is the correct way to do this?
Many thanks!
//A
Upvotes: 1
Views: 5054
Reputation: 164613
You can unpack the row results with zip and assign to a tuple of series:
df['xx'], df['yy'] = zip(*df.apply(lambda row: myfun(row.a, row.b), axis=1))
But this is inefficient compared with direct assignment. Don't use pd.DataFrame.apply
unless you absolutely must; with axis=1 it's just a fancy loop over the rows. Vectorised column arithmetic does the same job:
df['xx'] = df['a'] + df['b']
df['yy'] = df['a'] - df['b']
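To make the comparison concrete, here is a minimal sketch that builds both columns with whole-column arithmetic and checks the result against a row-by-row computation. The names vectorised, rows and rowwise are illustrative only and not part of the question; DataFrame.assign is used here simply to add both columns in one call.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Vectorised: arithmetic on whole columns, no Python-level loop over rows.
vectorised = df.assign(xx=df['a'] + df['b'], yy=df['a'] - df['b'])

# Row-by-row equivalent, shown for comparison only.
def myfun(a, b):
    return [a + b, a - b]

rows = [myfun(a, b) for a, b in zip(df['a'], df['b'])]
rowwise = df.join(pd.DataFrame(rows, columns=['xx', 'yy'], index=df.index))

# Both approaches should give the same values.
print(vectorised['xx'].tolist() == rowwise['xx'].tolist())
print(vectorised['yy'].tolist() == rowwise['yy'].tolist())
```

On a three-row frame the difference is negligible; the vectorised form matters as the number of rows grows.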
Upvotes: 0
Reputation: 862431
You need to convert the returned output to a Series:
def myfun(a, b):
    return pd.Series([a + b, a - b])
df[['x', 'y']] = df.apply(
    lambda row: myfun(row.a, row.b), axis=1)
print(df)
   a  b  x  y
0  1  4  5 -3
1  2  5  7 -3
2  3  6  9 -3
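If you would rather keep myfun returning a plain list, a possible alternative (a sketch, assuming pandas 0.23 or later) is to pass result_type='expand' to apply, which turns each list-like result into columns. The names expanded and result are illustrative, and joining the expanded frame back onto df is just one way to attach the new columns.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

def myfun(a, b):
    # Returns a plain list; apply expands it into columns below.
    return [a + b, a - b]

# result_type='expand' turns each list-like result into separate columns (0 and 1).
expanded = df.apply(lambda row: myfun(row.a, row.b), axis=1, result_type='expand')
expanded.columns = ['xx', 'yy']

# Attach the new columns to the original frame by index.
result = df.join(expanded)
print(result)
```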
Upvotes: 2