Reputation: 1353
I'm trying to create a new column computed from two existing columns. When I need to do this with a single column I use .apply(), but I don't know how to do it with two parameters.
With one column I use the following code:
from pandas import read_csv, DataFrame
df = read_csv('results.csv')
def myFunc(x):
    x = x + 5
    return x
df['new'] = df['colA'].apply(myFunc)
df.head()
With two columns I thought it would look like the following, but this doesn't work:
from pandas import read_csv, DataFrame
df = read_csv('results.csv')
def myFunc(x, y):
    x = x + y
    return x
df['new'] = df[['colA','colB']].apply(myFunc)
df.head()
I see some people use lambda, but I don't understand it, and I think there has to be an easier way.
Thank you very much!
Upvotes: 0
Views: 470
Reputation: 221
You can read about lambda here; a lambda function is an expression: https://realpython.com/python-lambda/
The special syntax *args in a Python function definition is used to pass a variable number of arguments to a function: https://www.geeksforgeeks.org/args-kwargs-python/
In a call such as myFunc(*col), the same star unpacks the values of col into separate positional arguments.
from pandas import read_csv, DataFrame
df = read_csv('results.csv')
def myFunc(x, y):
    return x + y
df['new'] = df[['colA','colB']].apply(lambda col: myFunc(*col), axis=1)
df.head()
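To make the unpacking step concrete, here is a minimal sketch that runs without results.csv. The small in-memory DataFrame and its values are made up for illustration; only the column names colA and colB and myFunc come from the question.

```python
import pandas as pd

# Hypothetical stand-in for results.csv (values are invented for the demo)
df = pd.DataFrame({"colA": [1, 2, 3], "colB": [10, 20, 30]})

def myFunc(x, y):
    return x + y

# With axis=1, apply passes each row as a Series;
# *row unpacks its values in column order: (colA, colB)
df["new"] = df[["colA", "colB"]].apply(lambda row: myFunc(*row), axis=1)

print(df["new"].tolist())  # [11, 22, 33]
```

Note that the order of the column list matters here, since the values are passed positionally.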
Upvotes: 1
Reputation: 150785
Disclaimer: avoid apply if possible. With that in mind, you are looking for axis=1, but you need to rewrite the call like this:
df['new'] = df.apply(lambda x: myFunc(x['colA'], x['colB']),
                     axis=1)
which is essentially equivalent to:
df['new'] = [myFunc(x,y) for x,y in zip(df['colA'], df['colB'])]
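As one way to follow the "avoid apply" advice, the function can often be called on the whole columns at once. This is a sketch under the assumption that myFunc only uses operations that work element-wise on a Series (as x + y does); the sample data is invented for the demo.

```python
import pandas as pd

# Hypothetical stand-in for results.csv (values are invented for the demo)
df = pd.DataFrame({"colA": [1, 2, 3], "colB": [10, 20, 30]})

def myFunc(x, y):
    return x + y

# Vectorized: pass the two columns (Series) directly, no apply needed
vectorized = myFunc(df["colA"], df["colB"])

# The two row-by-row forms from this answer, for comparison
row_wise = df.apply(lambda r: myFunc(r["colA"], r["colB"]), axis=1)
zipped = [myFunc(x, y) for x, y in zip(df["colA"], df["colB"])]

df["new"] = vectorized
print(df["new"].tolist())  # [11, 22, 33]
```

If the function contains logic that does not work on whole Series (for example a plain if on a single value), the row-wise or zip versions are still needed.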
Upvotes: 2
Reputation: 20445
You can use axis=1 and access the columns inside the function like below:
def myFunc(x):
    # x is a single row; access its columns by name
    return x['colA'] + x['colB']
and you apply it as
df['new'] = df.apply(myFunc, axis=1)
Upvotes: 1