Reputation: 1353

Use two columns as inputs - Pandas

I'm trying to create a new column calculated from two existing columns. When I need to do this with only one column I use .apply(), but I don't know how to do it with two parameters.

With one column I use the following code:

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x):
  x = x + 5
  return x

df['new'] = df['colA'].apply(myFunc)

df.head()

With two columns I thought it would work like the following, but it does not.

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x,y):
  x = x + y
  return x

df['new'] = df[['colA','colB']].apply(myFunc)

df.head()

I see some people use lambda, but I don't understand it, and I think there has to be an easier way.

Thank you very much!

Upvotes: 0

Answers (3)

Yuvaraja

Reputation: 221

You can learn how to use lambda here (a lambda function is an expression): https://realpython.com/python-lambda/

In a function definition, the special syntax *args collects a variable number of arguments. In a function call, as in the code below, * unpacks a sequence into separate positional arguments.

https://www.geeksforgeeks.org/args-kwargs-python/

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x,y):
  return x + y

df['new'] = df[['colA', 'colB']].apply(lambda col: myFunc(*col), axis=1)

df.head()
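
To see why the unpacking works, here is a minimal sketch that uses a small in-memory DataFrame in place of results.csv (the sample values are made up for illustration). With axis=1, apply passes each row as a Series, and *col spreads its two values into x and y.

```python
import pandas as pd

# Hypothetical sample data standing in for results.csv
df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [10, 20, 30]})

def myFunc(x, y):
    return x + y

# axis=1 hands each row to the lambda as a Series; *col unpacks it
# positionally, so the column order in the selection decides which
# value becomes x and which becomes y
df['new'] = df[['colA', 'colB']].apply(lambda col: myFunc(*col), axis=1)

print(df['new'].tolist())
```

Because the unpacking is positional, selecting the columns in a different order would swap the arguments. That makes no difference for addition, but it would for something like subtraction.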

Upvotes: 1

Quang Hoang

Reputation: 150785

Disclaimer: avoid apply if possible. With that in mind, you are looking for axis=1, but you need to change how the function is called, like:

df['new'] = df.apply(lambda x: myFunc(x['colA'], x['colB']), 
                     axis=1)

which is essentially equivalent to:

df['new'] = [myFunc(x,y) for x,y in zip(df['colA'], df['colB'])]
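
As an illustration of the disclaimer: when the function is plain arithmetic, as myFunc is in the question, pandas can usually operate on whole columns at once, with no apply at all. A minimal sketch, assuming made-up sample data in place of results.csv (the extra column names new_func and new_apply are only for comparison):

```python
import pandas as pd

# Hypothetical sample data standing in for results.csv
df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [10, 20, 30]})

def myFunc(x, y):
    return x + y

# Vectorized: add the two columns element by element
df['new'] = df['colA'] + df['colB']

# myFunc only uses +, so it also accepts whole columns (Series) directly
df['new_func'] = myFunc(df['colA'], df['colB'])

# Row-wise version from above, kept for comparison
df['new_apply'] = df.apply(lambda x: myFunc(x['colA'], x['colB']), axis=1)

print(df)
```

All three columns hold the same values. The vectorized forms are generally faster on large frames, but they only work when the function body is made of operations that pandas supports on whole columns.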

Upvotes: 2

A.B

Reputation: 20445

You can use axis=1 and access the columns inside the function, as below:

def myFunc(x):
    # x is one row of the DataFrame, passed in as a Series
    return x['colA'] + x['colB']

and you apply it as

df['new'] = df.apply(myFunc, axis=1)

Upvotes: 1
