JFerro

Reputation: 3433

Applying a function to several pandas columns and extra integer arguments

Assume a pandas DataFrame such as:

import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5,6],'col2':[11,22,33,44,55,66],'col3':[11,222,333,444,555,666]})

and a function like:

def sumit(a, b, name='me', password='pass'):
    # pseudocode: open database with name 'me' and password 'pass'
    # make queries and get c
    c = 4
    return a + b + c

How can I create an extra column 'colSUM' that holds the sum of col1, col2 and the value c for each row of the df, while also passing name and password to the function as variables?

I cannot figure it out from "Passing a function with multiple arguments to DataFrame.apply"
(which applies vectorized logic) or from https://www.journaldev.com/33478/pandas-dataframe-apply-examples

Be aware that name and password cannot be vectorized; otherwise the connection to the database would not work.

Before resorting to for loops, I decided to ask here.

I thought this would work, but it does not:

df['new'] = df[['col1','col2']].apply(sumit, kwargs=(name='me', password='pass'))
nor does this:
df['new'] = df[['col1','col2']].apply(sumit, name='me', password='pass')
nor this:
df['new'] = df.apply(sumit, df['col1'], df['col2'], name='me', password='pass')

Any ideas? Thanks.

Upvotes: 0

Views: 76

Answers (2)

Henry Ecker

Reputation: 35646

The main issue is that apply passes each entire Series to the function as a single argument.

Instead of sumit(a, b, ...), the signature should be sumit(s, ...), where s is the Series.

Then unpack the series in the function:

a, b = s

or

a = s['col1']
b = s['col2']

Second, this appears to be a row-wise computation, so use axis=1 instead of apply's default axis=0.

import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [11, 22, 33, 44, 55, 66],
    'col3': [11, 222, 333, 444, 555, 666]
})


def sumit(s, name='me', password='pass'):
    a, b = s
    c = 4
    return a + b + c


df['new'] = df[['col1', 'col2']].apply(sumit, name='me', password='pass', axis=1)

print(df)

df:

   col1  col2  col3  new
0     1    11    11   16
1     2    22   222   28
2     3    33   333   40
3     4    44   444   52
4     5    55   555   64
5     6    66   666   76
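To see why axis matters here, the following is a small sketch (not part of the original answer) that records what apply hands to the function on each axis. The helper record and its log keyword are hypothetical names used only for illustration; log is forwarded to the function in the same way name and password are above.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [11, 22, 33]})

seen_axis0 = []  # Series names received with the default axis=0
seen_axis1 = []  # Series names received with axis=1


def record(s, log):
    # `s` is one whole Series; its name tells us whether it is a column or a row.
    log.append(s.name)
    return s.sum()


# Default axis=0: the function receives one Series per column.
df.apply(record, log=seen_axis0)

# axis=1: the function receives one Series per row, indexed by column name.
row_sums = df.apply(record, log=seen_axis1, axis=1)

print(seen_axis0)
print(seen_axis1)
print(row_sums.tolist())
```

With axis=0 the recorded names are the column labels, while with axis=1 they are the row labels, which is why unpacking a, b = s only yields col1 and col2 values in the row-wise case.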

Upvotes: 1

Nk03

Reputation: 14949

TRY:

def sumit(x, name='me', password='pass'):
    c = 4
    return x['col1'] + x['col2'] + c

df['new'] = df[['col1', 'col2']].apply(sumit, name='me', password='pass', axis=1)

OUTPUT:

   col1  col2  col3  new
0     1    11    11   16
1     2    22   222   28
2     3    33   333   40
3     4    44   444   52
4     5    55   555   64
5     6    66   666   76
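If changing the signature of sumit is not desirable, one possible alternative (a sketch, not taken from either answer) is to keep the original two-argument function from the question and wrap the call in a lambda that picks the columns out of each row. The body of sumit is still the placeholder from the question, with c = 4 standing in for the database lookup.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [11, 22, 33, 44, 55, 66],
    'col3': [11, 222, 333, 444, 555, 666]
})


def sumit(a, b, name='me', password='pass'):
    # Placeholder for the database query in the question; c is hard-coded.
    c = 4
    return a + b + c


# Each row arrives as a Series; the lambda extracts the scalars and
# forwards name and password unchanged as ordinary keyword arguments.
df['new'] = df.apply(
    lambda row: sumit(row['col1'], row['col2'], name='me', password='pass'),
    axis=1,
)

print(df)
```

Because name and password are passed as plain keyword arguments on every call, they are never treated as column data, which matches the requirement in the question that they not be vectorized.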

Upvotes: 1
