Reputation: 3433
Assume a pandas DataFrame such as:
import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5,6],'col2':[11,22,33,44,55,66],'col3':[11,222,333,444,555,666]})
and a function like:
def sumit(a, b, name='me', password='pass'):
    # pseudocode: open database with name 'me' and password 'pass'
    # make queries and get c
    c = 4
    return a + b + c
How can I create an extra column 'colSUM' that holds, for each row, the sum of col1, col2 and the value c, while passing the DataFrame columns as well as name and password to the function as arguments?
I cannot figure it out from:
Passing a function with multiple arguments to DataFrame.apply
(applying vectorized logic)
or:
https://www.journaldev.com/33478/pandas-dataframe-apply-examples
Be aware that name and password cannot be vectorized; otherwise the connection with the database would not work.
Before resorting to for loops, I decided to ask here.
I thought this would work, but it does not:
df['new'] = df[['col1','col2']].apply(sumit, kwargs=(name='me', password='pass'))
nor this:
df['new'] = df[['col1','col2']].apply(sumit, name='me', password='pass')
nor:
df['new'] = df.apply(sumit, df['col1'], df['col2'], name='me', password='pass')
Any ideas? Thanks.
Upvotes: 0
Views: 76
Reputation: 35646
The main issue is that apply passes the entire Series to the function as one argument.
Instead of sumit(a, b, ...) the signature should be sumit(s, ...), where s is the Series.
Then unpack the series in the function:
a, b = s
or
a = s['col1']
b = s['col2']
Second, it appears to be a row computation, so use axis=1 instead of apply's default axis=0.
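To make the axis difference concrete, here is a minimal sketch on a smaller frame than the one in the question. The helper name total is only illustrative and not part of the original code; it simply sums whatever Series apply hands to it.

```python
import pandas as pd

# Smaller illustrative frame (not the data from the question).
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [11, 22, 33]})

def total(s):
    # s is a whole Series: one column for axis=0, one row for axis=1.
    return s.sum()

# Default axis=0: the function is called once per column.
per_column = df.apply(total, axis=0)

# axis=1: the function is called once per row.
per_row = df.apply(total, axis=1)

print(per_column.tolist())
print(per_row.tolist())
```

With axis=0 the result has one entry per column, and with axis=1 one entry per row, which is what a new column needs.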
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [11, 22, 33, 44, 55, 66],
    'col3': [11, 222, 333, 444, 555, 666]
})

def sumit(s, name='me', password='pass'):
    a, b = s
    c = 4
    return a + b + c

df['new'] = df[['col1', 'col2']].apply(sumit, name='me', password='pass', axis=1)
print(df)
df:
   col1  col2  col3  new
0     1    11    11   16
1     2    22   222   28
2     3    33   333   40
3     4    44   444   52
4     5    55   555   64
5     6    66   666   76
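If changing the signature of sumit is not desirable, one possible alternative is to wrap the call in a lambda so the original sumit(a, b, ...) stays as it is. This is only a sketch: the body of sumit below is still the placeholder from the question (c = 4), not a real database lookup.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [11, 22, 33, 44, 55, 66],
    'col3': [11, 222, 333, 444, 555, 666],
})

def sumit(a, b, name='me', password='pass'):
    # Placeholder for the database query described in the question.
    c = 4
    return a + b + c

# The lambda receives each row as a Series and unpacks it for sumit.
df['new'] = df.apply(
    lambda row: sumit(row['col1'], row['col2'], name='me', password='pass'),
    axis=1,
)
print(df)
```

The lambda still runs once per row, so name and password are passed as plain scalars on every call rather than being vectorized.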
Upvotes: 1
Reputation: 14949
Try:
def sumit(x, name='me', password='pass'):
    c = 4
    return x['col1'] + x['col2'] + c
df['new'] = df[['col1','col2']].apply(sumit, name='me', password='pass', axis=1)
Output:
   col1  col2  col3  new
0     1    11    11   16
1     2    22   222   28
2     3    33   333   40
3     4    44   444   52
4     5    55   555   64
5     6    66   666   76
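As a side note on the kwargs= attempt in the question: DataFrame.apply forwards extra keyword arguments directly, and extra positional arguments through its args tuple. A small sketch of the positional form follows; the seen_credentials set is a hypothetical helper added here only to show what the function receives, and c = 4 is still the placeholder value.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [11, 22, 33, 44, 55, 66],
})

# Hypothetical helper: records the credentials each call receives.
seen_credentials = set()

def sumit(x, name, password):
    seen_credentials.add((name, password))
    c = 4  # placeholder for the value fetched from the database
    return x['col1'] + x['col2'] + c

# Values in args are passed to sumit after the row Series.
df['new'] = df.apply(sumit, axis=1, args=('me', 'pass'))
print(df)
print(seen_credentials)
```

Every row call gets the same scalar name and password, so nothing about the credentials is vectorized.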
Upvotes: 1