June

Reputation: 345

Pass Dataframe Row Values Iteratively (loop) as Arguments in a Function in Python

I have a function that requires three arguments:

def R0(confirm, suspect,t):
    p = 0.695
    si = 7.5
    yt = suspect * p + confirm
    lamda = math.log(yt)/t
    R0 = 1 + lamda * si + p * (1 - p) * pow(lamda * si,2)   
    return R0

And a dataframe with three columns:

data = {'confirm':  ['41', '41', '43', '44'],
        'suspect': ['0', '0', '0', '10'],
        't': ['0', '1', '2', '3']
        }

df = pd.DataFrame (data, columns = ['confirm','suspect', 't'])

I would like to use each row (with three columns, and hence three values) as the argument values for the function. Finally, I would like to loop over rows of the dataframe and return a list.

For instance, the results should look like:

result = [R0_Value1, R0_Value2, R0_Value3, ....] where
R0_Value1 = R0(41, 0, 0)
R0_Value2 = R0(41, 0, 1)
R0_Value3 = R0(43, 0, 2)
...

I figured it probably has something to do with pandas.DataFrame.apply and the * operator, but I am new to Python and could not work out how to do it. Could someone please help?

Upvotes: 2

Views: 5925

Answers (4)

Georgina Skibinski

Reputation: 13387

You can do:

df["formula"]=df.apply(lambda x: R0(*x), axis=1)

The whole thing (a couple of other things also needed polishing):

import pandas as pd
import math

def R0(confirm, suspect,t):
    p = 0.695
    si = 7.5
    yt = suspect * p + confirm
    lamda = math.log(yt) / max(t, 1)  # guard against division by zero when t == 0
    R = 1 + lamda * si + p * (1 - p) * math.pow(lamda * si, 2)
    return R

data = {'confirm':  ['41', '41', '43', '44'],
        'suspect': ['0', '0', '0', '10'],
        't': ['0', '1', '2', '3']
        }

# Convert to int: the arithmetic in R0 needs numeric values, not strings
df = pd.DataFrame(data, columns=['confirm', 'suspect', 't']).astype(int)

df["formula"]=df.apply(lambda x: R0(*x), axis=1)

Outputs:

   confirm  suspect  t     formula
0       41        0  0  193.285511
1       41        0  1  193.285511
2       43        0  2   57.274157
3       44       10  3   31.297989
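Since the question asks for a plain list rather than a new column, here is a minimal sketch of one way to get it. It reuses the R0 from this answer, including the max(t, 1) guard (which is this answer's choice for t = 0, not part of the original formula), and walks the rows with itertuples instead of apply:

```python
import math
import pandas as pd

def R0(confirm, suspect, t):
    p = 0.695
    si = 7.5
    yt = suspect * p + confirm
    lamda = math.log(yt) / max(t, 1)  # same guard against t == 0 as above
    return 1 + lamda * si + p * (1 - p) * math.pow(lamda * si, 2)

df = pd.DataFrame({'confirm': ['41', '41', '43', '44'],
                   'suspect': ['0', '0', '0', '10'],
                   't': ['0', '1', '2', '3']}).astype(int)

# index=False, name=None yields plain (confirm, suspect, t) tuples,
# so * unpacks each row into the three positional arguments of R0
result = [R0(*row) for row in df.itertuples(index=False, name=None)]
print(result)
```

This relies on the column order matching the argument order of R0. If a Series is already available, df.apply(...).tolist() should give the same list.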

Upvotes: 2

Henry Yik

Reputation: 22493

If you insist on using pandas, you can also do the calculation directly with numpy, without a function:

import numpy as np

df = pd.DataFrame(data, columns=['confirm', 'suspect', 't']).astype(int)

p = 0.695
si = 7.5

df['results'] = 1 +(np.log(df["suspect"]*p + df["confirm"])/df["t"])*si \
                  + p*(1-p)*np.power((np.log(df["suspect"]*p + df["confirm"])/df["t"])*si,2)
print (df)

Output:

   confirm  suspect  t     results
0       41        0  0         inf
1       41        0  1  193.285511
2       43        0  2   57.274157
3       44       10  3   31.297989
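If the inf in the first row is unwanted, one option is to clamp t to at least 1 before dividing, mirroring the max(t, 1) guard used in another answer here. That is an assumption about how t = 0 should be treated, not something the formula dictates. A rough sketch, with `growth` as a made-up helper name for lamda * si:

```python
import numpy as np
import pandas as pd

data = {'confirm': ['41', '41', '43', '44'],
        'suspect': ['0', '0', '0', '10'],
        't': ['0', '1', '2', '3']}
df = pd.DataFrame(data, columns=['confirm', 'suspect', 't']).astype(int)

p = 0.695
si = 7.5

# lamda * si, computed once; clip(lower=1) treats t == 0 as t == 1 (an assumption)
growth = np.log(df["suspect"] * p + df["confirm"]) / df["t"].clip(lower=1) * si
df["results"] = 1 + growth + p * (1 - p) * growth ** 2

results = df["results"].tolist()
print(df)
```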

Upvotes: 1

Araldo van de Kraats

Reputation: 304

You were looking in the right direction with 'apply':

# Convert the values to int (they are strings now, which would raise an error in R0)
df = df.astype(int)

df['results'] = df.apply(lambda x: R0(x.confirm, x.suspect, x.t), axis=1)

With axis=1, apply passes each whole row (as a Series) as the single argument to the function you give it. The lambda is a wrapper that unpacks that one argument (x) into the three values and passes them, in the correct order, to R0.
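To see this for yourself, here is a small sketch that records what apply hands to the function. The name inspect_row and the two-row frame are made up for illustration only:

```python
import pandas as pd

df = pd.DataFrame({'confirm': [41, 41], 'suspect': [0, 0], 't': [0, 1]})

seen_types = []
seen_labels = []

def inspect_row(row):
    # With axis=1, row is a Series whose index holds the column names
    seen_types.append(type(row).__name__)
    seen_labels.append(list(row.index))
    return row.confirm + row.suspect + row.t

totals = df.apply(inspect_row, axis=1)
print(seen_types, seen_labels[0], totals.tolist())
```

Because the row arrives as a single Series, a function that expects three separate arguments needs a wrapper like the lambda above.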

Upvotes: 3

June

Reputation: 345

df.apply(lambda x: R0(x.iloc[0], x.iloc[1], x.iloc[2]), axis=1) will give the right result once the columns have been converted to numbers.

Upvotes: 0
