xxyyzz

Reputation: 21

apply with lambda and apply without lambda

I am trying to use the impliedVolatility function in df_spx.apply() while hardcoding the variable inputs S, K, r, price, T, payoff, and c_or_p.

However, this does not work. With the same impliedVolatility function, it only works when I combine lambda with apply.

[code link][1]

# first version of code 

S = SPX_spot
K = df_spx['strike_price']
r = df_spx['r']
price = df_spx['mid_price']
T = df_spx['T_years']
payoff = df_spx['cp_flag']
c_or_p = df_spx["cp_flag"]

df_spx["iv"] = df_spx.apply(impliedVolatility(c_or_p, S, K, T, r,price),axis=1)


# second version of code 

df_spx["impliedvol"] = df_spx.apply(
    lambda r: impliedVolatility(r["cp_flag"],
                                S,
                                r["strike_price"],
                                r['T_years'],    
                                r["r"],
                                r["mid_price"]), 
    axis = 1)
[1]: https://i.sstatic.net/yBfO5.png

Upvotes: 1

Views: 103

Answers (2)

alparslan mimaroğlu

Reputation: 1490

You have to give apply a callable, i.e. a function that apply can call itself. In your first example

df_spx.apply(impliedVolatility(c_or_p, S, K, T, r,price), axis=1)

you are calling impliedVolatility yourself and passing its result to apply, which does not work. If you instead wrote

df_spx.apply(impliedVolatility, c_or_p=c_or_p, S=S, K=K, T=T, r=r, price=price, axis=1)

(provided the function's keyword arguments have those names), or if you wrote

df_spx.apply(impliedVolatility, args=(c_or_p, S, K, T, r,price), axis=1)

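then it might work. Notice that we are not calling impliedVolatility inside apply; we are passing the function itself as an argument. Keep in mind that apply still passes each row as the first positional argument, so the function has to accept it.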
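
As a rough sketch of that last point: apply hands each row to the callable as its first positional argument and then appends whatever is in args. The names fake_implied_vol and row_implied_vol, the toy formula, and the sample data are all made up for illustration, since the real impliedVolatility is not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for impliedVolatility (the real one is not shown).
# The formula is a toy one, chosen only so the result is easy to check.
def fake_implied_vol(c_or_p, S, K, T, r, price):
    sign = 1.0 if c_or_p == "C" else -1.0
    return sign * (S - K) + price + r * T

# Row-aware wrapper: apply passes the row first, then the items of args.
def row_implied_vol(row, S):
    return fake_implied_vol(row["cp_flag"], S, row["strike_price"],
                            row["T_years"], row["r"], row["mid_price"])

# Made-up sample data using the column names from the question.
df_spx = pd.DataFrame({
    "cp_flag": ["C", "P"],
    "strike_price": [100.0, 110.0],
    "T_years": [1.0, 2.0],
    "r": [0.0, 0.0],
    "mid_price": [5.0, 7.0],
})
SPX_spot = 105.0

# Pass the function itself (no parentheses); the scalar S goes in via args.
iv = df_spx.apply(row_implied_vol, axis=1, args=(SPX_spot,))
print(iv)
```

The wrapper does the same job as the lambda in the question; it just has a name.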

Upvotes: 3

deponovo

Reputation: 1432

There is already a pretty good answer, but here is a different perspective. apply loops over your data and calls the function you provide on each piece.

Say you have:

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": list("asd")})
df
Out: 
   a  b
0  1  a
1  2  s
2  3  d

If you want to create new data or do some work on one of the columns (you could also work on entire rows, which is your use case, but let's keep it simple for now), you might consider using apply. Say you just want to multiply every input by two:

def multiply_by_two(val):
    return val * 2

df.b.apply(multiply_by_two)  # case 1
Out: 
0    aa
1    ss
2    dd

df.a.apply(multiply_by_two)  # case 2
Out: 
0    2
1    4
2    6

The first example turns each one-letter string into a string of two identical letters, while the second simply doubles each number. You should avoid apply in the second case: it is a simple mathematical operation, and apply will be much slower than the vectorized df.a * 2. Hence my rule of thumb: use apply when working with non-numeric objects (case 1). NOTE: there is no need for a lambda in this simple case.

So apply passes each element of the series to the function. If you apply on an entire dataframe, each value passed is a slice of the data as a series (with axis=1, one row at a time). Hence, to apply your function properly you need to map the row's fields to the function's inputs. For instance:

def add_2_to_a_multiply_b(b, a):
    return (a + 2) * b

# ERROR: *row unpacks in column order (a, b), so parameter b gets the integer
# and parameter a gets the string, and adding an int to a str raises TypeError
df.apply(lambda row: add_2_to_a_multiply_b(*row), axis=1)

df.apply(lambda row: add_2_to_a_multiply_b(row['b'], row['a']), axis=1)
Out: 
0      aaa
1     ssss
2    ddddd
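
To see what apply actually passes in with axis=1, here is a small sketch that records the type, index and name of each row. The helper inspect_row and the seen list are names made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": list("asd")})

seen = []  # collects what apply passes to the function

def inspect_row(row):
    # With axis=1, row is a Series indexed by the column names,
    # and row.name is the row's index label.
    seen.append((type(row).__name__, list(row.index), row.name))
    return row["a"]

result = df.apply(inspect_row, axis=1)
print(seen[0])
print(result.tolist())
```

Because each row arrives as a Series, looking fields up by column name (row['a'], row['b']) is safer than relying on positional unpacking.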

From this point on you can build more complex implementations, for instance using partial functions:

def add_to_a_multiply_b(b, a, *, val_to_add):
    return (a + val_to_add) * b

from functools import partial
specialized_func = partial(add_to_a_multiply_b, val_to_add=2)
df.apply(lambda row: specialized_func(row['b'], row['a']), axis=1)

Just to stress it again, avoid apply if you care about performance:

# 'OK-ISH', does the job... but
def strike_price_minus_mid_price(strike_price, mid_price):
    return strike_price - mid_price

new_data = df.apply(lambda r: strike_price_minus_mid_price(r["strike_price"], r["mid_price"] ), axis=1)

vs

# 'BETTER'
new_data = df["strike_price"] - df["mid_price"] 
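
As a quick sanity check, the two approaches should give the same result. This sketch uses made-up numbers for strike_price and mid_price and only compares the outputs; it does not measure speed.

```python
import pandas as pd

# Made-up sample values, just to compare the two approaches.
df = pd.DataFrame({"strike_price": [100.0, 110.0, 120.0],
                   "mid_price": [5.0, 7.5, 9.0]})

# Row-by-row version: calls a Python function once per row.
via_apply = df.apply(lambda r: r["strike_price"] - r["mid_price"], axis=1)

# Vectorized version: one operation on whole columns.
vectorized = df["strike_price"] - df["mid_price"]

print(via_apply.equals(vectorized))
```

On a large dataframe you could time both versions (for example with timeit) to see the gap for yourself.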

Upvotes: 1
