Referencing a value in another column from a lambda on a pandas DataFrame

Question

How do I correctly reference another column's value when using a lambda on a pandas DataFrame?

dfresult_tmp2['Retention_Rolling_temp'] = dfresult_tmp2['Retention_tmp'].apply(lambda x: x if x['Count Billings']/4 < 0.20 else '')

The code above gives me this error:

TypeError: 'float' object is not subscriptable

piRSquared · Accepted Answer

dfresult_tmp2['Retention_tmp'].apply(
    lambda x: x if x['Count Billings'] / 4 < 0.20 else ''
)

You are using pd.Series.apply, which is different from pd.DataFrame.apply. Here you are selecting a single column, so the lambda is called once per element and x is a scalar (a float), not a row. That is why some_scalar_x['Count Billings'] makes no sense and raises the TypeError.
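
To make the difference concrete, here is a minimal sketch on a made-up three-row frame. The data is hypothetical; only the column names come from the question. It records what type of object the function receives in each case.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "Retention_tmp": [0.9, 0.8, 0.7],
    "Count Billings": [0.4, 1.0, 0.6],
})

# Series.apply calls the function once per element, so x is a scalar float.
series_arg_types = df["Retention_tmp"].apply(lambda x: type(x).__name__).tolist()

# DataFrame.apply with axis=1 calls the function once per row, and each row
# arrives as a Series, so other columns can be looked up by label.
row_arg_types = df.apply(lambda row: type(row).__name__, axis=1).tolist()

print(series_arg_types)
print(row_arg_types)
```

With a scalar in hand there is no 'Count Billings' key to look up, which matches the error in the question.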

Instead of telling you how to shoehorn your logic into an apply, I'll show you the vectorized versions first.

Option 1
pd.Series.where

dfresult_tmp2['Retention_tmp'] = \
    dfresult_tmp2['Retention_tmp'].where(
        dfresult_tmp2['Count Billings'] / 4 < .2, '')

Option 2
np.where

import numpy as np

# Work on the underlying NumPy arrays of the two columns
r = dfresult_tmp2['Retention_tmp'].values
b = dfresult_tmp2['Count Billings'].values
dfresult_tmp2['Retention_tmp'] = np.where(b / 4 < .2, r, '')
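
One thing to watch with this option: np.where returns a single array with one dtype, so mixing a float array with '' may, as far as I know, leave the kept numbers stored as strings. A minimal sketch of one way around that, assuming you want the numbers to stay numeric, is to cast the values to object first. The frame below is hypothetical toy data, not the asker's.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "Retention_tmp": [0.9, 0.8, 0.7],
    "Count Billings": [0.4, 1.0, 0.6],
})

# Casting to object keeps each kept value as a Python float,
# so the result can hold floats and '' side by side.
r = df["Retention_tmp"].to_numpy().astype(object)
b = df["Count Billings"].to_numpy()

result = np.where(b / 4 < 0.2, r, "")
df["Retention_tmp"] = result

print(df["Retention_tmp"].tolist())
```

The trade-off is that an object column loses the speed of a float column, so using np.nan instead of '' as the placeholder may be preferable if that fits your use case.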

Option 3
apply
What you asked for but not what I'd recommend.

dfresult_tmp2['Retention_tmp'] = dfresult_tmp2.apply(
    lambda x: x['Retention_tmp'] if x['Count Billings'] / 4 < .2 else '',
    axis=1
)
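
As a quick sanity check, the sketch below runs the pd.Series.where version and the row-wise apply version on the same made-up frame and compares the results. The data and the variable names via_where and via_apply are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "Retention_tmp": [0.9, 0.8, 0.7],
    "Count Billings": [0.4, 1.0, 0.6],
})

# Rows where a quarter of 'Count Billings' is below 0.20.
keep = df["Count Billings"] / 4 < 0.2

# Vectorized: keep the value where the condition holds, otherwise use ''.
via_where = df["Retention_tmp"].where(keep, "")

# Row-wise: the same rule, evaluated one row at a time.
via_apply = df.apply(
    lambda x: x["Retention_tmp"] if x["Count Billings"] / 4 < 0.2 else "",
    axis=1,
)

print(via_where.tolist())
print(via_apply.tolist())
```

Both approaches should blank out the middle row only; the vectorized form avoids a Python-level function call per row.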
