Mapping a function to a dataframe

Question

I am trying to apply a function to a dataframe in pandas. The function takes two positional arguments, and I want to pass two columns to it. Below is the code I tried:

import pandas as pd

df_a=pd.read_csv('5_a.csv')
def y_pred(x):
    if x<.5:
        return 0
    else:
        return 1
df_a['y_pred']=df_a['proba'].map(y_pred)
def confusion_matrix(act,pred):
    if act==1 and act==pred:
        return 'TP'
    elif act==0 and act==pred:
        return 'TN'
    elif act==0 and pred==1:
        return 'FN'
    elif act==1 and pred==0:
        return 'FP'
df_a['con_mat_label']=df_a[['y','y_pred']].apply(confusion_matrix)

But the function does not treat y_pred as the second column and map it to the pred argument of the function. I am getting this error: TypeError: ("confusion_matrix() missing 1 required positional argument: 'pred'", 'occurred at index y')

abhilb · Accepted Answer

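The function you pass to apply receives a pandas Series, not individual values. The axis argument controls whether that Series is a column (axis=0, the default) or a row (axis=1). With the default, confusion_matrix was called with the whole y column as its only argument, which is why pred was reported missing "at index y".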
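To make the difference concrete, here is a minimal sketch of what apply hands to the function in each case. The two-row frame is made up for illustration; only the column names y and y_pred come from the question.

```python
import pandas as pd

# Hypothetical stand-in for df_a; the values are invented.
df_a = pd.DataFrame({'y': [1, 0], 'y_pred': [1, 1]})

# Default axis=0: the function is called once per column and
# receives that entire column as a Series.
per_column = df_a.apply(lambda col: col.sum())

# axis=1: the function is called once per row and receives the row
# as a Series whose index holds the column names.
per_row = df_a.apply(lambda row: row['y'] + row['y_pred'], axis=1)
```

Here per_column has one entry per column (indexed by 'y' and 'y_pred'), while per_row has one entry per row of the frame.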

So you need to modify your confusion_matrix function as follows (I am assuming that act corresponds to the column y here):

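def confusion_matrix(row):
    # row is a Series holding one row; columns are available as attributes
    if row.y==1 and row.y==row.y_pred:
        return 'TP'
    elif row.y==0 and row.y==row.y_pred:
        return 'TN'
    # actual 0, predicted 1 is a false positive (these two labels were swapped in the question)
    elif row.y==0 and row.y_pred==1:
        return 'FP'
    # actual 1, predicted 0 is a false negative
    elif row.y==1 and row.y_pred==0:
        return 'FN'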

And you need to modify your apply call to

df_a['con_mat_label']=df_a[['y','y_pred']].apply(confusion_matrix, axis=1)
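If you would rather keep the original two-argument signature, one possible approach is to unpack the row inside a lambda, or to zip the two columns together. This is only a sketch: the sample values are invented, and the labels follow the usual definitions (predicted 1 but actual 0 is a false positive).

```python
import pandas as pd

def confusion_matrix(act, pred):
    # Two positional arguments, as in the question.
    if act == 1 and pred == 1:
        return 'TP'
    elif act == 0 and pred == 0:
        return 'TN'
    elif act == 0 and pred == 1:
        return 'FP'  # predicted positive, actually negative
    elif act == 1 and pred == 0:
        return 'FN'  # predicted negative, actually positive

# Made-up sample data; only the column names come from the question.
df_a = pd.DataFrame({'y': [1, 0, 0, 1], 'y_pred': [1, 0, 1, 0]})

# Option 1: let apply pass each row, then unpack it in a lambda.
df_a['con_mat_label'] = df_a.apply(
    lambda row: confusion_matrix(row['y'], row['y_pred']), axis=1
)

# Option 2: walk both columns in parallel, with no apply at all.
labels = [confusion_matrix(a, p) for a, p in zip(df_a['y'], df_a['y_pred'])]
```

Both options produce the same labels; the zip version avoids building a Series for every row, which tends to matter on larger frames.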

Now let me give you some tips on how you could improve your code.

Say you have a data frame like this:

>>> df
   X  Y
0  1  4
1  2  5
2  3  6
3  4  7

To add a Y_pred column, compare the whole column at once instead of mapping a function over it:

>>> df['Y_pred'] = (df.X < 3).astype(int)
>>> df
   X  Y  Y_pred
0  1  4       1
1  2  5       1
2  3  6       0
3  4  7       0
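The same vectorized idea can be carried over to the confusion-matrix labels. The following is a sketch rather than the only way to do it: it assumes numpy is installed, uses np.select to pick a label per row, and runs on invented sample values with the question's column names.

```python
import numpy as np
import pandas as pd

# Invented sample data using the question's column names.
df_a = pd.DataFrame({'y': [1, 0, 0, 1], 'proba': [0.9, 0.2, 0.7, 0.4]})

# Threshold the probabilities in one step (below 0.5 -> 0, otherwise 1).
df_a['y_pred'] = (df_a['proba'] >= 0.5).astype(int)

# One boolean mask per outcome; np.select takes the first mask that matches.
conditions = [
    (df_a['y'] == 1) & (df_a['y_pred'] == 1),
    (df_a['y'] == 0) & (df_a['y_pred'] == 0),
    (df_a['y'] == 0) & (df_a['y_pred'] == 1),
    (df_a['y'] == 1) & (df_a['y_pred'] == 0),
]
choices = ['TP', 'TN', 'FP', 'FN']

# Rows matching none of the masks (e.g. missing values) get 'unknown'.
df_a['con_mat_label'] = np.select(conditions, choices, default='unknown')
```

From there, df_a['con_mat_label'].value_counts() gives the count of each outcome.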

Oh, by the way, I would like to refer you to this interesting blog post.

