Reputation: 154
I'm trying to apply this function to fill the Age column based on the Pclass and Sex columns, but I'm unable to do so. How can I make it work?
def fill_age():
    Age = train['Age']
    Pclass = train['Pclass']
    Sex = train['Sex']
    if pd.isnull(Age):
        if Pclass == 1:
            return 34.61
        elif (Pclass == 1) and (Sex == 'male'):
            return 41.2813
        elif (Pclass == 2) and (Sex == 'female'):
            return 28.72
        elif (Pclass == 2) and (Sex == 'male'):
            return 30.74
        elif (Pclass == 3) and (Sex == 'female'):
            return 21.75
        elif (Pclass == 3) and (Sex == 'male'):
            return 26.51
        else:
            pass
    else:
        return Age

train['Age'] = train['Age'].apply(fill_age(),axis=1)
I'm getting the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 0
Views: 580
Reputation: 18367
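The error comes from the if statements: fill_age() takes no argument and compares entire columns of train (Series), and Python cannot reduce a Series to a single True or False. Keep the parentheses around each comparison (which you already have) and replace the boolean operator and with the bitwise operator & to avoid this type of error. Also, keep in mind that to use apply the function needs a parameter x, which receives one row at a time when the function is called through a lambda in apply: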
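def fill_age(x):
    # x is one row of the DataFrame (a Series indexed by column name)
    Age = x['Age']
    Pclass = x['Pclass']
    Sex = x['Sex']
    if pd.isnull(Age):
        # The first branch needs a Sex check too; without it the
        # (Pclass == 1, male) branch below is never reached. 'female' is
        # assumed here, following the female/male order of the other branches.
        if (Pclass == 1) & (Sex == 'female'):
            return 34.61
        elif (Pclass == 1) & (Sex == 'male'):
            return 41.2813
        elif (Pclass == 2) & (Sex == 'female'):
            return 28.72
        elif (Pclass == 2) & (Sex == 'male'):
            return 30.74
        elif (Pclass == 3) & (Sex == 'female'):
            return 21.75
        elif (Pclass == 3) & (Sex == 'male'):
            return 26.51
        else:
            pass
    else:
        return Age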
Now call apply on the whole DataFrame (not on the Age column alone) with the lambda, so that each row is passed to the function:
train['Age'] = train.apply(lambda x: fill_age(x), axis=1)
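A minimal sketch of what each form of apply hands to the function, using a hypothetical two-row frame (the data is an assumption, not the original train set). Calling apply on a single column passes one cell value at a time, while calling it on the DataFrame with axis=1 passes whole rows, which is what fill_age(x) needs in order to read x['Pclass'] and x['Sex'].

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
train = pd.DataFrame({'Age': [np.nan, 30.0],
                      'Pclass': [1, 3],
                      'Sex': ['female', 'male']})

# Series.apply passes one cell value at a time, never a row.
cell_is_series = train['Age'].apply(lambda x: isinstance(x, pd.Series)).tolist()

# DataFrame.apply with axis=1 passes each row as a Series indexed by column name.
row_is_series = train.apply(lambda x: isinstance(x, pd.Series), axis=1).tolist()

# Because x is a row, the other columns can be read by name.
pclass_per_row = train.apply(lambda x: x['Pclass'], axis=1).tolist()

print(cell_is_series)   # [False, False]
print(row_is_series)    # [True, True]
print(pclass_per_row)   # [1, 3]
```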
In a sample dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age':[1,np.nan,3,np.nan,5,6],
                   'Pclass':[1,2,3,3,2,1],
                   'Sex':['male','female','male','female','male','female']})
Applying the function above:
df['Age'] = df.apply(lambda x: fill_age(x),axis=1)
Output:
     Age  Pclass     Sex
0   1.00       1    male
1  28.72       2  female
2   3.00       3    male
3  21.75       3  female
4   5.00       2    male
5   6.00       1  female
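For reference, a short sketch of where the ValueError comes from and how and differs from &. The three-row frame is hypothetical and only serves the illustration. Testing a whole Series in an if statement raises the error, & combines Series conditions element by element, and inside a row-wise apply each value is a scalar, so plain and would also work there.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, used only for this illustration.
df = pd.DataFrame({'Age': [22.0, np.nan, np.nan],
                   'Pclass': [1, 2, 3],
                   'Sex': ['male', 'female', 'male']})

# Testing a whole Series in `if` raises the error from the question.
try:
    if df['Pclass'] == 1:
        pass
    raised = False
except ValueError:
    raised = True

# On Series, `&` combines the two conditions element by element.
mask = (df['Pclass'] == 2) & (df['Sex'] == 'female')

# Inside a row-wise apply every value is a scalar, so `and` is also valid.
flags = df.apply(lambda row: row['Pclass'] == 2 and row['Sex'] == 'female',
                 axis=1)

print(raised)          # True
print(mask.tolist())   # [False, True, False]
print(flags.tolist())  # [False, True, False]
```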
Upvotes: 1