Reputation: 39
import pandas as pd
from pandas import DataFrame,Series
import numpy as np
titanic=pd.read_csv('C:/Users/prasun.j/Downloads/train.csv')
sex=[]
if titanic['Sex']=='male':
    sex.append(1)
else:
    sex.append(0)
sex
I'm trying to build a list that gets 1 appended when the if statement encounters male and 0 when it encounters female. I don't know what I'm doing wrong. Can someone help out? Thanks in advance. Execution throws the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-265768ba34be> in <module>()
4 titanic=pd.read_csv('C:/Users/prasun.j/Downloads/train.csv')
5 sex=[]
----> 6 if titanic['Sex']=='male':
7 sex.append(1)
8 else:
C:\anaconda\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
1119 raise ValueError("The truth value of a {0} is ambiguous. "
1120 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1121 .format(self.__class__.__name__))
1122
1123 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 1
Views: 78
Reputation: 12417
You could also use get_dummies, dropping the first column (in this case, female):
df = pd.DataFrame({'sex': ['male', 'female', 'male', 'male', 'female','male'], 'age':[10,20,30,40,50,60]})
Use pd.get_dummies to obtain your values:
sex = pd.get_dummies(df['sex'],drop_first=True)
sex
male
0 1
1 0
2 1
3 1
4 0
5 1
And then convert to a list:
list_sex = sex['male'].tolist()
list_sex
[1, 0, 1, 1, 0, 1]
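One caveat worth a quick sketch: the 1/0 output above matches older pandas releases, while pandas 2.0 and later return boolean dummy columns by default. Assuming a pandas version where get_dummies accepts the dtype argument, passing dtype=int keeps the integer output. The frame below is the same sample data as in this answer.

```python
import pandas as pd

df = pd.DataFrame({
    'sex': ['male', 'female', 'male', 'male', 'female', 'male'],
    'age': [10, 20, 30, 40, 50, 60],
})

# drop_first=True drops the alphabetically first category ('female'),
# leaving a single 'male' indicator column.
# dtype=int requests integer dummies instead of the boolean default
# used by recent pandas versions.
sex = pd.get_dummies(df['sex'], drop_first=True, dtype=int)

list_sex = sex['male'].tolist()
print(list_sex)
```

Without dtype=int, recent pandas versions would give a list of True/False values here instead of 1/0.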
Upvotes: 0
Reputation: 51155
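When you check if titanic['Sex']=='male', you are comparing 'male' to the entire Series. The comparison is done element by element and returns a boolean Series, and an if statement cannot reduce a whole Series to a single True or False, which is why you get your ValueError.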
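A small sketch of what happens, using a made-up five-row frame rather than the Titanic file (the column name and values are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({'sex': ['male', 'female', 'male', 'male', 'female']})

# The comparison is element-wise: the result is a boolean Series,
# one value per row, not a single True/False.
mask = df['sex'] == 'male'
print(mask.tolist())

# Using that Series as an `if` condition asks pandas for one truth
# value for the whole Series, which it refuses to guess.
error_message = None
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The methods named in the error message (any, all and so on) reduce the Series to one value, but that is not what the question needs, since the goal is one result per row.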
If you really wanted to continue with an iterative approach, you could use iterrows, and check your condition for each row. However, you should avoid iteration with Pandas, and here there is a much cleaner solution.
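For completeness, here is roughly what the iterative version could look like. This is only a sketch on a small made-up frame (the column name sex is an assumption; the Titanic file uses Sex), and it is slower than the vectorized options on large data:

```python
import pandas as pd

df = pd.DataFrame({'sex': ['male', 'female', 'male', 'male', 'female']})

# Row-by-row loop: iterrows yields (index, row) pairs, so the
# condition is now checked against one scalar value at a time.
sex = []
for _, row in df.iterrows():
    if row['sex'] == 'male':
        sex.append(1)
    else:
        sex.append(0)

# Only one column is needed, so iterating over that column
# directly is simpler than iterating over whole rows.
sex_from_column = [1 if value == 'male' else 0 for value in df['sex']]

print(sex)
print(sex_from_column)
```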
Setup
df = pd.DataFrame({'sex': ['male', 'female', 'male', 'male', 'female']})
Just use np.where here:
np.where(df.sex == 'male', 1, 0)
# array([1, 0, 1, 1, 0])
You could also build the boolean mask and cast it to integers:
(df.sex == 'male').astype(int).values.tolist()
# [1, 0, 1, 1, 0]
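Both approaches above send anything that is not 'male' to 0. If you would rather spell out both labels, one alternative (not part of the original answer) is Series.map with a dictionary. A minimal sketch on the same sample data:

```python
import pandas as pd

df = pd.DataFrame({'sex': ['male', 'female', 'male', 'male', 'female']})

# Explicit label-to-code mapping. Values missing from the
# dictionary would become NaN instead of silently turning into 0.
codes = df['sex'].map({'male': 1, 'female': 0})

sex = codes.tolist()
print(sex)
```

The trade-off is that unexpected or missing values surface as NaN, which makes dirty data easier to spot but means the column has to be cleaned before it can be treated as integers.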
Upvotes: 2