Curious

Reputation: 107

pandas map function returning 'NaN'

Relevant DataFrame: http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data

I manually added a 'sex' column to the DataFrame, and I am trying to replace 'Male' with 0 and 'Female' with 1, but it does not work: I get NaN values instead of the ones and zeroes.

Relevant code:

df['sex'] = df['sex'].map({'Male': 0, 'Female': 1})

It does not seem to be specific to the 'sex' column since this does not work either:

df['success'] = df['success'].map({'<=50K': 0, '>50K': 1})

Any thoughts?

Upvotes: 6

Views: 14334

Answers (4)

Shima Moghtasedi

Reputation: 11

I suggest running

df['success'] = df['success'].str.replace(r'[\xa0]', '', regex=True)

before calling map, to strip non-breaking spaces from the values.
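A minimal sketch of that cleanup, assuming the labels really do contain non-breaking spaces (\xa0). The sample values are made up for illustration and are not taken from adult.data:

```python
import pandas as pd

# Hypothetical sample: labels polluted with non-breaking spaces (\xa0)
s = pd.Series(['<=50K\xa0', '\xa0>50K', '<=50K'])

# Remove the non-breaking spaces as a literal substring, then map the clean labels
cleaned = s.str.replace('\xa0', '', regex=False)
mapped = cleaned.map({'<=50K': 0, '>50K': 1})

print(mapped.tolist())
```

If the stray characters are ordinary spaces rather than \xa0, this replace will not touch them and .str.strip() is the simpler tool.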

Upvotes: 1

Louis Rossi

Reputation: 11

d1 = {'UK': 0, 'USA': 1, 'N': 2}
df['Nationality'] = df['Nationality'].map(d1)
d2 = {'YES': 1, 'NO': 0}
df['Go'] = df['Go'].map(d2)

This is the Decision Tree exercise from W3Schools. It didn't work for me at first, until I restarted the kernel and ran all cells. Give it another shot.
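One plausible reason a restart helps (my assumption, not something confirmed above): if the cell with map is run twice on the same column, the second run looks up 1 and 0 in a dictionary whose keys are 'YES' and 'NO', so every value becomes NaN. A small sketch with made-up data:

```python
import pandas as pd

# Hypothetical column, mirroring the 'Go' column from the snippet above
df = pd.DataFrame({'Go': ['YES', 'NO', 'YES']})
d2 = {'YES': 1, 'NO': 0}

first = df['Go'].map(d2)   # first run: labels become 1/0
second = first.map(d2)     # second run: 1 and 0 are not keys in d2, so all NaN

print(first.tolist())
print(second.isna().all())
```

Restarting the kernel reloads the original labels, which is why running everything from the top works again.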

Upvotes: 0

cresclux

Reputation: 76

As @Leb mentioned, this can also happen with pandas read_table, where skipinitialspace likewise defaults to False. Passing skipinitialspace=True solves the problem there as well:

df = pd.read_table('smsspamcollection/SMSSpamCollection', sep='\t', names=['label', 'sms_message'], skipinitialspace=True)
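To see the difference without the file on disk, here is a small sketch using an in-memory string. The two rows are invented, and the leading space after each tab is an assumption about how the raw file might look:

```python
import io
import pandas as pd

# Hypothetical tab-separated text with a stray space after each tab
raw = "ham\t Hello there\nspam\t Win a prize\n"
cols = ['label', 'sms_message']

# Default behaviour: the leading space is kept in the field
default = pd.read_table(io.StringIO(raw), sep='\t', names=cols)

# skipinitialspace=True drops spaces that directly follow the delimiter
trimmed = pd.read_table(io.StringIO(raw), sep='\t', names=cols,
                        skipinitialspace=True)

print(repr(default['sms_message'][0]))
print(repr(trimmed['sms_message'][0]))
```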

Upvotes: 2

Leb

Reputation: 15953

@ayhan is correct: the whitespace is causing the problem. A cleaner fix is to pass skipinitialspace=True, which defaults to False, when reading the data with read_csv.

import io
import pandas as pd
df = pd.read_csv(io.StringIO(data), delimiter=',', skipinitialspace=True, header=None)  # data: raw text of adult.data
df[9] = df[9].map({'Male': 0, 'Female': 1})

This gives us (column 9 is the "sex" column):

   0                 1       2          3   4                   5   \
0  39         State-gov   77516  Bachelors  13       Never-married   
1  50  Self-emp-not-inc   83311  Bachelors  13  Married-civ-spouse   
2  38           Private  215646    HS-grad   9            Divorced   

                  6              7      8   9     10  11  12             13  \
0       Adm-clerical  Not-in-family  White   0  2174   0  40  United-States   
1    Exec-managerial        Husband  White   0     0   0  13  United-States   
2  Handlers-cleaners  Not-in-family  White   0     0   0  40  United-States   

      14  
0  <=50K  
1  <=50K  
2  <=50K  
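If the DataFrame is already loaded, a quick way to confirm the diagnosis is to look at the raw labels, and .str.strip() can then repair the column without re-reading the file. This is a sketch with two made-up rows and hypothetical column names (age, sex, success), not the real dataset:

```python
import io
import pandas as pd

# Hypothetical rows using the same ", " separator style as adult.data
raw = "39, Male, <=50K\n31, Female, >50K\n"
df = pd.read_csv(io.StringIO(raw), header=None, names=['age', 'sex', 'success'])

# Inspect the labels: the leading space shows up in the unique values
print(df['sex'].unique())

# Mapping the unstripped labels finds no matching keys, so everything is NaN
before = df['sex'].map({'Male': 0, 'Female': 1})

# Strip surrounding whitespace first, then map
df['sex'] = df['sex'].str.strip().map({'Male': 0, 'Female': 1})
print(df['sex'].tolist())
```

Stripping after the fact only fixes the columns you apply it to, whereas skipinitialspace=True cleans every field at read time.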

Upvotes: 7
