User 19826

Reputation: 599

Mapping to a new value changed all to NaN

Originally, I have a DataFrame df1 that contains a gender column, sex, with the values Female and Male. Since I want to work with a temporary DataFrame, I first copied it. See the code:

df2 = df1
gMap = {'Female': 1, 'Male': 0}
df2['sex']=df2['sex'].map(gMap)

2 problems have occurred:

And a final question: how can I change the column's data type along with the mapping, for example to integer in the code above?

Upvotes: 1

Views: 210

Answers (1)

jezrael

Reputation: 863226

First, creating a new DataFrame requires DataFrame.copy. Plain assignment (df2 = df1) only creates a second reference to the original DataFrame, whereas with a copy, changing df2 does not change df1.
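A minimal sketch of the difference between assignment and copy, using made-up data. It also shows one plausible way to end up with all NaN (running the mapping twice on an aliased DataFrame); this is an assumption about the question, not a confirmed diagnosis. The names alias, df3 and df4 are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'sex': ['Female', 'Male', 'Female']})
gMap = {'Female': 1, 'Male': 0}

# Plain assignment: alias is the same object as df1, not a copy.
alias = df1
alias['sex'] = alias['sex'].map(gMap)
first_pass = df1['sex'].tolist()  # df1 now holds 1/0 as well

# Mapping a second time looks up 1 and 0 in gMap, finds no such keys,
# and therefore returns NaN for every row.
alias['sex'] = alias['sex'].map(gMap)
all_nan_after_second_pass = bool(df1['sex'].isna().all())

# With an explicit copy the original stays untouched.
df3 = pd.DataFrame({'sex': ['Female', 'Male', 'Female']})
df4 = df3.copy()
df4['sex'] = df4['sex'].map(gMap)
original_unchanged = df3['sex'].tolist() == ['Female', 'Male', 'Female']
```

If the mapping cell is re-run in a notebook while df2 is only an alias of df1, the second run behaves like the second map call here.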

If nothing matches, a possible cause is leading or trailing whitespace in the values, so remove it with Series.str.strip:

df2 = df1.copy()

print (df2['sex'].unique())

gMap = {'Female': 1, 'Male': 0}
df2['sex']=df2['sex'].str.strip().map(gMap)
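To illustrate the whitespace case, here is a small sketch with invented data; the padded strings are an assumption for demonstration, since the question does not show the actual values.

```python
import pandas as pd

# Hypothetical data: the leading/trailing spaces are invented for illustration.
df2 = pd.DataFrame({'sex': ['Female ', ' Male', 'Female']})
gMap = {'Female': 1, 'Male': 0}

# unique() keeps the padding in the strings, so the mismatch becomes visible.
raw_values = df2['sex'].unique().tolist()

# Padded values are not keys of gMap, so they map to NaN.
without_strip = df2['sex'].map(gMap)

# After stripping, every value matches a key.
with_strip = df2['sex'].str.strip().map(gMap)
```

Inspecting raw_values (or the print of unique() above) is usually the quickest way to spot values that differ from the dictionary keys.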

Will this change the datatype as well?

It depends.

If the only unique values in the column are Female and Male (the keys of the dictionary), the result is a new integer column:

import pandas as pd

df2 = pd.DataFrame({'sex':['Male','Female','Male']})

gMap = {'Female': 1, 'Male': 0}
df2['sex']=df2['sex'].map(gMap)

print (df2)
   sex
0    0
1    1
2    0

print (df2.dtypes)
sex    int64
dtype: object

If there are other values, you get a float column, because unmatched values become missing values (NaN):

df2 = pd.DataFrame({'sex':['Male','Female','Another Val']})

gMap = {'Female': 1, 'Male': 0}
df2['sex']=df2['sex'].map(gMap)

print (df2)
   sex
0  0.0
1  1.0
2  NaN

print (df2.dtypes)
sex    float64
dtype: object
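If an integer column is wanted even when some values are unmatched, one option (a sketch, not part of the original answer) is pandas' nullable integer dtype 'Int64', which can hold missing values. The names nullable and strict are hypothetical.

```python
import pandas as pd

df2 = pd.DataFrame({'sex': ['Male', 'Female', 'Another Val']})
gMap = {'Female': 1, 'Male': 0}

# map() leaves NaN for 'Another Val'; astype('Int64') (capital I) converts
# to the nullable integer dtype, which stores the missing value as <NA>.
nullable = df2['sex'].map(gMap).astype('Int64')

# Alternative when every value is guaranteed to be a key of gMap:
# a plain integer cast works because there are no missing values.
strict = pd.Series(['Male', 'Female']).map(gMap).astype(int)
```

Casting with astype(int) raises an error if any NaN is present, so the nullable dtype is the safer choice when unmatched values are possible.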

Upvotes: 2
