Reputation: 79
This is my dataset, which has four columns. I want to replace the values (1s and 2s) in the survival_status column with Negative and Positive. I am using pandas to change the values.
   Age  operation_year  axillary_nodes_detected  survival_status
0   30              64                        1                1
1   30              62                        3                1
2   30              65                        0                2
3   31              59                        2                1
4   31              65                        4                2
Haberman["survival_status"] = Haberman["survival_status"].apply(lambda x : 'Positive' if x == 2 else 'Negative')
After applying that, every value in the column becomes Negative.
Haberman['survival_status'].value_counts()
Negative 306
Name: survival_status, dtype: int64
Could anyone tell me what I am doing wrong?
Upvotes: 1
Views: 47
Reputation: 862771
A better solution is to use numpy.where, converting the column to integer first:
import numpy as np

Haberman["survival_status"] = np.where(Haberman["survival_status"].astype(int) == 2,
                                       'Positive', 'Negative')
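The conversion probably matters because the column holds strings such as '2' rather than integers, in which case x == 2 is never true and every row falls into the else branch. Here is a minimal sketch of that, assuming the values were read in as strings (the small frame is made up from the first five rows in the question, not the real 306-row data):

```python
import numpy as np
import pandas as pd

# Assumption: survival_status was loaded as strings, which would explain
# why every row ended up as 'Negative' in the question.
Haberman = pd.DataFrame({"survival_status": ["1", "1", "2", "1", "2"]})

# Comparing a string with the integer 2 is always False,
# so the lambda always takes the else branch.
broken = Haberman["survival_status"].apply(
    lambda x: "Positive" if x == 2 else "Negative"
)

# Casting to int first makes the comparison behave as intended.
fixed = np.where(Haberman["survival_status"].astype(int) == 2,
                 "Positive", "Negative")

print(broken.tolist())
print(fixed.tolist())
```

Checking Haberman["survival_status"].dtype on the real data should confirm whether this is what is happening.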
Upvotes: 1
Reputation: 164693
One way is to use a dictionary mapping, but first make sure your dataframe is converted to int:
df = df.astype(int)
d = {2: 'Positive', 1: 'Negative'}
df['survival_status'] = df['survival_status'].map(d)
Result:
print(df)
   Age  operation_year  axillary_nodes_detected survival_status
0   30              64                        1        Negative
1   30              62                        3        Negative
2   30              65                        0        Positive
3   31              59                        2        Negative
4   31              65                        4        Positive
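One thing to keep in mind with the dictionary approach: map returns NaN for any value that has no matching key, so skipping the int conversion on a string column would leave the whole column as NaN. A quick check, using a made-up three-row frame and assuming the values start out as strings:

```python
import pandas as pd

# Hypothetical sample; assumes the column was loaded as strings.
df = pd.DataFrame({"survival_status": ["1", "2", "1"]})
d = {2: "Positive", 1: "Negative"}

# The dictionary keys are ints but the values are strings: nothing matches.
unmapped = df["survival_status"].map(d)

# After casting to int, every value finds its key.
mapped = df["survival_status"].astype(int).map(d)

print(unmapped.isna().all())
print(mapped.tolist())
```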
Upvotes: 1