merklexy

Reputation: 79

Replace the values in column

This is my dataset, which has four columns. I want to replace the values (1 and 2) in the survival_status column with Negative and Positive. I am using pandas to change the values.

   Age  operation_year  axillary_nodes_detected  survival_status
0   30              64                        1                1
1   30              62                        3                1
2   30              65                        0                2
3   31              59                        2                1
4   31              65                        4                2

Haberman["survival_status"] = Haberman["survival_status"].apply(lambda x : 'Positive' if x == 2 else 'Negative')

After applying that, every value in the column becomes Negative.

Haberman['survival_status'].value_counts()
Negative    306
Name: survival_status, dtype: int64

Could anyone tell me what I am doing wrong?

Upvotes: 1

Views: 47

Answers (2)

jezrael

Reputation: 862771

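A better solution is to use numpy.where and convert the column to integers first. The values are probably stored as strings, in which case the comparison x == 2 never matches: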

import numpy as np
Haberman["survival_status"] = np.where(Haberman["survival_status"].astype(int) == 2,
                                       'Positive', 'Negative')
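
To see why the cast may matter, here is a minimal sketch. It assumes the column was read in as strings, which the question does not confirm; the small frame below is a made-up stand-in for the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction: survival_status is stored as strings,
# which is one way the original lambda could label every row 'Negative'.
Haberman = pd.DataFrame({
    "Age": [30, 30, 30, 31, 31],
    "survival_status": ["1", "1", "2", "1", "2"],
})

# Comparing the string "2" with the integer 2 is never true,
# so every row falls through to the else branch.
naive = Haberman["survival_status"].apply(
    lambda x: "Positive" if x == 2 else "Negative"
)

# Casting to int first makes the comparison behave as intended.
fixed = np.where(
    Haberman["survival_status"].astype(int) == 2, "Positive", "Negative"
)

print(naive.value_counts())
print(fixed)
```

Checking Haberman["survival_status"].dtype on the real data would show whether this is the actual cause.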

Upvotes: 1

jpp

Reputation: 164693

One way is to use a dictionary mapping. But first make sure your dataframe is converted to int:

df = df.astype(int)

d = {2: 'Positive', 1: 'Negative'}

df['survival_status'] = df['survival_status'].map(d)

Result:

print(df)

   Age  operation_year  axillary_nodes_detected survival_status
0   30              64                        1        Negative
1   30              62                        3        Negative
2   30              65                        0        Positive
3   31              59                        2        Negative
4   31              65                        4        Positive
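
One thing to keep in mind with map: any value that is not a key in the dictionary becomes NaN. A rough sketch of how to check for that, using a made-up column that contains an unexpected code (the "3" is an assumption for illustration, not something in the original data), and casting only the one column instead of the whole dataframe:

```python
import pandas as pd

# Made-up data: "3" stands in for an unexpected status code.
df = pd.DataFrame({"survival_status": ["1", "2", "1", "3"]})

d = {2: "Positive", 1: "Negative"}

# Cast just this column to int, then map through the dictionary.
mapped = df["survival_status"].astype(int).map(d)

# Values missing from the dictionary come back as NaN.
unmapped = int(mapped.isna().sum())

print(mapped)
print("unmapped rows:", unmapped)
```

If the count is not zero, the dictionary is missing a key or the data contains values other than 1 and 2.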

Upvotes: 1
