Wojciech Moszczyński

Reputation: 3187

How to transform string into binary records?

I have the following dataset:

df = pd.read_csv('c:/1/Autism_Data.arff',na_values="?")


I need to convert the columns "gender", "jundice", and "austim" into binary 0/1 values. I would like the resulting table to show 0 and 1 in those columns instead of strings.

Upvotes: 0

Views: 417

Answers (2)

furas

Reputation: 143097

You can use map() with a dictionary to replace each string with a number, for example df['gender'].map({'f':1, 'm':0})

import pandas as pd

df = pd.DataFrame({
    'gender':['f','m','m','f', 'f'],
    'jundice':['no','no','yes','no','no'],
    'austim':['no','yes','yes','yes','no'],
})
#print(df)

df['gender'] = df['gender'].map({'f':1, 'm':0})
df['jundice'] = df['jundice'].map({'yes':1, 'no':0})
df['austim'] = df['austim'].map({'yes':1, 'no':0})

print(df)

Result:

   gender  jundice  austim
0       1        0       0
1       0        0       1
2       0        1       1
3       1        0       1
4       1        0       0
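Since the question reads the file with na_values="?", some cells may be missing, and map() returns NaN for any value that is not a key in the dictionary. The following is a minimal sketch, not part of the original answer, that applies the same idea to several columns in a loop and counts the values that were left unmapped. The sample rows (including None and 'maybe') and the names yes_no, mappings, and unmapped are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with one missing and one unexpected value
df = pd.DataFrame({
    'gender': ['f', 'm', None],
    'jundice': ['no', 'yes', 'no'],
    'austim': ['no', 'yes', 'maybe'],
})

# One mapping per column; the yes/no mapping is shared
yes_no = {'yes': 1, 'no': 0}
mappings = {'gender': {'f': 1, 'm': 0}, 'jundice': yes_no, 'austim': yes_no}

# Apply each mapping; values without a dictionary key become NaN
for column, mapping in mappings.items():
    df[column] = df[column].map(mapping)

# Count NaN per column to spot missing or unexpected values
unmapped = df[list(mappings)].isna().sum()

print(df)
print(unmapped)
```

A column that contains NaN after mapping is stored as float, so it may be worth deciding how to handle those rows before treating the column as a strict 0/1 integer.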

Upvotes: 1

modesitt

Reputation: 7210

If you'd like to be brief you can use pd.Categorical. For example,

df['gender'] = pd.Categorical(df.gender).codes

You can extend this to the other columns. The codes are assigned in alphabetical (sorted) order of the values, so check that the resulting 0/1 assignment is the one you want and remap it if it is not. Alternatively, if you would like some more control, you can use LabelEncoder from scikit-learn.

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
df['gender'] = le.fit_transform(df.gender)
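To illustrate the point about alphabetical ordering, here is a small sketch (not from the original answer) that compares the default codes from pd.Categorical with codes produced when the category order is passed explicitly through the categories argument. The sample column and the variable names default_codes and explicit_codes are assumptions made for this example.

```python
import pandas as pd

# Hypothetical sample column
df = pd.DataFrame({'gender': ['f', 'm', 'm', 'f', 'f']})

# Default: categories are sorted, so 'f' gets 0 and 'm' gets 1
default_codes = pd.Categorical(df['gender']).codes

# Explicit order: position in the list is the code, so 'm' gets 0 and 'f' gets 1
explicit_codes = pd.Categorical(df['gender'], categories=['m', 'f']).codes

print(list(default_codes))
print(list(explicit_codes))
```

With an explicit categories list, any value that is not in the list receives the code -1, which can also help to flag unexpected entries.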

Upvotes: 2
