Bussiere

Reputation: 1144

Create new columns from the data in a column

So here is my data in pandas:

      Movie        Tags
0  War film  tank;plane
1  Spy film   car;plane

I would like to create new columns from the Tags column, filled with 0 and 1, and add a prefix like 'T_' to the column names.

Like this:

      Movie        Tags T_tank T_plane T_car
0  War film  tank;plane      1       1     0
1  Spy film   car;plane      0       1     1

I have some ideas on how to do it, such as going line by line with a split(";") and a df.loc[:,'T_plane'], for example. But I think that may not be the optimal way to do it.
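
For reference, the line-by-line idea could be sketched roughly as below. This is only an illustration under assumptions: the DataFrame is rebuilt from the sample data shown earlier, and the name `manual` is hypothetical.

```python
import pandas as pd

# Sample data from the question (rebuilt here so the sketch is self-contained)
df = pd.DataFrame({
    "Movie": ["War film", "Spy film"],
    "Tags": ["tank;plane", "car;plane"],
})

manual = df.copy()  # hypothetical name; work on a copy to leave df untouched

# Walk the rows, split each Tags string, and set the matching T_ column to 1
for idx, tags in manual["Tags"].items():
    for tag in tags.split(";"):
        col = "T_" + tag
        if col not in manual.columns:
            manual[col] = 0  # first time this tag is seen: new column, default 0
        manual.loc[idx, col] = 1

print(manual)
```

It works on small data, but it loops in Python over every row and every tag, which is why a vectorized approach is probably preferable.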

Regards

Upvotes: 2

Views: 416

Answers (2)

jpp

Reputation: 164613

Using the sklearn library:

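import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()

# Split each Tags string into a list, binarize it, and join the
# T_-prefixed indicator columns back onto df (index=df.index keeps rows aligned)
res = df.join(pd.DataFrame(mlb.fit_transform(df['Tags'].str.split(';')),
                           columns=mlb.classes_,
                           index=df.index).add_prefix('T_'))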

print(res)

      Movie        Tags  T_car  T_plane  T_tank
0  War film  tank;plane      0        1       1
1  Spy film   car;plane      1        1       0

Upvotes: 2

ALollz

Reputation: 59519

With .str.get_dummies:

df.join(df.Tags.str.get_dummies(';').add_prefix('T_'))

      Movie        Tags  T_car  T_plane  T_tank
0  War film  tank;plane      0        1       1
1  Spy film   car;plane      1        1       0
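
A self-contained version of the above, for anyone who wants to run it end to end. This is a minimal sketch: the DataFrame is rebuilt from the question's sample data, and the final reordering step (with the hypothetical name `ordered`) is only there to match the column order the question asked for.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Movie": ["War film", "Spy film"],
    "Tags": ["tank;plane", "car;plane"],
})

# One 0/1 indicator column per distinct tag, prefixed with T_
dummies = df["Tags"].str.get_dummies(sep=";").add_prefix("T_")
res = df.join(dummies)

# get_dummies returns the tag columns sorted alphabetically; reorder
# explicitly if a specific order (like the one in the question) is needed
ordered = res[["Movie", "Tags", "T_tank", "T_plane", "T_car"]]

print(ordered)
```

Because the indicator columns are joined on the index, the result stays aligned with the original rows even if df does not use the default 0..n-1 index.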

Upvotes: 1
