Nazim Kerimbekov

Reputation: 4783

How to add LabelBinarizer columns to DataFrame

I recently started working with LabelBinarizer and ran the following code on the CSV file I'm using:

import pandas as pd
from sklearn.preprocessing import LabelBinarizer
#import matplotlib.pyplot as plot


#--------------------------------

label_conv = LabelBinarizer()
appstore_original = pd.read_csv("AppleStore.csv")

#--------------------------------

lb_conv = label_conv.fit_transform(appstore_original["cont_rating"])
column_names = label_conv.classes_

print(column_names)        
print(lb_conv)

This gives me the binarized array lb_conv and the column names.

How can I attach lb_conv to appstore_original, using column_names as the column names?

If anyone could help that would be great.

Upvotes: 1

Views: 4486

Answers (1)

MaxU - stand with Ukraine

Reputation: 210982

Try this:

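import pandas as pd
from sklearn.preprocessing import LabelBinarizer

lb = LabelBinarizer()

df = pd.read_csv("AppleStore.csv")

# Wrap the binarized array in a DataFrame and join it to the original on the index
df = df.join(pd.DataFrame(lb.fit_transform(df["cont_rating"]),
                          columns=lb.classes_,
                          index=df.index))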

Passing index=df.index to the constructor ensures that the newly created DataFrame has the same index as the original one, which join needs in order to align the rows.
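
For anyone who wants to try this without the CSV file, here is a minimal sketch of the same approach on a small hand-made DataFrame. The data, the track_name column and the non-default index values are made up for illustration; only the cont_rating column name comes from the question.

```python
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

# Hypothetical stand-in for AppleStore.csv; only "cont_rating" matters here.
df = pd.DataFrame(
    {
        "track_name": ["app_a", "app_b", "app_c", "app_d"],
        "cont_rating": ["4+", "12+", "17+", "4+"],
    },
    index=[10, 20, 30, 40],  # deliberately not the default 0..n-1 index
)

lb = LabelBinarizer()

# One indicator column per class, labelled with the fitted class names
# and carrying the same index as the original frame.
encoded = pd.DataFrame(
    lb.fit_transform(df["cont_rating"]),
    columns=lb.classes_,
    index=df.index,
)

# join aligns rows on the index, so every row gets its own indicators.
result = df.join(encoded)
print(result)
```

If index=df.index were left out here, the new frame would get a default 0..3 index, nothing would line up with 10..40, and the joined columns would be filled with NaN. Also note that with exactly two classes LabelBinarizer returns a single column, so columns=lb.classes_ would no longer match the shape of the array; the approach as written assumes three or more distinct values.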

Upvotes: 4
