Reputation: 39
I have a column containing 15000 arrays; a sample of 2 of those records is shown below. I want to create dummy variables for the values under Genres_relevant.
user Genres_relevant
1 [2.0]
2 [3.0,2.0,1.0]
Code:
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer
df=pd.DataFrame(users_list['Genres_relevant'])
mlb = MultiLabelBinarizer()
pd.DataFrame(mlb.fit_transform(df),columns=mlb.classes_, index=df.index)
Expected output:
user  1.0  2.0  3.0
1       0    1    0
2       1    1    1
Error: The shape of passed values is (12, 1), indices imply (12, 15000)
Upvotes: 1
Views: 397
Reputation: 16916
pd.DataFrame(mlb.fit_transform(df['Genres_relevant']), columns=mlb.classes_,
index=df.index)
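When fitting, pass in the column rather than the full DataFrame. Iterating over a DataFrame yields its column labels, not its rows, so the binarizer never sees the 15000 lists and the result cannot be aligned with the 15000-row index.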
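As a minimal sketch of the fix, here is a self-contained version built on the two sample rows from the question. The two-row DataFrame and the `dummies` variable name are made up for illustration; the real data would come from `users_list`.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical two-row sample mirroring the records shown in the question.
df = pd.DataFrame(
    {"Genres_relevant": [[2.0], [3.0, 2.0, 1.0]]},
    index=pd.Index([1, 2], name="user"),
)

# Iterating over a DataFrame yields column labels, not rows,
# which is why passing the whole frame to fit_transform goes wrong.
print(list(df))

mlb = MultiLabelBinarizer()

# Pass the column (a Series of lists) so each row is one sample.
encoded = mlb.fit_transform(df["Genres_relevant"])

# classes_ holds the sorted unique labels; reuse the original index.
dummies = pd.DataFrame(encoded, columns=mlb.classes_, index=df.index)
print(dummies)
```

With this sample, `dummies` should have one row per user and one 0/1 column per distinct genre value, matching the expected output in the question.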
Upvotes: 2