Reputation: 143
I have a user review dataset with rows like
review-1, 0,1,1,0,0
where review-1 is the review text and 0,1,1,0,0 are its category labels. One review can belong to multiple categories. I want to predict the categories for each review, so I wrote the following code:
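from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

transformer = TfidfVectorizer(lowercase=True, stop_words=stop, max_features=500)
X = transformer.fit_transform(df.Review)
X_train, X_test, y_train, y_test = train_test_split(X, df.iloc[:, 1:6],
                                                    test_size=0.25, random_state=42)
SVM = svm.SVC()
SVM.fit(X_train, y_train)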
But I'm getting an error like
ValueError: bad input shape (75, 5)
Could anyone suggest a good way to solve this?
Upvotes: 1
Views: 652
Reputation: 16966
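You could use a binary classifier (like svm.SVC()) to solve the multi-label classification problem by wrapping it in OneVsRestClassifier, which fits one copy of the estimator per label column. svm.SVC() on its own expects a 1-D target, which is why fitting it on a (75, 5) label matrix raises the error.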
Example:
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# One SVC is trained per label column
cls = OneVsRestClassifier(estimator=SVC(gamma='auto'))
# Random data: 20 samples, 10 features, 5 binary labels per sample
cls.fit(np.random.rand(20, 10), np.random.binomial(1, 0.2, size=(20, 5)))
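To connect this to the question's setup, here is a minimal sketch of the same idea on TF-IDF features. The review texts, the three category columns, and the linear kernel are made up for illustration; with the real data you would presumably pass df.Review to the vectorizer and df.iloc[:, 1:6] as the label matrix.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Hypothetical toy reviews standing in for df.Review
reviews = [
    "great battery and fast delivery",
    "battery died quickly",
    "delivery was late",
    "fast delivery, fair price",
    "price too high, battery weak",
    "good price",
]
# Hypothetical label matrix standing in for df.iloc[:, 1:6];
# columns are assumed categories: battery, delivery, price
Y = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
])

# Turn the text into a sparse TF-IDF feature matrix
X = TfidfVectorizer(lowercase=True).fit_transform(reviews)

# OneVsRestClassifier accepts the 2-D label matrix and fits one SVC per column
clf = OneVsRestClassifier(SVC(kernel="linear"))
clf.fit(X, Y)

# Predictions come back as a 0/1 matrix with one column per category
pred = clf.predict(X)
print(pred.shape)
```

Each row of pred is a 0/1 vector in the same layout as the input labels, so a review can be assigned several categories at once.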
Upvotes: 5