Reputation: 174
I'm new to multi-label classification using Binary Relevance, and I'm having trouble interpreting the result:
The result is: [[ 0. 0.] [ 2. 2.]]
Does it mean that the first case is classified [0,0] and the 2nd is [2,2]? That does not look good at all. Or am I missing something else?
After trying the answers below, I'm now getting the following error. I think it is caused by the zero in the y_train label [2,0,3,4]:
Traceback (most recent call last):
  File "driver.py", line 22, in <module>
    clf_dict[i] = clf.fit(x_train, y_tmp)
  File "C:\Users\BaderEX\Anaconda22\lib\site-packages\sklearn\linear_model\logistic.py", line 1154, in fit
    self.max_iter, self.tol, self.random_state)
  File "C:\Users\BaderEX\Anaconda22\lib\site-packages\sklearn\svm\base.py", line 885, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 1
Updated code:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import *

numer_classes = 5

x_train = np.array([[1,2,3,4],[0,1,2,1],[1,2,0,3]])
y_train = [[0],[1,0,3],[2,0,3,4]]

x_test = np.array([[1,2,3,4],[0,1,2,1],[1,2,0,3]])
y_test = [[0],[1,0,3],[2,0,3,4]]

clf_dict = {}
for i in range(numer_classes):
    y_tmp = []
    for j in range(len(y_train)):
        if i in y_train[j]:
            y_tmp.append(1)
        else:
            y_tmp.append(0)
    clf = LogisticRegression()
    clf_dict[i] = clf.fit(x_train, y_tmp)

prediction_matrix = np.zeros((len(x_test),numer_classes))
for i in range(numer_classes):
    prediction = clf_dict[i].predict(x_test)
    prediction_matrix[:,i] = prediction

print('Predicted')
print(prediction_matrix)
Thanks
Upvotes: 3
Views: 6035
Reputation: 2999
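For Binary Relevance you should build an indicator matrix instead: one column per label, holding 0 or 1 for every sample. scikit-multilearn provides a scikit-learn-compatible implementation of the classifier.
Define:
def to_indicator_matrix(y_list):
    # One column per label id, assuming labels are integers 0..max_label.
    # (The column count must come from the largest label, not the longest list.)
    n_labels = max(label for y in y_list for label in y) + 1
    y_train_matrix = np.zeros(shape=(len(y_list), n_labels), dtype='i8')
    for i, y in enumerate(y_list):
        y_train_matrix[i][y] = 1
    return y_train_matrix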
Given your y_train and y_test, run:
y_train = to_indicator_matrix(y_train)
y_test = to_indicator_matrix(y_test)
With the y_train from your updated code, y_train is now:
array([[1, 0, 0, 0, 0],
       [1, 1, 0, 1, 0],
       [1, 0, 1, 1, 1]])
This should fix your problem. It is more comfortable to use the scikit-multilearn BinaryRelevance than your own code, though. Try it out!
Run
pip install scikit-multilearn
And then try
from skmultilearn.problem_transform import BinaryRelevance
from sklearn.linear_model import LogisticRegression
import sklearn.metrics

# assume the data is already loaded
# and is available in X_train/X_test, y_train/y_test

# initialize Binary Relevance multi-label classifier
# with a logistic regression base classifier
classifier = BinaryRelevance(
    classifier=LogisticRegression(C=40, class_weight='balanced'),
    require_dense=[False, True],
)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)

# measure
print(sklearn.metrics.hamming_loss(y_test, predictions))
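For reference, the Hamming loss printed in the last line is the fraction of individual label entries that were predicted incorrectly. A minimal NumPy-only sketch of that calculation follows; y_true is the indicator matrix for the y_train in the question, while y_pred is made up for illustration and is not real classifier output:

```python
import numpy as np

# Indicator matrix for the labels [[0], [1, 0, 3], [2, 0, 3, 4]]
y_true = np.array([[1, 0, 0, 0, 0],
                   [1, 1, 0, 1, 0],
                   [1, 0, 1, 1, 1]])

# Hypothetical predictions: two of the fifteen entries are wrong
y_pred = np.array([[1, 0, 0, 0, 0],
                   [1, 0, 0, 1, 0],
                   [1, 0, 1, 1, 0]])

# Fraction of (sample, label) entries where prediction and truth differ
hamming = np.mean(y_true != y_pred)
print(hamming)
```

Lower is better; 0 means every label of every sample was predicted correctly.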
Upvotes: 3
Reputation: 426
I think you made a mistake in the implementation. For binary relevance, we need a separate classifier for each of the labels. There are three labels, thus there should be 3 classifiers. Each classifier will tell whether the instance belongs to a class or not. For example, the classifier corresponding to class 1 (clf[1]) will only tell whether the instance belongs to class 1 or not.
Thus, if you want to manually implement binary relevance, in the loop that creates the classifiers, the label should be binarized:
for i in range(numer_classes):
    y_tmp = []
    for j in range(len(y_train)):
        if i in y_train[j]:
            y_tmp.append(1)
        else:
            y_tmp.append(0)
    clf = LogisticRegression()
    clf_dict[i] = clf.fit(x_train, y_tmp)
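One caveat with the y_train from the updated question: label 0 appears in every sample, so the binarized target for i = 0 contains only 1s, and LogisticRegression raises the "at least 2 classes" ValueError quoted there. Below is a minimal sketch of one possible guard, assuming it is acceptable to always predict the constant value for such a label. The names num_classes, models and predict_all are made up for this example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

x_train = np.array([[1, 2, 3, 4], [0, 1, 2, 1], [1, 2, 0, 3]])
y_train = [[0], [1, 0, 3], [2, 0, 3, 4]]
num_classes = 5  # hypothetical name; labels are assumed to be 0..4

models = {}
for label in range(num_classes):
    # Binarize: 1 if the sample carries this label, else 0
    y_bin = np.array([1 if label in labels else 0 for labels in y_train])
    if np.unique(y_bin).size < 2:
        # Only one class present: remember the constant instead of fitting
        models[label] = int(y_bin[0])
    else:
        models[label] = LogisticRegression().fit(x_train, y_bin)


def predict_all(x):
    """Return an (n_samples, num_classes) 0/1 prediction matrix."""
    out = np.zeros((len(x), num_classes), dtype=int)
    for label, model in models.items():
        if isinstance(model, int):
            out[:, label] = model  # constant label
        else:
            out[:, label] = model.predict(x)
    return out


predictions = predict_all(x_train)
print(predictions)
```

Whether a constant prediction is appropriate depends on the data; with a larger training set the label would normally be absent from at least some samples.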
However, if you use sklearn, things are much more convenient:
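from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

binarizer = MultiLabelBinarizer()
# fit the label-to-column mapping on the training labels only
y_train_binarized = binarizer.fit_transform(y_train)
# reuse the same mapping for the test labels (transform, not fit_transform)
y_test_binarized = binarizer.transform(y_test)

cls = OneVsRestClassifier(estimator=LogisticRegression())
cls.fit(x_train, y_train_binarized)
y_predict = cls.predict(x_test)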
The results are something like [[1 0 1] [0 1 1]], which means the first case is predicted as [0,2] and the second case is predicted as [1,2].
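To turn 0/1 rows like these back into label lists automatically, MultiLabelBinarizer offers inverse_transform. A minimal sketch, assuming the binarizer is fitted on the y_train from the question; the pred array is made up for illustration and is not real classifier output:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

y_train = [[0], [1, 0, 3], [2, 0, 3, 4]]

binarizer = MultiLabelBinarizer()
binarizer.fit(y_train)  # learns the column order, stored in binarizer.classes_

# Hypothetical predictions, one row per sample, one column per label
pred = np.array([[1, 0, 0, 0, 0],
                 [1, 1, 0, 1, 0]])

# Map each row of 0/1 flags back to a tuple of label ids
decoded = binarizer.inverse_transform(pred)
print(decoded)
```

Each tuple lists the labels whose column is 1 in that row, in the order given by binarizer.classes_.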
Upvotes: 3