ForeverLearner

Reputation: 145

Determining the least confident scores with the svm classifier

I'm working on a multilabel classification problem using an SVM classifier in Python. After training, I want to test and find the samples the algorithm is least confident about, i.e., the samples that are closest to the decision boundary. I can do this with sklearn's decision_function(X), which predicts confidence scores for samples. However, how do I determine which samples are closest to the decision boundary? The ones with the lower values?

My code is below:

from sklearn.multiclass import OneVsRestClassifier

clf = OneVsRestClassifier(svc)  # svc is an SVM estimator, e.g. svm.SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
df = clf.decision_function(X_test)  # per-class confidence scores
print(df[0])
print(y_pred[0])

I get the following output:

[ 0.77338405  0.65244097 -0.73863779 -0.59712787 -0.78753861 
-0.91293626  0.0031544 ]
[1 1 0 0 0 0 1]

In this case, which of the classes is the algorithm least certain about? The ones with the scores -0.59712787 and 0.0031544?

Upvotes: 0

Views: 141

Answers (1)

s510

Reputation: 2822

Yes. The values that are close to zero are nearest to the decision boundary. For each class, a negative score means the sample falls on the negative side of that class's boundary (label 0), and a positive score means it falls on the positive side (label 1). So the magnitude of the score measures confidence, and the sign determines the predicted label.
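To make this concrete, here is a minimal sketch of ranking samples by confidence using the absolute decision values. The dataset and the `LinearSVC` base estimator are assumptions for the sake of a runnable example (the question only says "svm classifier"):

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic multilabel data with 7 classes, mirroring the 7 scores in the question
X, y = make_multilabel_classification(n_samples=200, n_classes=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = OneVsRestClassifier(LinearSVC(max_iter=10000)).fit(X_train, y_train)
df = clf.decision_function(X_test)  # shape (n_samples, n_classes)

# Smaller |score| = closer to that class's boundary = less confident
abs_scores = np.abs(df)

# For each sample, the class the model is least certain about
least_conf_class = abs_scores.argmin(axis=1)

# Test samples ordered from least to most confident overall
# (useful e.g. for active learning / uncertainty sampling)
least_conf_samples = abs_scores.min(axis=1).argsort()
```

Note that with a one-vs-rest setup each column of `decision_function` is a separate binary boundary, so scores should be compared within a column; the raw values are not calibrated probabilities.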

Upvotes: 1
