ForeverLearner

Reputation: 145

Determining the least confident scores with the svm classifier

I'm working on a multilabel classification problem using an SVM classifier in Python. After training, I want to test and find the samples the algorithm is least confident about, i.e., the samples that are closest to the decision boundary. I can do this with sklearn's decision_function(X), which predicts confidence scores for samples. However, how do I determine which samples are closest to the decision boundary? The ones with the lower values?

My code is below:

from sklearn.multiclass import OneVsRestClassifier

clf = OneVsRestClassifier(svc)  # svc is an SVM estimator, e.g. svm.SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
df = clf.decision_function(X_test)  # per-class confidence scores
print(df[0])
print(y_pred[0])

I get the following output:

[ 0.77338405  0.65244097 -0.73863779 -0.59712787 -0.78753861 
-0.91293626  0.0031544 ]
[1 1 0 0 0 0 1]

In this case, which of the classes is the algorithm least certain about? The ones with the scores -0.59712787 and 0.0031544?

Upvotes: 0

Views: 141

Answers (1)

s510

Reputation: 2822

Yes. The values that are close to zero are nearest to the decision boundary. For each class, a negative score means the sample falls on the negative side of that class's boundary (label 0), and a positive score means it falls on the positive side (label 1). So the magnitude of the score measures confidence, and the sign determines the predicted label.
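To make this concrete, here is a minimal sketch of ranking samples by confidence using the absolute decision values. The dataset and the `LinearSVC` base estimator are assumptions for the sake of a runnable example (the question only says "svm classifier"):

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic multilabel data with 7 classes, mirroring the 7 scores in the question
X, y = make_multilabel_classification(n_samples=200, n_classes=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = OneVsRestClassifier(LinearSVC(max_iter=10000)).fit(X_train, y_train)
df = clf.decision_function(X_test)  # shape (n_samples, n_classes)

# Smaller |score| = closer to that class's boundary = less confident
abs_scores = np.abs(df)

# For each sample, the class the model is least certain about
least_conf_class = abs_scores.argmin(axis=1)

# Test samples ordered from least to most confident overall
# (useful e.g. for active learning / uncertainty sampling)
least_conf_samples = abs_scores.min(axis=1).argsort()
```

Note that with a one-vs-rest setup each column of `decision_function` is a separate binary boundary, so scores should be compared within a column; the raw values are not calibrated probabilities.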

Upvotes: 1
