Reputation: 41
I am using sklearn's SGDClassifier to classify my data set. Some text should not fall into any label/category, but to my surprise, when I pass test data such as "kjhd askdhajksdh asd askh", it still gets classified into one of the given categories.
I have looked at the probabilities as well, but the junk text still gets a sizeable probability.
My question: can the classifier return something like "No match found" in such cases?
Upvotes: 0
Views: 173
Reputation: 36609
No. A classifier always assigns an input to one of the labels seen in training, namely the one with the highest score, however small that score is.
You can use decision_function to get a confidence score per label and apply a threshold to it. Something like:
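threshold = 0.25  # tune on held-out data; scores are signed margins, not probabilities
# clf is your fitted SGDClassifier; X_test holds one vectorized sample (multi-class case)
confidence_score = clf.decision_function(X_test).max()
if confidence_score < threshold:
    print("No match found")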
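To make the thresholding idea concrete for several samples at once, here is a minimal sketch. The helper name predict_or_no_match and the NO_MATCH constant are made up for illustration; the only things assumed of clf are the standard decision_function method and classes_ attribute of a fitted scikit-learn classifier. The 0.25 default is just the value from above; decision scores are not probabilities, so a useful threshold depends on your data.

```python
import numpy as np

NO_MATCH = "No match found"  # hypothetical sentinel, not part of scikit-learn


def predict_or_no_match(clf, X, threshold=0.25):
    """Return one label per sample, or NO_MATCH when the best score is below threshold.

    clf is assumed to be a fitted classifier exposing decision_function and classes_.
    """
    scores = np.asarray(clf.decision_function(X))
    if scores.ndim == 1:
        # Binary case: one signed score per sample; positive favours classes_[1].
        best_idx = (scores > 0).astype(int)
        best_score = np.abs(scores)
    else:
        # Multi-class case: one score per class; take the highest-scoring class.
        best_idx = scores.argmax(axis=1)
        best_score = scores.max(axis=1)
    # Fancy indexing returns a copy, so overwriting entries leaves clf.classes_ intact.
    labels = np.asarray(clf.classes_, dtype=object)[best_idx]
    labels[best_score < threshold] = NO_MATCH
    return labels.tolist()
```

With a fitted pipeline (vectorizer plus SGDClassifier) stored in clf, you would call predict_or_no_match(clf, ["kjhd askdhajksdh asd askh"]). Whether the junk text actually falls below the threshold depends on the trained model, so check the raw scores on some held-out junk and real examples before settling on a value.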
Upvotes: 1