Rosy

Reputation: 841

What's the difference between predict_proba and decision_function in scikit-learn?

I'm studying a scikit-learn example (Classifier comparison) and got confused by predict_proba and decision_function.

They plot the classification results by drawing the contours using either Z = clf.decision_function(), or Z = clf.predict_proba().

What's the difference between these two? Does each classification method provide only one of the two as its score?

Which one is more appropriate for interpreting the classification result, and how should I choose between the two?

Upvotes: 43

Views: 33453

Answers (2)

serv-inc

Reputation: 38147

Your example is

if hasattr(clf, "decision_function"):
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
else:
    Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]

so the code uses decision_function if it exists. In the SVM case, predict_proba is computed (in the binary case) using Platt scaling, which is both "expensive" and has "theoretical issues". That's why decision_function is used here (as @Ami said, this is the margin: the distance to the hyperplane, which is accessible without much further computation). In the SVM case, it is advised to use decision_function instead of predict_proba.
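To make the SVM case concrete, here is a minimal sketch that fits an SVC on a toy dataset and compares the two outputs. The dataset (make_moons) and all variable names are my own choices for illustration, not part of the original example. Note that predict_proba only exists on SVC when probability=True is passed, which is what triggers the extra calibration step.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Toy binary dataset (an assumption for illustration only).
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

# probability=True enables predict_proba at the cost of an extra
# calibration step during fit.
clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)

# Signed, unbounded margin scores: one value per sample in the binary case.
scores = clf.decision_function(X)

# Calibrated class probabilities: one column per class, rows sum to 1.
proba = clf.predict_proba(X)

# In the binary case, the sign of the score picks the predicted class.
pred_from_scores = clf.classes_[(scores > 0).astype(int)]
```

Thresholding the scores at zero should reproduce clf.predict(X), whereas taking the argmax of the calibrated probabilities is not guaranteed to agree with predict on every sample.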

decision_function is not specific to SVMs: SGDClassifier has one as well. There, whether predict_proba is available depends on the loss function, while decision_function is always available.
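A quick way to see the dependence on the loss function is to check which methods each fitted SGDClassifier exposes. This is only a sketch: the dataset and the two losses compared ("hinge" and "modified_huber") are my own picks for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Toy binary dataset (an assumption for illustration only).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# hinge loss gives a linear SVM; modified_huber is one of the losses
# for which SGDClassifier offers probability estimates.
hinge = SGDClassifier(loss="hinge", random_state=0).fit(X, y)
huber = SGDClassifier(loss="modified_huber", random_state=0).fit(X, y)

# Which estimators expose predict_proba?
has_proba = {
    "hinge": hasattr(hinge, "predict_proba"),
    "modified_huber": hasattr(huber, "predict_proba"),
}

# decision_function is present regardless of the loss.
has_decision = {
    "hinge": hasattr(hinge, "decision_function"),
    "modified_huber": hasattr(huber, "decision_function"),
}

print(has_proba)
print(has_decision)
```

This is the same hasattr pattern the plotting example relies on, just applied to predict_proba instead of decision_function.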

Upvotes: 12

Ami Tavory

Reputation: 76297

The latter, predict_proba is a method of a (soft) classifier outputting the probability of the instance being in each of the classes.

The former, decision_function, finds the signed distance to the separating hyperplane. For example, an SVM classifier finds hyperplanes separating the space into areas associated with classification outcomes. This function, given a point, finds the distance to the separators; the sign indicates which side of the separator the point is on.

I'd guess that predict_proba is generally more useful in your case, since the other method is more specific to the algorithm.
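For a linear model the relationship between the two methods can be checked directly. The sketch below uses LogisticRegression on a toy binary dataset (both are my choices for illustration, not from the question). For a linear classifier the raw score is w·x + b, so the geometric distance to the hyperplane is that score divided by the norm of w; in binary logistic regression the positive-class probability is the sigmoid of the same score.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy binary dataset (an assumption for illustration only).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

# Raw scores as returned by the estimator.
scores = clf.decision_function(X)

# The same scores computed by hand: w . x + b.
manual_scores = X @ clf.coef_.ravel() + clf.intercept_[0]

# Signed geometric distance to the hyperplane: score / ||w||.
distances = scores / np.linalg.norm(clf.coef_)

# Probability of the positive class, from the estimator and by hand.
proba_pos = clf.predict_proba(X)[:, 1]
sigmoid_scores = 1.0 / (1.0 + np.exp(-scores))
```

So for this particular model, predict_proba is just a monotonic squashing of decision_function into [0, 1]; contour plots of either one show the same decision boundary, only with different level values.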

Upvotes: 44
