Soumya

Reputation: 431

How does the predict_proba method work for input samples X in sklearn's DecisionTreeClassifier?

DecisionTreeClassifier has a method predict_proba, which returns the class probabilities for an input data point X. How are these probabilities calculated for an already trained model?

Upvotes: 0

Views: 506

Answers (1)

Antoine Dubuis

Reputation: 5324

The predicted class probability is the fraction of training samples of that class in the leaf that the input sample falls into. For instance, if a leaf contains 10 samples labeled 1 and 90 samples labeled 0, the predicted probability of label 1 is 10%, as in this example:

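from sklearn.tree import DecisionTreeClassifier
import numpy as np

# All 100 samples are identical, so no split is possible and the tree is a single leaf
X = np.zeros((100, 1))
y = np.zeros((100, ))
y[-10:] = 1  # 90 samples of class 0, 10 samples of class 1
dtc = DecisionTreeClassifier(max_depth=1).fit(X, y)
dtc.predict_proba([[0]])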

which outputs:

array([[0.9, 0.1]])
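
To check the same rule on a tree that actually splits, one option is to recompute the leaf fractions by hand and compare them with predict_proba. The following is only a sketch: the synthetic data, the variable names (X_train, X_query, manual_proba) and the max_depth=2 setting are illustrative assumptions, not part of the question. It uses apply, which returns the index of the leaf each sample ends up in.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data (an assumption for this sketch)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# Leaf index reached by every training sample and by each query point
train_leaves = clf.apply(X_train)
X_query = np.array([[0.3, -1.0], [-2.0, 0.5]])
query_leaves = clf.apply(X_query)

# For each query point, take the training samples that share its leaf
# and compute the fraction belonging to each class
manual_proba = np.array([
    [np.mean(y_train[train_leaves == leaf] == c) for c in clf.classes_]
    for leaf in query_leaves
])

# True if the hand-computed fractions agree with predict_proba
matches = np.allclose(manual_proba, clf.predict_proba(X_query))
print(manual_proba)
print(matches)
```

This comparison assumes the model was fit without sample_weight or class_weight; with weights, the leaf fractions are weighted rather than plain counts.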

Upvotes: 1
