Reputation: 431
DecisionTreeClassifier has a method predict_proba, which returns the class probabilities for an input data point X. How are these probabilities calculated for an already trained model?
Upvotes: 0
Views: 506
Reputation: 5324
The predicted class probability is the fraction of training samples of that class in the leaf that the input falls into. For example, if a leaf contains 10 samples labelled 1 and 90 samples labelled 0, the predicted probability of label 1 is 10%, as in this example:
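from sklearn.tree import DecisionTreeClassifier
import numpy as np
# Every sample has the same feature value, so the tree cannot split them
X = np.zeros((100, 1))
# 90 samples of class 0 and 10 samples of class 1
y = np.zeros((100, ))
y[-10:] = 1
dtc = DecisionTreeClassifier(max_depth=1).fit(X, y)
# The single leaf holds all 100 training samples
dtc.predict_proba([[0]])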
which outputs:
array([[0.9, 0.1]])
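To check the same thing on a tree with more than one leaf, you can recompute the fractions by hand. The sketch below is only an illustration: the toy data and the helper name leaf_fractions are made up for this example, and it assumes the model was fitted without sample_weight or class_weight (with weights, the fractions are weighted rather than plain counts). It uses apply to find the leaf each sample ends up in, then counts the training labels in that leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up for illustration): class 1 is more common when x > 0.5
rng = np.random.RandomState(0)
X = rng.rand(200, 1)
y = (rng.rand(200) < np.where(X[:, 0] > 0.5, 0.8, 0.2)).astype(int)

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

def leaf_fractions(clf, X_train, y_train, X_query):
    """Hypothetical helper: class fractions of the training samples
    that share a leaf with each query point."""
    train_leaves = clf.apply(X_train)   # leaf index of every training sample
    query_leaves = clf.apply(X_query)   # leaf index of every query point
    out = np.zeros((len(X_query), len(clf.classes_)))
    for i, leaf in enumerate(query_leaves):
        labels = y_train[train_leaves == leaf]
        for j, c in enumerate(clf.classes_):
            out[i, j] = np.mean(labels == c)
    return out

X_query = np.array([[0.1], [0.9]])
manual = leaf_fractions(clf, X, y, X_query)

# The hand-computed fractions should match predict_proba
print(np.allclose(manual, clf.predict_proba(X_query)))
```

Counting labels from the training data, rather than reading the fitted tree's internal arrays, keeps the check independent of how a given scikit-learn version stores the per-node values.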
Upvotes: 1