Guy Adini

Reputation: 5494

Random Forests - Probability Estimates (+scikit-learn specific)

I am interested in understanding how probability estimates are calculated by random forests, both in general and specifically in Python's scikit-learn library (where probability estimates are returned by the predict_proba function).

Thanks, Guy

Upvotes: 7

Views: 5068

Answers (2)

smci

Reputation: 33950

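In addition to what Andreas/Dougal said, when you train the RF, turn on compute_importances=True (this flag exists only in older scikit-learn releases; newer ones compute importances on demand without it). Then inspect classifier.feature_importances_ to see which features contribute most to the splits in the RF's trees; features used high up in the trees tend to score highest.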
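A minimal sketch of inspecting the importances, assuming a recent scikit-learn release where feature_importances_ is available after fitting with no extra flag. The synthetic dataset and the variable names are just for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

classifier = RandomForestClassifier(n_estimators=25, random_state=0)
classifier.fit(X, y)

# One importance value per feature.
importances = classifier.feature_importances_

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print(f"feature {idx}: {importances[idx]:.3f}")
```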

Upvotes: 2

Andreas Mueller

Reputation: 28768

The probabilities returned by a forest are the mean probabilities returned by the trees in the ensemble (docs). The probabilities returned by a single tree are the normalized class histograms of the leaf a sample lands in.
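You can check both statements yourself. The sketch below assumes a single-output classification problem on synthetic data; the variable names are illustrative, and tree_.value is a low-level attribute whose scaling differs between scikit-learn versions, which is why it is normalized per row here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)
samples = X[:5]

# 1) Forest probabilities vs. the mean of the per-tree probabilities.
forest_proba = forest.predict_proba(samples)
tree_mean = np.mean(
    [tree.predict_proba(samples) for tree in forest.estimators_], axis=0
)
forest_matches = np.allclose(forest_proba, tree_mean)

# 2) Single-tree probabilities vs. the normalized class histogram
#    stored at the leaf each sample lands in.
first_tree = forest.estimators_[0]
leaf_ids = first_tree.apply(samples)               # leaf node index per sample
leaf_values = first_tree.tree_.value[leaf_ids, 0]  # per-class weights at those leaves
leaf_hist = leaf_values / leaf_values.sum(axis=1, keepdims=True)
tree_matches = np.allclose(first_tree.predict_proba(samples), leaf_hist)

print(forest_matches, tree_matches)
```

One consequence: with fully grown trees the leaves are often pure, so each tree outputs probabilities of 0 or 1 and the forest's estimate is close to the fraction of trees voting for each class.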

Upvotes: 13
