Reputation: 115
In scikit-learn's RandomForestClassifier, I can't find a setting that specifies how many samples each tree should be built from, that is, how large the randomly drawn subsets of the data used to build each tree should be.
I'm having trouble finding how many samples scikit-learn pulls by default. Does anyone know?
Upvotes: 3
Views: 375
Reputation: 13097
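I believe RandomForestClassifier builds each tree from a sample the same size as the entire training set. With the default `bootstrap=True`, that sample is drawn with replacement, so some rows appear more than once and others are left out; with `bootstrap=False`, every tree sees the full training set as-is. Typically, building each tree involves selecting the features with the most predictive power (the ones that produce the best split), and having more data makes that computation more accurate.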
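To make the sample size concrete, here is a minimal sketch of what drawing one bootstrap sample looks like. It uses only the standard library and is an illustration, not scikit-learn's actual implementation; `bootstrap_indices` is a hypothetical helper name. As far as I know, scikit-learn's default is `bootstrap=True` with a sample as large as the training set, and versions 0.22 and later add a `max_samples` parameter to shrink it.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)   # fixed seed so the run is repeatable
n_samples = 10_000       # pretend the training set has 10,000 rows

indices = bootstrap_indices(n_samples, rng)

# The sample is as large as the training set...
print(len(indices))

# ...but because rows are drawn with replacement, only a fraction of the
# distinct rows show up (this tends toward 1 - 1/e for large n_samples).
unique_fraction = len(set(indices)) / n_samples
print(round(unique_fraction, 3))
```

So each tree gets as many rows as the training set has, but only about 63% of the distinct rows, with the rest being repeats.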
Upvotes: 1