Reputation: 103
I am working on a machine learning task in Python. I need to build a random forest and then plot how the quality on the training and test samples depends on the number of trees in the forest. Is it necessary to build a new random forest from scratch for each number of trees? Or can I somehow add trees iteratively (if that is possible, could you give a code example of how to do it)?
Upvotes: 10
Views: 6112
Reputation: 6756
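You can use the warm_start parameter of RandomForestClassifier to do just that. With warm_start=True, each call to fit keeps the trees that are already trained and only fits the additional ones needed to reach the current n_estimators.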
Here's an example you can adapt to your specific needs:
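    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import log_loss

    # X_train, y_train, X_valid, y_valid are assumed to be defined already
    errors = []
    growing_rf = RandomForestClassifier(n_estimators=10, n_jobs=-1,
                                        warm_start=True, random_state=1514)
    for i in range(40):
        growing_rf.fit(X_train, y_train)  # trains only the trees added since the last fit
        growing_rf.n_estimators += 10     # request 10 more trees for the next fit
        errors.append(log_loss(y_valid, growing_rf.predict_proba(X_valid)))
    _ = plt.plot(errors, '-r')  # x axis is the iteration index (10 trees per step)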
Here's what I got:
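Since the question asks about both the training and the test samples, here is a minimal self-contained sketch that records both curves. The synthetic dataset from make_classification, the variable names (tree_counts, train_loss, test_loss) and the step of 10 trees are assumptions made for illustration, not part of the original answer; log loss is kept as the quality measure only because the answer above uses it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic data, used only so the sketch runs on its own (assumption).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# warm_start=True makes each fit() reuse the trees trained so far.
rf = RandomForestClassifier(warm_start=True, random_state=0)

tree_counts = list(range(10, 101, 10))  # 10, 20, ..., 100 trees
train_loss, test_loss = [], []
for n in tree_counts:
    rf.set_params(n_estimators=n)  # raise the target forest size
    rf.fit(X_train, y_train)       # fits only the newly requested trees
    train_loss.append(log_loss(y_train, rf.predict_proba(X_train)))
    test_loss.append(log_loss(y_test, rf.predict_proba(X_test)))
```

Plotting tree_counts against train_loss and test_loss (for example with two plt.plot calls) then gives both curves on one graph, with the actual number of trees on the x axis.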
Upvotes: 19