Reputation: 143
I was using sklearn's AdaBoostClassifier and plotted estimator_errors_, which the documentation describes as the classification error for each estimator in the boosted ensemble.
This is the plot:
I have a few questions: 1. Are these errors for the test set or the train set? 2. Why is the error 1 at estimator 30? 3. Is it an accumulated error?
Thank you.
My code:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC
base = LinearSVC(tol=1e-10, loss='hinge', C=1000, max_iter=50000)
ada = AdaBoostClassifier(base_estimator=base, algorithm='SAMME', n_estimators=n, random_state=10)
Upvotes: 2
Views: 1265
Reputation: 572
Sklearn's AdaBoostClassifier defaults to n_estimators=50; in your code it is set to n. However, the boosting process may terminate early, before all of the requested estimators are fitted. This may be dictated by one of the stopping conditions of the SAMME algorithm, for example a base estimator whose weighted error is 0 or is no better than random guessing.
In your case, based on the plot, it seems that boosting stops after 30 estimators. You can obtain the actual number of fitted estimators using:
len(estimator)
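where estimator is the fitted AdaBoostClassifier. As for which data the errors refer to: estimator_errors_ is filled in during fit, so each entry is the weighted classification error of that one estimator on the training set (it is not accumulated). Calling estimator.predict(X_test) afterwards does not change estimator.estimator_errors_, so it never reflects X_test. The array is initialized to ones with length n_estimators, so entries for boosting rounds that never ran stay at 1, which would explain the error of 1 around estimator 30 in your plot.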
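To check this yourself, here is a minimal sketch. It does not use your LinearSVC setup: the toy data, the variable names, and the choice of the default decision-stump base estimator are my own assumptions, picked so that boosting stops early in a predictable way (the first stump classifies everything correctly). As far as I can tell from the scikit-learn implementation, estimator_errors_ is written during fit only and rounds that never ran keep their initial value of 1.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Toy 1-D data (hypothetical) that a single decision stump, the default
# base estimator, separates perfectly, so boosting stops after round one.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.5], [2.5]])

ada = AdaBoostClassifier(n_estimators=5, random_state=10)
ada.fit(X_train, y_train)

# Snapshot the errors right after fit, then again after predicting.
errors_after_fit = ada.estimator_errors_.copy()
ada.predict(X_test)
errors_after_predict = ada.estimator_errors_.copy()

# len() of a fitted ensemble is the number of estimators actually trained.
n_fitted = len(ada)
print("requested:", ada.n_estimators, "fitted:", n_fitted)
print("errors after fit:    ", errors_after_fit)
print("errors after predict:", errors_after_predict)

# Only the first n_fitted entries are real training errors.
real_errors = ada.estimator_errors_[:n_fitted]
```

If you only want to plot meaningful values, slicing with [:len(ada)] as in the last line drops the placeholder entries. For a per-round error on held-out data, staged_score(X_test, y_test) is the method to look at rather than estimator_errors_.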
Hope that helps.
Upvotes: 3