rishai

Reputation: 515

catboost eval_metrics return value

I'm using CatBoostClassifier's eval_metrics to compute some metrics on my test set, and I'm confused about its output. For a given metric, by default, it seems to return an array whose length equals the number of iterations.

This seems to be inconsistent with the predict function, which returns a single value only. Which number in the array returned by eval_metrics is consistent with the predict function?

I checked the documentation at https://catboost.ai/docs/concepts/python-reference_catboostclassifier_eval-metrics.html#python-reference_catboostclassifier_eval-metrics__output-format, but it's still not clear to me.

Upvotes: 1

Views: 822

Answers (1)

Ranika Nisal

Reputation: 980

The CatBoost classifier is an ensemble classifier that uses boosting. Simply put, boosting algorithms iteratively train weak learners (decision trees in this case): each new tree tries to correct the collective errors made by the previous trees. CatBoost is based on gradient boosting, which I won't delve too deep into here. What is relevant is that a number of weak trees are generated in the process, and when you call the eval_metrics() method you get the metric evaluated at each of those trees, i.e. for each growing prefix of the ensemble. You specified the number of trees via iterations, num_boost_round, n_estimators or num_trees when creating the model (if not specified, it defaults to 1000).

The other arguments you can pass to eval_metrics() define the range of trees evaluated, from ntree_start to ntree_end at intervals of eval_period. If these aren't provided, you get the metric for every prefix of the ensemble, which is why you get a list of values. To answer your question directly: the last value in that list is computed using all of the trees, i.e. the same full ensemble that predict() uses, so the last entry is the one consistent with predict().

Upvotes: 2

Related Questions