Reputation: 63
I've created a benchmark object for a binary classification task with heavy class imbalance in the training data and added a few classification learners to it, including the featureless learner. I then obtained the benchmark results using the measure "classif.ce" to compare the learners' performance.
I note that the classif.ce value for the featureless learner is exactly equal to the ratio of the 'truth' values to the total number of observations in my task, which makes sense. I've got a few other learners that achieve a much lower "classif.ce" than the featureless learner (call them better_lrns).
My question, or rather the clarification I'm seeking, is: can I be satisfied that better_lrns are indeed better than the featureless learner despite the heavy class imbalance?
Since it's not clear from the benchmark method which "Level" the featureless classifier uses (see here: https://mlr3.mlr-org.com/reference/mlr_learners_classif.featureless.html), it would be great if someone could confirm. If the classif.ce for the featureless learner is "negative class / total observations", then my results are good; otherwise I'll need to dig deeper.
Upvotes: 0
Views: 70
Reputation: 2829
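By definition, the featureless learner (with its default method, "mode") always predicts the most frequent class, so its accuracy equals the fraction of the most frequent class in the classification task. Its classif.ce is therefore one minus that fraction, which in a binary task is the fraction of the less frequent class. Because it is a naive classifier that ignores all features, it serves as a baseline for the performance of the other learners.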
Hence, I don't think you need to dig deeper because of this.
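To make the arithmetic concrete, here is a minimal sketch in plain Python. It is not mlr3 code: the 90/10 class split, the label names "neg" and "pos", and the helper functions are all made up for illustration, and it simply assumes the baseline predicts the most frequent label for every row.

```python
from collections import Counter

def featureless_predict(train_labels, n):
    """Hypothetical baseline: predict the most frequent training label n times."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n

def classification_error(truth, predicted):
    """Fraction of observations whose predicted label differs from the truth."""
    return sum(t != p for t, p in zip(truth, predicted)) / len(truth)

# Made-up imbalanced data: 90 "neg" and 10 "pos" observations.
truth = ["neg"] * 90 + ["pos"] * 10

baseline = featureless_predict(truth, len(truth))
baseline_ce = classification_error(truth, baseline)

# The baseline is wrong exactly on the minority class, so its error
# equals the minority-class fraction.
minority_fraction = truth.count("pos") / len(truth)
print(baseline_ce, minority_fraction)
```

Under these assumptions, any learner whose classification error is below the minority-class fraction is doing better than always guessing the majority class.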
Upvotes: 0