How to interpret feature importance for ensemble methods?

Question

I'm using ensemble methods (random forest, XGBClassifier, etc.) for classification.

One important output is the feature importance ranking, which looks like this:

           Importance
Feature-A   0.25
Feature-B   0.09
Feature-C   0.08
.......

This model achieves an accuracy score of around 0.85. Feature-A is clearly the dominant feature, so I removed it and retrained the model.

However, after removing Feature-A, the model still performed well, with an accuracy of around 0.79.
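
For concreteness, the experiment can be sketched roughly as below. This is only an illustration under assumptions: it uses scikit-learn's RandomForestClassifier on synthetic data from make_classification (not my actual dataset), and the helper name fit_and_score and all parameter values are made up for the example, so the numbers it prints will not match the ones above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def fit_and_score(X_train, X_test, y_train, y_test, seed=0):
    """Hypothetical helper: fit a random forest and return (model, test accuracy)."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))


# Synthetic stand-in data (assumption): some informative features plus
# redundant ones that are linear combinations of the informative ones.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=4,
    n_redundant=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: train on all features and read the impurity-based importances.
full_model, full_acc = fit_and_score(X_train, X_test, y_train, y_test)
importances = full_model.feature_importances_
top = int(np.argmax(importances))  # column index of the most important feature

# Step 2: drop that column and retrain from scratch.
X_train_reduced = np.delete(X_train, top, axis=1)
X_test_reduced = np.delete(X_test, top, axis=1)
reduced_model, reduced_acc = fit_and_score(
    X_train_reduced, X_test_reduced, y_train, y_test
)

print(f"top feature index: {top}, importance: {importances[top]:.3f}")
print(f"accuracy with all features:    {full_acc:.3f}")
print(f"accuracy without top feature:  {reduced_acc:.3f}")
```

The importances reported by feature_importances_ are normalized to sum to 1 across the features of one fitted model, so the retrained model redistributes importance over the remaining columns rather than simply losing the removed share.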

This doesn't make sense to me: if Feature-A accounts for 25% of the importance in the model, why is the accuracy score barely affected when it is removed?

I know ensemble methods have the advantage of combining 'weak' features into 'strong' ones. Does that mean the accuracy score relies mostly on aggregation and is therefore less sensitive to the removal of an important feature?

Thanks
