Reputation: 5182
On data with a few features, I train a random forest for regression and also gradient boosted regression trees. For both I calculate the feature importances, and I see that they are rather different, even though the two models achieve similar scores.
For the random forest regression:
MAE: 59.11
RMSE: 89.11
Importance:
Feature 1: 64.87
Feature 2: 0.10
Feature 3: 29.03
Feature 4: 0.09
Feature 5: 5.89
For the gradient boosted regression trees:
MAE: 58.70
RMSE: 90.59
Importance:
Feature 1: 65.18
Feature 2: 5.67
Feature 3: 13.61
Feature 4: 4.26
Feature 5: 11.27
Why is this? I thought it might be because the trees in gradient boosted regression trees are shallower than those in a random forest, but I am not sure.
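For reference, here is a minimal sketch of the kind of setup I mean; make_regression stands in for my actual data and the parameters are placeholders, not my real settings:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical data with 5 features, standing in for the real dataset
X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
gbrt = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

for name, model in [("random forest", rf), ("gradient boosting", gbrt)]:
    pred = model.predict(X_test)
    print(name,
          "MAE:", round(mean_absolute_error(y_test, pred), 2),
          "RMSE:", round(np.sqrt(mean_squared_error(y_test, pred)), 2))
    print("  importances:", np.round(model.feature_importances_, 3))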
Upvotes: 0
Views: 7369
Reputation: 40973
Though they are both tree-based, they are still different algorithms, so each calculates the feature importances differently. Here is the relevant code:
scikit-learn/sklearn/ensemble/gradient_boosting.py
def feature_importances_(self):
    total_sum = np.zeros((self.n_features, ), dtype=np.float64)
    for stage in self.estimators_:
        stage_sum = sum(tree.feature_importances_
                        for tree in stage) / len(stage)
        total_sum += stage_sum

    importances = total_sum / len(self.estimators_)
    return importances
scikit-learn/sklearn/ensemble/forest.py
def feature_importances_(self):
    all_importances = Parallel(n_jobs=self.n_jobs, backend="threading")(
        delayed(getattr)(tree, 'feature_importances_')
        for tree in self.estimators_)

    return sum(all_importances) / self.n_estimators
So, different trees and different ways to combine the trees.
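To make the difference concrete, here is a small sketch that recomputes both importances by hand, following the two snippets above. It assumes a fitted RandomForestRegressor rf and GradientBoostingRegressor gbrt (hypothetical names, e.g. the models from the question); note that newer scikit-learn releases renormalize these averages, so the values may not match the feature_importances_ attribute exactly there.

import numpy as np

# GBRT: estimators_ is a 2D array of trees (one row per boosting stage);
# average within each stage, then across stages.
gbrt_imp = sum(
    sum(tree.feature_importances_ for tree in stage) / len(stage)
    for stage in gbrt.estimators_
) / len(gbrt.estimators_)

# Random forest: estimators_ is a flat list of trees; just average them.
rf_imp = sum(tree.feature_importances_ for tree in rf.estimators_) / rf.n_estimators

# Under the (older) implementations quoted above these match the attributes.
print(np.round(gbrt_imp, 3), np.round(gbrt.feature_importances_, 3))
print(np.round(rf_imp, 3), np.round(rf.feature_importances_, 3))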
Upvotes: 2