Relation between coefficients in linear regression and feature importance in decision trees

Question

I recently started a machine learning (ML) project that needs to identify which features (inputs a1, a2, a3, ..., an) have a large impact on the target/output.

I used linear regression to get the coefficient of each feature, and a decision-tree-based algorithm (for example, Random Forest Regressor) to get the feature importances.

Is my understanding right that a feature with a large coefficient in linear regression should also be near the top of the feature importance list from the decision tree algorithm?

Ahmed Ragab · Accepted Answer

Not really. If your input features are not normalized, the size of each coefficient depends on the scale (mean/std) of its feature, so a relatively big coefficient can be an artifact of the feature's units rather than a sign of a big impact. If your features are normalized, then yes, coefficient size could be an indicator of feature importance, but there are still other things to consider.
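To illustrate the scaling point, here is a minimal sketch on synthetic data. The feature names `a1` and `a2`, their scales, and the coefficients used to generate `y` are all made up for the example; it only assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000

# Two hypothetical features on very different scales (assumed for illustration).
a1 = rng.normal(0.0, 1000.0, n)  # large standard deviation
a2 = rng.normal(0.0, 1.0, n)     # small standard deviation
X = np.column_stack([a1, a2])

# a1 drives most of the variation in y, even though its coefficient is tiny.
y = 0.01 * a1 + 2.0 * a2 + rng.normal(0.0, 0.1, n)

# Coefficients on the raw features: a2 gets the larger coefficient.
raw_coef = LinearRegression().fit(X, y).coef_

# Coefficients after standardizing each feature to zero mean and unit variance:
# now a1 gets the larger coefficient.
X_std = StandardScaler().fit_transform(X)
std_coef = LinearRegression().fit(X_std, y).coef_

# Impurity-based importances from a random forest fitted on the raw features.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
rf_importance = forest.feature_importances_

print("raw coefficients:         ", raw_coef)
print("standardized coefficients:", std_coef)
print("random forest importances:", rf_importance)
```

In this toy setup the raw coefficients rank `a2` first, while the standardized coefficients and the forest importances both rank `a1` first. Agreement after scaling is not guaranteed on real data: correlated features and non-linear effects can still make the two rankings differ.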

You could also try some of sklearn's feature selection classes, which should do this automatically for you.
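As one possible example of such a class, the sketch below uses `SelectFromModel` from `sklearn.feature_selection` with a random forest. The answer does not say which class was meant, so this choice, the synthetic dataset, and the `"median"` threshold are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

# Synthetic regression data: 10 features, of which 3 are informative (assumed setup).
X, y = make_regression(
    n_samples=500, n_features=10, n_informative=3, noise=1.0, random_state=0
)

# Keep the features whose forest importance is at or above the median importance.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="median",
)
selector.fit(X, y)

mask = selector.get_support()       # boolean mask of the selected features
X_selected = selector.transform(X)  # data restricted to the selected columns

print("selected feature indices:", [i for i, keep in enumerate(mask) if keep])
print("reduced shape:", X_selected.shape)
```

Other selectors in the same module, such as `SelectKBest` or `RFE`, follow the same `fit` / `get_support` / `transform` pattern, so the estimator or scoring function can be swapped to suit the problem.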
