Reputation: 3716

How are "feature_importances_" ordered in Scikit-learn's RandomForestRegressor

If I fit a model (called clf in this case), I get output that looks like this. How can I tie these values back to the input features that were used to train the model?

>>> clf.feature_importances_

array([ 0.01621506,  0.18275428,  0.09963659,... ])

Upvotes: 11

Answers (3)

Ahmed Taha Hagag

Reputation: 27

The importances are in the same order as the feature columns of your training data set.

You can display each importance score next to its corresponding feature name as follows:

attributes = list(your_data_set)

sorted(zip(clf.feature_importances_, attributes), reverse=True)

The output would look something like this (largest importance first, because of reverse=True):

[(0.18275428, 'feature3'),
(0.09963659, 'feature2'),
(0.01621506, 'feature1'),
...
...

Upvotes: 0

Abhishek Parida

Reputation: 35

You can store the result in a pandas DataFrame indexed by the column names of x:

pandas.DataFrame({'col_name': clf.feature_importances_}, index=x.columns).sort_values(by='col_name', ascending=False)

Sorting in descending order puts the most important features at the top.
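For a self-contained illustration of the same idea, here is a minimal sketch using a pandas Series instead of a one-column DataFrame. The data set is synthetic and the column names (noise_a, signal, noise_b) are made up for the example; only "signal" influences the target, so it should come out on top.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (hypothetical names): only "signal" drives the target.
rng = np.random.default_rng(0)
x = pd.DataFrame({
    "noise_a": rng.normal(size=300),
    "signal": rng.normal(size=300),
    "noise_b": rng.normal(size=300),
})
y = 10.0 * x["signal"] + rng.normal(scale=0.1, size=300)

clf = RandomForestRegressor(n_estimators=50, random_state=0).fit(x, y)

# Label each importance with the column at the same position, then sort.
importances = pd.Series(clf.feature_importances_, index=x.columns, name="importance")
ranked = importances.sort_values(ascending=False)
print(ranked)
```

A Series keeps the name and the score together, so ranked.head(10) or ranked.plot.barh() work directly on the result.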

Upvotes: 3

Krishan Gupta

Reputation: 3716

As mentioned in the comments, it looks like the order of the feature importances is the order of the columns in the "x" input variable (which I've converted from Pandas to a Python native data structure). I use this code to generate a list of tuples that look like this: (feature_name, feature_importance). In Python 3, zip returns an iterator, so wrap it in list() to see the pairs.

list(zip(x.columns, clf.feature_importances_))
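One way to convince yourself of the ordering is to fit on the same data with the columns shuffled and check where the dominant feature lands. The sketch below assumes a synthetic DataFrame with made-up columns a, b, c, where the target depends on "b" only; top_position is a hypothetical helper written for this example, not part of scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
x = pd.DataFrame(rng.normal(size=(300, 3)), columns=["a", "b", "c"])
y = 5.0 * x["b"]  # the target depends on column "b" only


def top_position(frame):
    """Fit a forest on `frame` and return the index of the largest importance."""
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(frame, y)
    return int(np.argmax(model.feature_importances_))


pos_original = top_position(x)                    # "b" is the column at position 1
pos_reordered = top_position(x[["b", "c", "a"]])  # "b" is the column at position 0
print(pos_original, pos_reordered)
```

The position of the largest importance moves with the column, which is why zipping x.columns with clf.feature_importances_ gives the right pairs as long as x is the same frame, in the same column order, that was passed to fit.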

Upvotes: 17
