How to plot feature_importance for DecisionTreeClassifier?

Question

I need to plot feature_importances for DecisionTreeClassifier. Features are already found and target results are achieved, but my teacher tells me to plot feature_importances to see weights of contributing factors. I have no idea how to do it.

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(random_state=12345, max_depth=8, class_weight='balanced')
model.fit(features_train, target_train)
model.feature_importances_

It gives me:

array([0.02927077, 0.3551379 , 0.01647181, ..., 0.03705096, 0.        ,
       0.01626676])

Why is it not attached to anything, like max_depth, and is just an array of numbers?

Ailurophile · Accepted Answer

Feature importance refers to a class of techniques that assign scores to the input features of a predictive model, indicating the relative importance of each feature when making a prediction.

Feature importance scores can be calculated for problems that involve predicting a numerical value, called regression, and those problems that involve predicting a class label, called classification.

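model.feature_importances_ is a plain NumPy array with one value per feature, in the same order as the columns of the data you passed to fit. To see which number belongs to which feature, load the importances into a pandas Series indexed by your training column names, then use its plot method.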

From the scikit-learn documentation (written for forests of trees):

Feature importances are provided by the fitted attribute feature_importances_ and they are computed as the mean and standard deviation of accumulation of the impurity decrease within each tree.
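To make the quoted sentence concrete, here is a minimal sketch of how the "mean and standard deviation" part applies to a forest, while a single DecisionTreeClassifier simply reports its own normalized impurity decrease. The iris dataset, the number of trees, and the variable names are my own choices for illustration, not something from the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative data only: any numeric feature matrix works the same way
X, y = load_iris(return_X_y=True)

# A single tree: one importance per feature, normalized to sum to 1
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.feature_importances_)

# A forest: collect the importances of every individual tree
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
per_tree = np.array([est.feature_importances_ for est in forest.estimators_])

# Mean across trees (compared with forest.feature_importances_ below)
# and standard deviation across trees (usable as error bars in a plot)
mean_importance = per_tree.mean(axis=0)
std_importance = per_tree.std(axis=0)
print(np.allclose(mean_importance, forest.feature_importances_))
```

With a single decision tree there is nothing to average over, so only the first part is relevant to the question.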

How are feature_importances in RandomForestClassifier determined?

For your example:

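import pandas as pd
# Assumes features_train is a DataFrame; its columns are in the same order as the importances
feat_importances = pd.Series(model.feature_importances_, index=features_train.columns)
feat_importances.nlargest(5).plot(kind='barh')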
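If you want something you can run end to end, the following is a rough sketch of the same idea. It assumes a stand-in dataset (scikit-learn's breast cancer data) in place of your own, reuses your variable names and model settings, and the output file name is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): replace with your own features and target
data = load_breast_cancer(as_frame=True)
features_train, features_test, target_train, target_test = train_test_split(
    data.data, data.target, random_state=12345
)

model = DecisionTreeClassifier(random_state=12345, max_depth=8, class_weight="balanced")
model.fit(features_train, target_train)

# Attach a feature name to every importance value
feat_importances = pd.Series(model.feature_importances_, index=features_train.columns)

# Keep the five largest and sort ascending so the biggest bar ends up on top
top5 = feat_importances.nlargest(5).sort_values()
ax = top5.plot(kind="barh")
ax.set_xlabel("Impurity-based importance")
ax.set_title("Top 5 features")
plt.tight_layout()
plt.savefig("feature_importances.png")  # or plt.show() in an interactive session
```

Dropping the nlargest(5) call plots every feature, which can be more useful when your teacher wants to see the weights of all contributing factors.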

More ways to plot feature importances: Random Forest Feature Importance Chart using Python
