Reputation: 73
I've currently got a decision tree displaying the feature names as X[index], i.e. X[0], X[1], X[2], etc.
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
# plot tree
plt.figure(figsize=(20, 16))  # set plot size (in inches)
tree.plot_tree(dt, fontsize=10)
I'm looking to replace these X[featureNumber] labels with the actual feature names, so instead of displaying X[0], I would want it to display the feature name returned by X.columns.values[0] (I don't know if this code is correct).
I'm also aware there is an easy way of doing this using graphviz, but for some reason I can't get graphviz running in Jupyter, so I'm looking for a way of doing it without.
Photo of current decision tree:
Upvotes: 6
Views: 17120
Reputation: 44838
This is explained in the documentation:
sklearn.tree.plot_tree(decision_tree, *, max_depth=None, feature_names=None, class_names=None, label='all', filled=False, impurity=True, node_ids=False, proportion=False, rotate='deprecated', rounded=False, precision=3, ax=None, fontsize=None)
feature_names : list of strings, default=None
Names of each of the features. If None, generic names will be used (“X[0]”, “X[1]”, …).
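So passing the column names through feature_names should be enough, with no graphviz needed. A minimal sketch, assuming the training data is a pandas DataFrame; the iris dataset, max_depth=2, and random_state=0 are stand-ins chosen only to make the example self-contained, and are not from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; omit in Jupyter
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: stand-in data; replace with your own X_train (a DataFrame) and y_train.
iris = load_iris(as_frame=True)
X_train, y_train = iris.data, iris.target

dt = DecisionTreeClassifier(max_depth=2, random_state=0)
dt.fit(X_train, y_train)

# Pass the DataFrame's column names so split nodes show them instead of X[i].
fig, ax = plt.subplots(figsize=(20, 16))  # plot size in inches
annotations = tree.plot_tree(
    dt,
    feature_names=list(X_train.columns),
    fontsize=10,
    ax=ax,
)
```

With the code from the question, this would likely amount to tree.plot_tree(dt, feature_names=list(X_train.columns), fontsize=10), provided X_train still has its column labels (a NumPy array would not).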
Upvotes: 9