Python - Graphviz - Remove legend on nodes of DecisionTreeClassifier

Question

I have a decision tree classifier from sklearn and I use pydotplus to display it. However, for my presentation I don't like having so much information on each node (entropy, samples and value).

To make it easier to explain to people, I would like to keep only the decision and the class on each node. Where can I modify the code to do that?

Thank you.

E.Z · Accepted Answer

According to the documentation, export_graphviz has no option to leave out the additional information inside the boxes. The only item you can omit directly is the impurity, via the impurity parameter.

However, I did it another, more explicit way, which is somewhat hacky. First, I save the .dot file with impurity set to False. Then I load it and convert it to a string. Finally, I use regular expressions to remove the redundant labels and save the result again.

The code goes like this:

import re

import pydotplus  # install it via pip install pydotplus
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
from sklearn.datasets import load_iris

data = load_iris()
clf = DecisionTreeClassifier()
clf.fit(data.data, data.target)

PATH = '/path/to/dotfile/tree.dot'

# impurity=False already drops the gini/entropy line from every node
export_graphviz(clf, out_file=PATH, impurity=False, class_names=True)

# Load the .dot file back as a string
f = pydotplus.graph_from_dot_file(PATH).to_string()

# Labels contain a literal backslash-n, hence \\n in the raw patterns.
# First pattern: internal nodes; second: leaves, whose label starts with "samples".
f = re.sub(r'(\\nsamples = [0-9]+)(\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])', '', f)
f = re.sub(r'(samples = [0-9]+)(\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])\\n', '', f)

with open('tree_modified.dot', 'w') as file:
    file.write(f)

Here are the images before and after modification:

In your case, there seem to be more parameters in the boxes (entropy, for example), so you may need to tweak the regular expressions a little.
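If the patterns above do not match your tree (a different number of classes, or an extra entropy line), one possible tweak is to filter the label parts by name instead of matching their exact values. The following is only a sketch: strip_node_details and DROP_FIELDS are hypothetical names, not part of sklearn or pydotplus, and it assumes the default export_graphviz label format, where the parts of a label are separated by a literal backslash-n and the tree has a single output.

```python
import re

# Hypothetical helper, not part of sklearn or pydotplus.
# "gini" and "entropy" cover the impurity line; adjust the tuple as needed.
DROP_FIELDS = ("gini", "entropy", "samples", "value")


def strip_node_details(dot_source, drop_fields=DROP_FIELDS):
    """Drop unwanted 'name = ...' parts from every node label in a DOT string."""

    def clean(match):
        # Label parts are separated by a literal backslash followed by 'n'.
        parts = match.group(1).split(r"\n")
        # Keep a part unless the text before " = " is one of the dropped names.
        kept = [p for p in parts if p.split(" = ")[0] not in drop_fields]
        return 'label="' + r"\n".join(kept) + '"'

    # \b keeps the pattern from matching inside edge attributes like headlabel.
    return re.sub(r'\blabel="([^"]*)"', clean, dot_source)
```

The function works on any DOT string, so it can be applied to the string f from the code above in place of the two re.sub calls, or to the string that export_graphviz returns when out_file=None.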

I hope that helps!
