DionysoSong

Reputation: 807

Python - Graphviz - Remove legend on nodes of DecisionTreeClassifier

I have a decision tree classifier from sklearn and I use pydotplus to display it. However, for my presentation I don't like having so much information in each node (entropy, samples and value).

To make the tree easier to explain to people, I would like to keep only the decision and the class in each node. Where can I modify the code to do that?

Thank you.

Upvotes: 3

Views: 2590

Answers (1)

E.Z

Reputation: 1998

According to the documentation, it is not possible to leave out the additional information inside the boxes entirely. The only thing you can omit through a parameter is the impurity (impurity=False).

However, I have done it explicitly, in a somewhat hacky way. First, I save the .dot file with impurity set to False. Then I load it and convert it to a string, use regular expressions to strip the redundant labels, and save the result to a new file.

The code goes like this:

import re

import pydotplus  # install via: pip install pydotplus
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
from sklearn.datasets import load_iris

data = load_iris()
clf = DecisionTreeClassifier()
clf.fit(data.data, data.target)

# Export without the impurity line; class_names=True adds a "class = ..." line
export_graphviz(clf, out_file='tree.dot', impurity=False, class_names=True)

PATH = 'tree.dot'  # path to the .dot file written above
f = pydotplus.graph_from_dot_file(PATH).to_string()
# Internal nodes: "samples" and "value" follow the split condition
f = re.sub(r'(\\nsamples = [0-9]+)(\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])', '', f)
# Leaves: "samples" is the first line of the label
f = re.sub(r'(samples = [0-9]+)(\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])\\n', '', f)

with open('tree_modified.dot', 'w') as file:
    file.write(f)

Here are the images before and after modification:

[image: tree before modification] [image: tree after modification]

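In your case, there seem to be more parameters in the boxes (entropy, for example), and the value pattern above is written for exactly three classes, so you may want to tweak the regular expressions a little.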
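One possible way to make that tweak less dependent on the number of classes is to filter label lines by their key instead of matching the exact value format. The following is only a sketch: the function name strip_node_details and its drop parameter are made up for illustration, and it assumes node labels are plain quoted strings whose lines are separated by a literal \n, as in the .dot text handled above. HTML-style labels or labels containing escaped quotes are not handled.

```python
import re


def strip_node_details(dot_source, drop=("gini", "entropy", "samples", "value")):
    """Remove 'key = ...' lines from every quoted node label in a DOT string.

    Hypothetical helper: assumes labels look like
    label="condition\\nsamples = 150\\nvalue = [...]\\nclass = y[0]".
    """
    prefixes = tuple(key + " = " for key in drop)

    def clean(match):
        # DOT encodes a line break inside a label as a literal backslash + n
        parts = match.group(1).split("\\n")
        kept = [part for part in parts if not part.startswith(prefixes)]
        return 'label="' + "\\n".join(kept) + '"'

    # \b keeps edge attributes such as headlabel="True" untouched
    return re.sub(r'\blabel="([^"]*)"', clean, dot_source)
```

With a helper like this, the two re.sub calls could be replaced by f = strip_node_details(f) before writing tree_modified.dot. Passing a different drop tuple keeps or removes other lines.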

I hope that helps!

Upvotes: 7
