Reputation: 3096
SITUATION
When I plot a tree with xgboost.plot_tree, the graph shows only empty characters/boxes/blocks instead of the titles, labels and numbers. I use more than 400 features, so that may be a contributing factor.
CODE 1
fig, ax = plt.subplots(figsize=(170, 170))
plot_tree(xgbmodel, ax=ax)
plt.savefig("temp.pdf")
plt.show()
CODE 2
plot_tree(xgbmodel, num_trees=2)
fig = plt.gcf()
fig.set_size_inches(150, 100)
fig.savefig('tree.png')
ERROR
SOLUTIONS I HAVE TRIED
model.get_booster().get_dump(dump_format='text')
printed out a bit more than 200,000 characters (about 63 A4 pages in 11 pt Calibri), and the output looks perfectly correct, e.g.: 0.0268656723\n\t\t\t\t\t34:[f0<6.5] yes=53,no=54,missing=53\n\t\t\t\t\t\
Is it possible that I have this issue because that much text cannot be displayed in a graph of such a normal size?
Upvotes: 2
Views: 1381
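To judge whether a single tree is really too large to draw legibly, the text dump from the question can be summarized instead of read page by page. The following is only a minimal sketch: the helper name summarize_tree_dump and the sample string are made up for illustration, and it assumes the usual text dump layout in which split nodes look like `34:[f0<6.5] yes=53,no=54,missing=53`, leaves look like `35:leaf=0.0268656723`, and each tab of indentation is one level of depth.

```python
import re

# Assumed line shapes of the text dump (see the lead-in):
#   split node: "<id>:[<feature><<threshold>] yes=..,no=..,missing=.."
#   leaf node:  "<id>:leaf=<value>"
SPLIT_RE = re.compile(r"^\s*(\d+):\[([^<\]]+)<([^\]]+)\]")
LEAF_RE = re.compile(r"^\s*(\d+):leaf=")


def summarize_tree_dump(dump):
    """Count split nodes, leaves, depth and distinct features of one dumped tree."""
    splits = 0
    leaves = 0
    max_depth = 0
    features = set()
    for line in dump.splitlines():
        if not line.strip():
            continue
        # Depth is taken from the number of leading tab characters.
        depth = len(line) - len(line.lstrip("\t"))
        split_match = SPLIT_RE.match(line)
        if split_match:
            splits += 1
            features.add(split_match.group(2))
            max_depth = max(max_depth, depth)
        elif LEAF_RE.match(line):
            leaves += 1
            max_depth = max(max_depth, depth)
    return {
        "splits": splits,
        "leaves": leaves,
        "max_depth": max_depth,
        "features": sorted(features),
    }


# Hypothetical miniature dump, only to show the input format.
sample_dump = (
    "0:[f0<6.5] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.0268656723\n"
    "\t2:[f3<1.5] yes=3,no=4,missing=3\n"
    "\t\t3:leaf=-0.1\n"
    "\t\t4:leaf=0.2\n"
)
summary = summarize_tree_dump(sample_dump)
print(summary)
```

With the real model, get_dump returns one string per tree, so something like `[summarize_tree_dump(d) for d in model.get_booster().get_dump(dump_format='text')]` should give the size of every tree. A tree with only a few dozen nodes that still renders as empty boxes would point away from "too much text" and towards a rendering or font problem.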
Reputation: 3096
I moved my whole environment from an AWS EC2 instance to a local machine, and there it ran perfectly. The EC2 instance showed some other weird behaviour as well, e.g. it did not allow me to use extensions in Jupyter Lab. Both machines run Ubuntu 20.04 LTS.
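Since the same code behaved differently on two machines, it may help to compare what the renderer sees on each of them. plot_tree draws the tree with Graphviz and only displays the resulting image through matplotlib, so the labels are produced by Graphviz with whatever fonts the system offers. A minimal sketch for collecting those facts follows; the function name rendering_environment is made up here, and the idea that the EC2 instance had no usable fonts is an assumption that this thread does not confirm.

```python
import shutil
import subprocess


def rendering_environment():
    """Collect basic facts about the Graphviz and font setup of this machine."""
    info = {"dot_path": shutil.which("dot"), "dot_version": None, "font_count": None}

    if info["dot_path"]:
        # `dot -V` reports its version on stderr.
        proc = subprocess.run(
            [info["dot_path"], "-V"], capture_output=True, text=True, timeout=30
        )
        info["dot_version"] = (proc.stderr or proc.stdout).strip() or None

    fc_list = shutil.which("fc-list")
    if fc_list:
        # fc-list (part of fontconfig) prints one installed font per line.
        proc = subprocess.run([fc_list], capture_output=True, text=True, timeout=30)
        info["font_count"] = len([ln for ln in proc.stdout.splitlines() if ln.strip()])

    return info


env_info = rendering_environment()
print(env_info)
```

Running this on both machines and comparing the output is the point: a missing `dot`, or a font count of 0 or None on the machine that shows the boxes, would be a strong hint, while identical output would suggest looking elsewhere.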
Upvotes: 1
Reputation: 26690
I wasn't able to reproduce your error. Can you please add more details to your question and confirm that this code works? link to pima-indians-diabetes.csv
#!/usr/bin/env python3
# plot decision tree
from numpy import loadtxt
from xgboost import XGBClassifier
from xgboost import plot_tree
import matplotlib.pyplot as plt
import graphviz
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
y = dataset[:,8]
# fit model on training data
model = XGBClassifier()
model.fit(X, y)
# plot/save fig
fig, ax = plt.subplots(figsize=(170, 170))
plot_tree(model, ax=ax)
plt.savefig("test.pdf")
I can't reproduce this issue/error. No matter which package version, character encoding, line endings, etc. I use, my notebook always renders the text correctly. The only thing I can suggest is creating a fresh virtual environment (e.g. with miniconda) with current versions of the required packages (conda install notebook numpy matplotlib xgboost graphviz python-graphviz) and testing it again.
Also, make sure you don't have Windows line endings (see: Matplotlib plotting some characters as blank square / https://github.com/jupyterlab/jupyterlab/issues/1104 / https://github.com/jupyterlab/jupyterlab/issues/3718 / https://github.com/jupyterlab/jupyterlab/pull/3882 ) and specify the font you are using (e.g. How to change fonts in matplotlib (python)?):
# plot decision tree
from numpy import loadtxt
from xgboost import XGBClassifier
from xgboost import plot_tree
from matplotlib.font_manager import FontProperties
import matplotlib.pyplot as plt
import graphviz
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
y = dataset[:,8]
# fit model on training data
model = XGBClassifier()
model.fit(X, y)
# plot/save fig
prop = FontProperties()
prop.set_file('Arial.ttf')
fig, ax = plt.subplots(figsize=(170, 170))
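# extra keyword arguments of plot_tree are forwarded to xgboost.to_graphviz;
# whether fontproperties is honoured there may depend on the xgboost version
plot_tree(model, ax=ax, fontproperties=prop)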
plt.savefig("test.png")
fig.show()
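The line-ending check suggested above can be done without opening the file in an editor. This is just a rough sketch: count_line_endings is a name invented for this example, and the temporary sample.csv only stands in for whatever data file or script you actually want to inspect.

```python
import tempfile
from pathlib import Path


def count_line_endings(path):
    """Count Windows (CRLF), Unix (LF) and bare CR line endings in a file."""
    data = Path(path).read_bytes()
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LF bytes that are not part of a CRLF pair
    cr = data.count(b"\r") - crlf  # CR bytes that are not part of a CRLF pair
    return {"crlf": crlf, "lf": lf, "cr": cr}


# Demo on a throwaway file with two Windows line endings and one Unix one.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_bytes(b"1,2\r\n3,4\r\n5,6\n")
    sample_counts = count_line_endings(sample)

print(sample_counts)
```

Any non-zero "crlf" count for a file that is supposed to be Unix-style means it was saved with Windows line endings at some point.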
Upvotes: 1