Reputation: 13
I made my own sort-of "decision tree" to find the optimal segmentation of a set of transactions. It uses a t-test and a recursive function that, at each step, splits the data in two on the most significant segmentation variable among the candidate variables available at that point.
The output is this:
('Blue', (('Mid', (42.0, ('Low', (11.64, ('High', (13.55, ('Very Low', (0.0, ('Very High', (3.29, 6.25)))))))))), ('Yellow', (('Mid', (44.39, ('Mid High', (31.61, 13.25)))), ('Mid Low', (47.89, ('Mid', (16.36, ('Very Low', (0.24, ('Low', (6.23, ('Red', (('Mid High', (1.15, ('JA', (0.0, ('Very High', (3.91, ('Low High', (3.76, ('High', (3.21, 1.89)))))))))), ('Low low', (25.33, ('High High', (8.92, ('Mid Mid', (6.28, 3.35))))))))))))))))))))
How could I visualize this? I'm guessing something similar to a decision tree plot, but I have no idea how to go about it.
Upvotes: 1
Views: 548
Reputation: 15309
Using ete3, you can plot your tree after converting it to Newick format:
dtree = ('Blue', (('Mid', (42.0, ('Low', (11.64, ('High', (13.55, ('Very Low', (0.0, ('Very High', (3.29, 6.25)))))))))), ('Yellow', (('Mid', (44.39, ('Mid High', (31.61, 13.25)))), ('Mid Low', (47.89, ('Mid', (16.36, ('Very Low', (0.24, ('Low', (6.23, ('Red', (('Mid High', (1.15, ('JA', (0.0, ('Very High', (3.91, ('Low High', (3.76, ('High', (3.21, 1.89)))))))))), ('Low low', (25.33, ('High High', (8.92, ('Mid Mid', (6.28, 3.35))))))))))))))))))))
from ete3 import Tree
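def newick(t):
    # Leaf: a bare value (here, a number) becomes its own label.
    if type(t) != tuple:
        return f'{t}'
    # Labeled node ('label', subtree): emit the subtree, then the label.
    if type(t[0]) == str:
        return f'({newick(t[1])} {t[0]})'
    # Unlabeled pair (left, right): emit both branches.
    return f'({newick(t[0])}, {newick(t[1])})'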
t = Tree(f'{newick(dtree)};', format=1)
print(t.get_ascii(show_internal=True))
Result:
There is also a graphical viewer, but I wasn't able to get internal node labels to show up, so we will have to make do with ASCII art for now.
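If ASCII art is good enough, the nested tuple can also be printed directly as an indented tree with no extra dependency. This is only a rough sketch, not part of ete3: the helper names `children_of` and `render` are made up for illustration, and it assumes the same shape as the data above (leaves are numbers, a labeled node is `('label', (left, right))`, and an unlabeled pair is `(left, right)`).

```python
def children_of(node):
    """Split a node into (label, list_of_children)."""
    if not isinstance(node, tuple):
        return str(node), []              # leaf: a number
    if isinstance(node[0], str):
        label, rest = node                # labeled node: ('label', subtree)
        # The subtree is normally an unlabeled (left, right) pair;
        # flatten it so both branches hang directly under the label.
        if isinstance(rest, tuple) and not isinstance(rest[0], str):
            return label, list(rest)
        return label, [rest]
    return "", list(node)                 # unlabeled pair


def render(node, prefix="", is_last=True, is_root=True):
    """Return the lines of an indented ASCII tree for a nested tuple."""
    label, kids = children_of(node)
    if is_root:
        lines = [label]
        child_prefix = ""
    else:
        lines = [prefix + ("`-- " if is_last else "|-- ") + label]
        child_prefix = prefix + ("    " if is_last else "|   ")
    for i, kid in enumerate(kids):
        lines.extend(render(kid, child_prefix, i == len(kids) - 1, False))
    return lines


# A small tree in the same shape as dtree, for demonstration only.
small = ('Blue', (('Mid', (42.0, 6.25)), 13.25))
print("\n".join(render(small)))
```

Calling `render(dtree)` on the full tuple should work the same way; each split label appears on its own line with its two branches indented beneath it.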
Upvotes: 2