Reputation: 742
I want to build a decision tree and break it into lists of (name, sign, val). I built the tree with this code:
//Get File
BufferedReader reader = new BufferedReader(new FileReader(PATH + "TempArffFile.arff"));
//Get the data
Instances data = new Instances(reader);
reader.close();
//Setting class attribute
data.setClassIndex(data.numAttributes() - 1);
//Make tree
J48 tree = new J48();
String[] options = new String[1];
options[0] = "-U";
tree.setOptions(options);
tree.buildClassifier(data);
//Print tree
System.out.println(tree);
Now I need to break it into arrays. How can I do that?
Example: I get this tree:
title <= 1: bad (4.0)
title > 1
| positionMatch <= 1
| | countryCode <= 1: good (3.0/1.0)
| | countryCode > 1: bad (8.0/3.0)
| positionMatch > 1: good (4.0/1.0)
So I want to get 4 lists from that tree:
- title(<= 1) -> bad
- title(> 1) -> positionMatch(<= 1) -> countryCode(<= 1) -> good
- title(> 1) -> positionMatch(<= 1) -> countryCode(> 1) -> bad
- title(> 1) -> positionMatch(> 1) -> good
How can I do that?
Upvotes: 2
Views: 901
Reputation: 670
First, the answer that parses lines seems buggy (I can't comment on it). It seems to replace every '|' with the parent clause, but for a tree like:
A
| C
B
| D
the replacement would be wrong: the '|' in front of both C and D would be replaced by A.
Also, extending class J48 and reimplementing toString() won't help, because the tree is actually held in the private variable m_root.
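To make the counterexample concrete, here is a minimal sketch (not taken from either answer) that picks each node's parent by depth rather than by a fixed substitute per column. The class and method names are hypothetical, and it assumes well-formed input in which '|' appears only as an indentation marker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class ParentByDepth {
    // Returns "parent -> node" for every indented line, choosing the parent by depth.
    static List<String> parents(String[] lines) {
        // stack.get(d) is the most recent node seen at depth d.
        List<String> stack = new ArrayList<>();
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // Assumption: '|' occurs only as an indentation marker.
            int depth = 0;
            for (char c : line.toCharArray()) {
                if (c == '|') depth++;
            }
            String node = line.replace("|", "").trim();
            // Forget everything at this depth or deeper, then record this node.
            while (stack.size() > depth) {
                stack.remove(stack.size() - 1);
            }
            if (depth > 0) {
                result.add(stack.get(depth - 1) + " -> " + node);
            }
            stack.add(node);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [A -> C, B -> D]: D's parent is B, not A.
        System.out.println(parents(new String[] {"A", "| C", "B", "| D"}));
    }
}
```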
Update:
A better way to parse the string:
private void string(J48 j48) {
    String tree = j48.toString();
    String[] lines = tree.split("\n");
    List<List<String>> lists = new ArrayList<List<String>>();
    // Break each line into one "|" token per depth level plus the node text.
    for (String line : lines) {
        List<String> temp = new ArrayList<String>();
        while (line.indexOf("|") != -1) {
            temp.add("|");
            line = line.replaceFirst("\\|", "");
        }
        temp.add(line.trim());
        lists.add(temp);
    }
    // Remove the first 3 lines of the output (the header).
    for (int i = 0; i < 3; i++) {
        lists.remove(0);
    }
    // Remove the last 4 lines of the output (the leaf-count and tree-size summary).
    for (int i = 0; i < 4; i++) {
        lists.remove(lists.size() - 1);
    }
    // Ordered list of parent clauses for the node currently being visited.
    List<String> parentClauses = new ArrayList<String>();
    // All root-to-leaf paths in the tree.
    List<List<String>> paths = new ArrayList<List<String>>();
    for (List<String> list : lists) {
        int currDepth = 0;
        for (int i = 0; i < list.size(); i++) {
            String token = list.get(i);
            if (token.equals("|")) {
                // Each "|" means the node is one level deeper.
                currDepth++;
            } else if (token.contains(":")) {
                // Leaf: the parents above it plus the leaf itself form one path.
                List<String> path = new ArrayList<String>();
                for (int index = 0; index < currDepth; index++) {
                    path.add(parentClauses.get(index));
                }
                path.add(token);
                paths.add(path);
            } else {
                // Inner node: insert it at its depth so it becomes the current parent there.
                parentClauses.add(currDepth, token);
            }
        }
    }
    // Print each path; String.join (Java 8+) avoids a trailing " AND ".
    for (List<String> path : paths) {
        LOG.info(String.join(" AND ", path) + "\n");
    }
}
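The paths above are still plain strings, while the question asks for (name, sign, val) triples. As a rough sketch, each clause could be split with a regular expression. The `Clause` class is hypothetical, and it assumes attribute names contain no whitespace or comparison characters and that any leaf part starts at the first ':'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class for illustration; not part of Weka.
final class Clause {
    final String name;
    final String sign;
    final String val;

    // name, then a comparison sign, then the value; an optional ": leaf" suffix is ignored.
    private static final Pattern CLAUSE = Pattern.compile(
            "^\\s*([^\\s<>=!]+)\\s*(<=|>=|!=|=|<|>)\\s*([^:]+?)\\s*(?::.*)?$");

    private Clause(String name, String sign, String val) {
        this.name = name;
        this.sign = sign;
        this.val = val;
    }

    static Clause parse(String text) {
        Matcher m = CLAUSE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a clause: " + text);
        }
        return new Clause(m.group(1), m.group(2), m.group(3));
    }

    @Override
    public String toString() {
        return "(" + name + ", " + sign + ", " + val + ")";
    }

    public static void main(String[] args) {
        // Prints (title, >, 1)
        System.out.println(parse("title > 1"));
        // Prints (countryCode, <=, 1); the leaf part after ':' is dropped.
        System.out.println(parse("countryCode <= 1: good (3.0/1.0)"));
    }
}
```

Calling `Clause.parse` on every token of a path would give one list of triples per leaf, with the class label still to be read from the last token.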
Upvotes: 1
Reputation: 745
Not that nice, but maybe better than nothing. Maybe it will give you an idea.
public static void split(String tree){
    String[] lines = tree.split("\n");
    List<List<String>> lists = new ArrayList<List<String>>();
    for(String line : lines){
        List<String> temp = new ArrayList<String>();
        while(line.indexOf("|") != -1){
            temp.add("|");
            line = line.replaceFirst("\\|", "");
        }
        temp.add(line.trim());
        lists.add(temp);
    }
    for(int i = 0; i < 3; i++){
        lists.remove(0);
    }
    for(int i = 0; i < 4; i++){
        lists.remove(lists.size()-1);
    }
    List<String> substitutes = new ArrayList<String>();
    for(List<String> list : lists){
        for(int i = 0; i < list.size(); i++){
            if(!list.get(i).contains(":") && !list.get(i).equals("|") && !substitutes.contains(list.get(i))){
                substitutes.add(list.get(i));
            }
        }
    }
    for(List<String> list : lists){
        for(int i = 0; i < list.size(); i++){
            if(list.get(i).equals("|")){
                list.set(i, substitutes.get(i));
            }
        }
    }
    StringBuilder sb = new StringBuilder();
    for(List<String> list : lists){
        String line = "";
        for(String s : list){
            line = line+" "+s;
        }
        if(line.endsWith(")")){
            sb.append(line+"\n");
        }
    }
    System.out.println(sb.toString());
}
Input
petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
| petalwidth <= 1.7
| | petallength <= 4.9: Iris-versicolor (48.0/1.0)
| | petallength > 4.9
| | | petalwidth <= 1.5: Iris-virginica (3.0)
| | | petalwidth > 1.5: Iris-versicolor (3.0/1.0)
| petalwidth > 1.7: Iris-virginica (46.0/1.0)
Number of Leaves : 5
Size of the tree : 9
Output:
petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6 petalwidth <= 1.7 petallength <= 4.9: Iris-versicolor (48.0/1.0)
petalwidth > 0.6 petalwidth <= 1.7 petallength > 4.9 petalwidth <= 1.5: Iris-virginica (3.0)
petalwidth > 0.6 petalwidth <= 1.7 petallength > 4.9 petalwidth > 1.5: Iris-versicolor (3.0/1.0)
petalwidth > 0.6 petalwidth > 1.7: Iris-virginica (46.0/1.0)
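Each output line still ends with the leaf text, such as "Iris-virginica (46.0/1.0)". The sketch below pulls the class label and the two numbers out of such a line. The `Leaf` class is hypothetical, and it assumes the usual reading of J48 output, where the first number counts the instances reaching the leaf and the optional second number counts how many of them are misclassified.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class for illustration; not part of Weka.
final class Leaf {
    final String label;
    final double covered; // instances reaching the leaf (assumed meaning)
    final double errors;  // misclassified instances, 0 when omitted (assumed meaning)

    // ": label (covered)" or ": label (covered/errors)" at the end of the line.
    private static final Pattern LEAF = Pattern.compile(
            ":\\s*(.+?)\\s*\\(([0-9.]+)(?:/([0-9.]+))?\\)\\s*$");

    private Leaf(String label, double covered, double errors) {
        this.label = label;
        this.covered = covered;
        this.errors = errors;
    }

    static Leaf parse(String line) {
        Matcher m = LEAF.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("No leaf found in: " + line);
        }
        double errors = m.group(3) == null ? 0.0 : Double.parseDouble(m.group(3));
        return new Leaf(m.group(1), Double.parseDouble(m.group(2)), errors);
    }

    public static void main(String[] args) {
        Leaf leaf = parse("petalwidth > 0.6 petalwidth > 1.7: Iris-virginica (46.0/1.0)");
        // Prints Iris-virginica 46.0 1.0
        System.out.println(leaf.label + " " + leaf.covered + " " + leaf.errors);
    }
}
```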
Upvotes: 2