Reputation: 2862
I am using a multi-attribute dataset for classification with the WEKA API in Java. The dataset has both categorical and numerical variables. When I run the dataset in the Weka GUI, I get a better result: a tree of size 26 with 16 leaves. But when I do the same in Java code, I only get a tree of size 5 with 3 leaves. Here is my Java code:
public static Evaluation classify(Classifier model,
        Instances trainingSet, Instances testingSet) throws Exception {
    // Train the model on the training set, evaluate it on the test set,
    // and return the resulting evaluation
    Evaluation evaluation = new Evaluation(trainingSet);
    model.buildClassifier(trainingSet);
    evaluation.evaluateModel(model, testingSet);
    //System.out.println(model);
    return evaluation;
}
Classifier models = new J48(); // a decision tree
models.setOptions(optionsj);
FastVector predictions = new FastVector();
// For each training-testing split pair, train and test the classifier
for (int i = 0; i < trainingSplits.length; i++) {
    Evaluation validation = classify(models, trainingSplits[i], testingSplits[i]);
    predictions.appendElements(validation.predictions());
    System.out.println(validation.toSummaryString("\nResults\n======\n", false));
}
// Prints the tree built on the last training split
System.out.println(models.toString());
How can I make sure J48 uses all the attributes in the dataset? What did I do wrong?
Upvotes: 0
Views: 566
Reputation: 26
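There is a parameter you have to set that controls whether the tree is pruned or unpruned. Pruned means the tree keeps only the most important branches and leaves of the decision tree. Unpruned means it keeps every split that was grown, so more attributes can show up in the tree. Here you are just using a pruned tree, which is the J48 default. If you want the full tree, set
unpruned = True (in J48 the option is named unpruned rather than pruned; it is the -U flag in the options array, or setUnpruned(true) in Java)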
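As a rough sketch of how the option array passed to `setOptions` in the question could be put together for a pruned or an unpruned tree: the class name `J48Options` and the method `build` are made up for illustration, and the flags used (`-U` for unpruned, `-C` for the pruning confidence, `-M` for the minimum number of instances per leaf) are J48's command-line options as I understand them. The sketch only builds the strings and does not call Weka itself.

```java
import java.util.ArrayList;
import java.util.List;

public class J48Options {

    // Hypothetical helper: assembles a J48-style option array.
    // Assumption: J48 rejects -C together with -U, so the confidence
    // factor is only emitted when the tree is pruned.
    static String[] build(boolean unpruned, double confidence, int minNumObj) {
        List<String> opts = new ArrayList<>();
        if (unpruned) {
            opts.add("-U");                          // grow the full tree, no pruning
        } else {
            opts.add("-C");                          // pruning confidence factor
            opts.add(Double.toString(confidence));
        }
        opts.add("-M");                              // minimum instances per leaf
        opts.add(Integer.toString(minNumObj));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Pruned tree with the values the Weka GUI shows by default
        System.out.println(String.join(" ", build(false, 0.25, 2))); // prints "-C 0.25 -M 2"
        // Unpruned tree
        System.out.println(String.join(" ", build(true, 0.25, 2)));  // prints "-U -M 2"
    }
}
```

The resulting array would take the place of `optionsj` in `models.setOptions(optionsj)`. Comparing it against the option string shown in the GUI's classifier field is a quick way to check that both runs use the same settings.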
Upvotes: 1