Elly

Reputation: 129

SystemML Decision Tree - "NUMBER OF SAMPLES AT NODE 1.0 CANNOT BE REDUCED TO MATCH 10"

I am trying to run a decision tree with the standalone version of SystemML on Windows (https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/decision-tree.dml), but I keep receiving the error "NUMBER OF SAMPLES AT NODE 1.0 CANNOT BE REDUCED TO MATCH 10. THIS NODE IS DECLARED AS LEAF!". It seems that the script is not computing any split, although I can build a tree in R. Has anyone used this algorithm before and has some tips on how to solve the error? Thank you.

Upvotes: 1

Views: 85

Answers (1)

mboehm7

Reputation: 115

This message generally indicates that a split on the best categorical or scale features would not give any additional gain.
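To illustrate what "no additional gain" means, here is a minimal Python sketch, not taken from the DML script. It assumes Gini impurity as the measure (the script may be configured differently), and the helper names `gini` and `split_gain` are hypothetical. A feature that separates the classes yields a positive gain, while one that leaves both children with the same class mix yields zero, in which case a tree learner has no reason to split the node.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(labels, mask):
    """Impurity reduction obtained by splitting `labels` with a boolean mask."""
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Toy data: two samples per class.
labels = [0, 0, 1, 1]
informative = [True, True, False, False]    # separates the classes
uninformative = [True, False, True, False]  # same class mix on both sides

gain_informative = split_gain(labels, informative)
gain_uninformative = split_gain(labels, uninformative)
print(gain_informative, gain_uninformative)
```

If every candidate feature behaves like `uninformative` here, the best gain is zero and the node stays a leaf.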

I would recommend that you:

  1. Investigate the computed gain (best_cat_gain, best_scale_gain).

  2. Double-check that the metadata (num_cat_features, num_scale_features) is correctly recognised.

You could simply add print statements to the script to do that. If the metadata turns out to be invalid, check that the optional input R has the layout described in the header of the script.

If this does not help, please share the input arguments, the format of the input data, etc., and we'll have a closer look.

Upvotes: 1
