Scikit-learn 1 node decision tree?

Question

I'm a bit perplexed by this issue. I've created a list of lists (passed into numpy's asarray and stored in X) where each sublist holds the features for one sample (currently the same value in each column, as I haven't parsed each feature to an integer yet). I then created my y variable with numpy's fill, using the same value for testing. I'm passing these two numpy arrays to fit(X, y), where X =

array([[ 0,  1,  2, ..., -1, -1, -1],
       [ 0, -1,  2, ..., -1, -1, -1],
       [ 0, -1, -1, ..., -1, -1, -1],
       ...,
       [ 0, -1, -1, ..., -1, -1, -1],
       [ 0, -1, -1, ..., -1, -1, -1],
       [ 0, -1,  2, ..., -1, -1, -1]])

and y =

[4 4 4 ..., 4 4 4]

However, the resulting output is a 1-node decision tree with a gini value of 0. Could anyone shed some light on why this may be occurring? Thanks!

Maniteja · Accepted Answer

From what I understand, the target value is 4 for all the samples. I suppose the tree has only one node, which predicts 4 for any test data, because the target value is 4 for all the training data. The gini index is 0 because all of the samples are in the same class, so there is nothing left for the tree to split on. Hope it helps!
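To see this concretely, here is a minimal sketch that reproduces the behaviour. The small X below is a hypothetical stand-in for the asker's data (the real array is elided in the question), and the `gini` helper and the `y_mixed` labels are illustrative additions, not part of the original post.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: 6 samples, 4 integer features.
X = np.array([
    [0,  1,  2, -1],
    [0, -1,  2, -1],
    [0, -1, -1, -1],
    [0, -1, -1, -1],
    [0, -1, -1, -1],
    [0, -1,  2, -1],
])

# Every label is 4, as in the question.
y = np.full(X.shape[0], 4)


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

node_count = clf.tree_.node_count       # number of nodes in the fitted tree
root_impurity = clf.tree_.impurity[0]   # impurity stored at the root node
print(node_count, root_impurity, gini(y))

# With more than one class in y, the root is impure and the tree can split.
y_mixed = np.array([4, 4, 5, 5, 5, 4])  # illustrative labels only
clf_mixed = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y_mixed)
mixed_node_count = clf_mixed.tree_.node_count
print(mixed_node_count, gini(y_mixed))
```

With a single class, the root node is already pure, so the tree stops there. Once y contains at least two distinct labels, the impurity at the root is positive and the tree grows past one node.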
