Reputation: 3
I’m a bit perplexed by this issue. I’ve created a list of lists (passed to numpy’s asarray and stored in X) where each sublist holds the features for one sample (currently most columns hold the same value, as I haven’t parsed each feature to an integer yet). I then created my y variable by filling a numpy array with a single value for testing. I’m passing these 2 numpy arrays to fit(X, y), where X =
array([[ 0,  1,  2, ..., -1, -1, -1],
       [ 0, -1,  2, ..., -1, -1, -1],
       [ 0, -1, -1, ..., -1, -1, -1],
       ...,
       [ 0, -1, -1, ..., -1, -1, -1],
       [ 0, -1, -1, ..., -1, -1, -1],
       [ 0, -1,  2, ..., -1, -1, -1]])
and y =
[4 4 4 ..., 4 4 4]
However, the resulting output is a decision tree with a single node and a gini value of 0. Could anyone shed some light on why this is happening? Thanks!
Upvotes: 0
Views: 255
Reputation: 36
From what I understand, the target value is 4 for every sample. The tree therefore has only one node, which predicts 4 for any test data, since 4 is the only target value in the training data. The gini index is 0 because all of the samples are in the same class, so no split could reduce the impurity any further. Hope it helps!
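To make that concrete, here is a minimal sketch of the Gini impurity calculation in plain Python. It is not the library's own code: the `gini` helper and the two sample label lists are hypothetical, made up for illustration, and use the standard definition of one minus the sum of squared class proportions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical stand-in for the y in the question: every label is 4.
y_constant = [4] * 8

# Hypothetical two-class target, for contrast only.
y_mixed = [4, 4, 4, 4, 5, 5, 5, 5]

print(gini(y_constant))  # 0.0 -> the node is already pure, nothing to split
print(gini(y_mixed))     # 0.5 -> impure, so a split could help
```

With a single class the root node is already pure, so a tree learner has no reason to split it, whatever X contains. Giving y at least two distinct values should produce a tree with more than one node, provided some feature actually separates the classes.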
Upvotes: 1