Reputation: 21
In sklearn, I want to train a linear model with one-dimensional input. But when I feed a [100 x 1] input vector and a [100 x 1] output vector into linear_model.LinearRegression()'s fit function, I get the error "ValueError: Found arrays with inconsistent numbers of samples: [ 1 100]". It works fine with a [7791 x 39] training input and a [7791 x 1] training output.
starting regression training
(7791, 39)
(7791,)
done with regression training; starting probabilities converter training
(100,)
(100,)
Traceback (most recent call last):
  File "makePickles.py", line 19, in <module>
    train_probabilities_converter(scoresToProbabilities[:,1], scoresToProbabilities[:,2])
  File "trainProbabilitiesConverter.py", line 18, in train_probabilities_converter
    regr.fit(rawScores, empiricalProbability)
  File "//anaconda/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 376, in fit
    y_numeric=True, multi_output=True)
  File "//anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 454, in check_X_y
    check_consistent_length(X, y)
  File "//anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 174, in check_consistent_length
    "%s" % str(uniques))
ValueError: Found arrays with inconsistent numbers of samples: [ 1 100]
Upvotes: 2
Views: 526
Reputation: 13218
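Have you tried making your input data shape (100, 1) instead of (100,)? I know it is sometimes a problem with sklearn, which expects X to be a 2-D array of shape (n_samples, n_features). A 1-D array of length 100 is ambiguous: it could be 100 observations of 1 feature, or 1 observation of 100 features. The [ 1 100] in your error suggests X was read the second way, as 1 sample against 100 targets.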
You can do X_test = X_test[:, None] to add a new axis. np.newaxis also works; it is a longer but more explicit name. By the way, it is just an alias for None (they refer to the same object):
>>> import numpy as np
>>> np.newaxis is None
True
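Applied to the fit call in the question, a minimal sketch might look like the following. The arrays raw_scores and empirical_probability are hypothetical stand-ins for the asker's rawScores and empiricalProbability, filled with synthetic data purely for illustration, and the sketch assumes a reasonably recent NumPy and scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for the asker's 1-D columns (synthetic data).
raw_scores = np.linspace(0.0, 1.0, 100)           # shape (100,)
empirical_probability = 2.0 * raw_scores + 0.5    # shape (100,)

# Add a trailing axis so X is (n_samples, n_features) = (100, 1).
X = raw_scores[:, np.newaxis]
# raw_scores.reshape(-1, 1) produces the same shape.

regr = LinearRegression()
regr.fit(X, empirical_probability)   # y may stay 1-D

print(X.shape)                        # shape of the 2-D input
print(regr.coef_, regr.intercept_)    # fitted slope and intercept

# Inputs to predict need the same 2-D layout: one row per sample.
print(regr.predict(np.array([[0.25]])))
```

Only X needs the extra axis here; the target can remain a plain (100,) array, as it was in the working (7791, 39) case.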
Upvotes: 2