Asif.Khan

Reputation: 147

ValueError: Found input variables with inconsistent numbers of samples: [12600, 4200]

In this code I perform a time series split and then use scikit-learn to build an SVR model for prediction. My code is:

from sklearn import preprocessing as pre 
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn import svm
from sklearn.preprocessing import MinMaxScaler



X_feature = wind_speed

X_feature = X_feature.reshape(-1, 1)  # Reshape the array from 1D to 2D

y_label = Power
y_label = y_label.reshape(-1,1)

timeseries_split = TimeSeriesSplit(n_splits=3)
for train1, test1 in timeseries_split.split(X_feature):

    print("Training data:",train1, "Testing data test:", test1)
train1 = train1.reshape(-1,1)  # Reshape the array from 1D to 2D
test1 = test1.reshape(-1,1)

timeseries_split = TimeSeriesSplit(n_splits=3)
for train, test in timeseries_split.split(y_label):
    print("Training data_1:",train, "Testing data test_1:", test)

scaler =pre.MinMaxScaler(feature_range=(0,1)).fit(train1)


scaled_wind_speed_train = scaler.transform(train1)
print("scaler", scaled_wind_speed_train)
scaled_wind_speed_test = scaler.transform(test1)

SVR_model = svm.SVR(kernel='rbf',C=100,gamma=.001).fit(scaled_wind_speed_train,train)

y_prediction = SVR_model.predict(y_label)


print (y_prediction)
SVR_model.score(scaled_wind_speed_test,train)

The output and the error that I am receiving are:

Training data: [   0    1    2 ... 4197 4198 4199] Testing data test: [4200 4201 4202 ... 8397 8398 8399]
Training data: [   0    1    2 ... 8397 8398 8399] Testing data test: [ 8400  8401  8402 ... 12597 12598 12599]
Training data: [    0     1     2 ... 12597 12598 12599] Testing data test: [12600 12601 12602 ... 16797 16798 16799]
Training data_1: [   0    1    2 ... 4197 4198 4199] Testing data test_1: [4200 4201 4202 ... 8397 8398 8399]
Training data_1: [   0    1    2 ... 8397 8398 8399] Testing data test_1: [ 8400  8401  8402 ... 12597 12598 12599]
Training data_1: [    0     1     2 ... 12597 12598 12599] Testing data test_1: [12600 12601 12602 ... 16797 16798 16799]
scaler [[0.00000000e+00]
 [7.93713787e-05]
 [1.58742757e-04]
 ...
 [9.99841257e-01]
 [9.99920629e-01]
 [1.00000000e+00]]
/home/nbuser/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype int64 was converted to float64 by MinMaxScaler.
  warnings.warn(msg, DataConversionWarning)
[6153.41834275 6006.33852041 5997.57462806 ... 6569.44075144 6393.55696288
 6112.57831243]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-53-925646f8c16a> in <module>()
     43 
     44 print (y_prediction)
---> 45 SVR_model.score(scaled_wind_speed_test,train)
     46 
     47 

~/anaconda3_501/lib/python3.6/site-packages/sklearn/base.py in score(self, X, y, sample_weight)
    385         from .metrics import r2_score
    386         return r2_score(y, self.predict(X), sample_weight=sample_weight,
--> 387                         multioutput='variance_weighted')
    388 
    389 

~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/regression.py in r2_score(y_true, y_pred, sample_weight, multioutput)
    528     """
    529     y_type, y_true, y_pred, multioutput = _check_reg_targets(
--> 530         y_true, y_pred, multioutput)
    531 
    532     if sample_weight is not None:

~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/regression.py in _check_reg_targets(y_true, y_pred, multioutput)
     73 
     74     """
---> 75     check_consistent_length(y_true, y_pred)
     76     y_true = check_array(y_true, ensure_2d=False)
     77     y_pred = check_array(y_pred, ensure_2d=False)

~/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
    202     if len(uniques) > 1:
    203         raise ValueError("Found input variables with inconsistent numbers of"
--> 204                          " samples: %r" % [int(l) for l in lengths])
    205 
    206 

ValueError: Found input variables with inconsistent numbers of samples: [12600, 4200]

I believe the error may be at SVR_model.score(scaled_wind_speed_test,train), but I do not know how to resolve it. I have edited the indentation to match the original exactly, but I am not sure whether any unintentional indentation is causing an error.

Upvotes: 0

Views: 851

Answers (1)

Gambit1614

Reputation: 8811

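The following line should fix the error, assuming I have understood your code correctly. score() compares the predictions for the inputs you pass against the targets you pass, so both must have the same number of samples. Your call pairs the 4200-row test inputs with the 12600-row train array, which is exactly the [12600, 4200] mismatch in the message: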

SVR_model.score(scaled_wind_speed_test,test)
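
One more thing worth checking, as the printed output in the question suggests: TimeSeriesSplit.split() yields arrays of row indices, not the data itself, so the scaler and the SVR above are being fitted on index values. Below is a minimal sketch of how the indices could be used to slice the features and targets before scaling, fitting and scoring. The data here is synthetic and only a stand-in for your wind_speed and Power arrays, and the names X_train, X_test, y_train, y_test and scores are my own, not from your code.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Hypothetical stand-in data; replace with the real wind_speed and Power arrays.
rng = np.random.default_rng(0)
wind_speed = rng.uniform(0.0, 25.0, size=400)
Power = 50.0 * wind_speed + rng.normal(0.0, 5.0, size=400)

X = wind_speed.reshape(-1, 1)  # features must be 2D: (n_samples, 1)
y = Power                      # targets can stay 1D: (n_samples,)

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X):
    # split() yields row indices, so use them to slice the actual data
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]

    # Fit the scaler on the training rows only, then apply it to both sets
    scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
    model = SVR(kernel="rbf", C=100, gamma=0.001)
    model.fit(scaler.transform(X_train), y_train)

    # X and y passed to score() come from the same indices, so lengths match
    scores.append(model.score(scaler.transform(X_test), y_test))

print(scores)
```

Doing the fit and the score inside the loop also means every fold is evaluated, rather than only the last one left over after the loop finishes.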

Upvotes: 2
