Nikita Okorokov

Reputation: 31

Python: ValueError too many values to unpack (expected 2)

I am trying to find the best XGBoost model with GridSearchCV, and for cross-validation I want to use the April target data as the validation set. Here is the code:

    x_train.head()

[screenshot: x_train]

    y_train.head()

[screenshot: y_train]

    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    from sklearn.metrics import make_scorer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import TimeSeriesSplit
    import xgboost as xg

    xgb_parameters={'max_depth':[3,5,7,9],'min_child_weight':[1,3,5]}
    xgb=xg.XGBRegressor(learning_rate=0.1, n_estimators=100,max_depth=5, min_child_weight=1, gamma=0, subsample=0.8, colsample_bytree=0.8)
    model=GridSearchCV(n_jobs=2,estimator=xgb,param_grid=xgb_parameters,cv=train_test_split(x_train,y_train,test_size=len(y_train['2016-04':'2016-04']), random_state=42, shuffle=False),scoring=my_func)
    model.fit(x_train,y_train)
    model.grid_scores_
    model.best_params_

But I get this error when I train the model:

[screenshot: Error]

Can anybody help me with this, please? Or could someone suggest how I can split unshuffled data into train/test sets so that the model is validated on the last month?

Thanks for helping

Upvotes: 3

Views: 10159

Answers (1)

MaxU - stand with Ukraine

Reputation: 210832

The root cause of this error is the way you used the cv parameter in the GridSearchCV() call:

cv=train_test_split(x_train,y_train,test_size=len(y_train['2016-04':'2016-04']), random_state=42, shuffle=False)

Here is an excerpt from the docstring for the cv parameter:

cv : int, cross-validation generator or an iterable, optional
    Determines the cross-validation splitting strategy.
    Possible inputs for cv are:
      - None, to use the default 3-fold cross validation,
      - integer, to specify the number of folds in a `(Stratified)KFold`,
      - An object to be used as a cross-validation generator.
      - An iterable yielding train, test splits.

    For integer/None inputs, if the estimator is a classifier and ``y`` is
    either binary or multiclass, :class:`StratifiedKFold` is used. In all
    other cases, :class:`KFold` is used.

    Refer :ref:`User Guide <cross_validation>` for the various
    cross-validation strategies that can be used here.

However, train_test_split(x_train, y_train) returns a list of four data subsets rather than an iterable of (train, test) index pairs:

X_train, X_test, y_train, y_test

GridSearchCV tries to unpack each item of cv into a (train, test) pair, and that is what raises the error: ValueError: too many values to unpack (expected 2).
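To make the failure mode concrete, here is a minimal, dependency-free sketch. The function consume_splits and the placeholder tuples are hypothetical stand-ins, not scikit-learn code; the sketch only assumes that each item of cv gets unpacked into exactly two values, which is consistent with the error message.

```python
def consume_splits(cv):
    """Unpack every item of cv as a (train, test) pair, as a CV loop would."""
    pairs = []
    for train, test in cv:
        pairs.append((train, test))
    return pairs


# Hypothetical stand-in for what train_test_split returns: four objects,
# each of which iterates over more than two values (e.g. several columns).
four_parts = [("col_a", "col_b", "col_c")] * 4

try:
    consume_splits(four_parts)
    error_message = None
except ValueError as exc:
    # The message begins with "too many values to unpack".
    error_message = str(exc)

print(error_message)

# A valid cv is an iterable of (train_indices, test_indices) pairs.
valid_cv = [([0, 1, 2], [3, 4])]
pairs = consume_splits(valid_cv)
print(pairs)
```

The second call succeeds because every item of valid_cv is a two-element pair of index lists.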

As a workaround, pass one of the options listed in the docstring above as the cv parameter, for example an iterable yielding (train, test) index pairs...
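One way to build such an iterable for a "validate on the last month" setup might look like the sketch below. The helper last_block_split is a hypothetical name, not part of scikit-learn, and it assumes the rows are sorted by date so that the validation month is the final block. Plain lists of positions are used to keep the sketch dependency-free; NumPy index arrays are the more usual form.

```python
def last_block_split(n_samples, n_test):
    """Return a one-fold cv: train on the first rows, validate on the last n_test rows."""
    if not 0 < n_test < n_samples:
        raise ValueError("n_test must be between 1 and n_samples - 1")
    cut = n_samples - n_test
    train_idx = list(range(cut))             # positions 0 .. cut-1
    test_idx = list(range(cut, n_samples))   # positions cut .. n_samples-1
    # A list holding a single (train, test) pair is an iterable of splits.
    return [(train_idx, test_idx)]


# Toy example: 10 rows, the last 3 held out for validation.
cv = last_block_split(n_samples=10, n_test=3)
train_idx, test_idx = cv[0]
print(train_idx)
print(test_idx)
```

With the data from the question, this could presumably be passed as cv=last_block_split(len(x_train), len(y_train['2016-04':'2016-04'])) in the GridSearchCV() call. scikit-learn also provides PredefinedSplit and TimeSeriesSplit in sklearn.model_selection, which may be a better fit depending on the validation scheme.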

Upvotes: 6
