Garglesoap

Reputation: 575

XGBoost gpu fails to run with scikit RandomizedSearchCV

XGBoost works fine on both CPU and GPU, but as soon as I add scikit-learn's RandomizedSearchCV for hyperparameter tuning it fails.

System: Ubuntu 20

Environment: conda virtual env with python 3.7

xgboost install: conda install -c anaconda py-xgboost-gpu

Code:

from sklearn.model_selection import cross_val_score, RandomizedSearchCV, train_test_split
import xgboost as xgb
from scipy.stats import uniform, randint

xgb_model = xgb.XGBRegressor(objective="reg:squarederror")
params = {}
params['eval_metric'] = 'rmse'
params['tree_method'] = 'gpu_hist'
params['colsample_bytree'] = uniform(0.7, 0.3)
params['gamma'] = uniform(0, 0.5)
params['learning_rate'] = uniform(0.03, 0.3)
params['max_depth'] = randint(2,6)
params['n_estimators'] = randint(100, 150)
params['subsample'] = uniform(0.6, 0.4)

search = RandomizedSearchCV(xgb_model, param_distributions=params, random_state=42, n_iter=200, cv=3, verbose=1, return_train_score=True) #n_jobs=8,

search.fit(X_train, y_train)
print(search)

Error:

Fitting 3 folds for each of 200 candidates, totalling 600 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/sklearn/model_selection/_validation.py:552: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
Traceback (most recent call last):
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 531, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/sklearn.py", line 396, in fit
    callbacks=callbacks)
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/training.py", line 216, in train
    xgb_model=xgb_model, callbacks=callbacks)
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/training.py", line 74, in _train_internal
    bst.update(dtrain, i, obj)
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/core.py", line 1109, in update
    dtrain.handle))
  File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/core.py", line 176, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Input: 's', valid values are: {'approx', 'auto', 'exact', 'gpu_exact', 'gpu_hist', 'hist'}

thanks guys

Upvotes: 1

Views: 1549

Answers (1)

tmrlvi

Reputation: 2361

The param_distributions argument needs to be a dictionary mapping each parameter name to a list/array of candidate values or to a distribution. Because a Python string is itself iterable, the current code interprets the eval_metric and tree_method strings you passed as sequences of single-character candidates:

params['eval_metric'] = ['r', 'm', 's', 'e']
params['tree_method'] = ['g', 'p', 'u', '_', 'h', 'i', 's', 't']
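You can see this character-by-character sampling directly with scikit-learn's ParameterSampler, the helper RandomizedSearchCV uses internally. A minimal sketch (one of the question's distributions is mixed in so the sampler draws candidates rather than enumerating a grid):

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# A bare string is treated like a list of candidate values
params = {'tree_method': 'gpu_hist', 'subsample': uniform(0.6, 0.4)}

for candidate in ParameterSampler(params, n_iter=3, random_state=0):
    # Each draw picks a single character such as 'g' or 'h'
    print(candidate['tree_method'])
```

That single character is what XGBoost eventually receives, hence the error "Invalid Input: 's'".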

To fix it, replace the relevant lines with

params['eval_metric'] = ['rmse']
params['tree_method'] = ['gpu_hist']
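Putting it together, the corrected dictionary now samples the whole strings; a quick sanity check with ParameterSampler (the rest of the question's code, including the XGBRegressor and the fit call, stays unchanged):

```python
from scipy.stats import uniform, randint
from sklearn.model_selection import ParameterSampler

params = {
    'eval_metric': ['rmse'],       # one-element list: sampled as a whole string
    'tree_method': ['gpu_hist'],   # one-element list: sampled as a whole string
    'colsample_bytree': uniform(0.7, 0.3),
    'gamma': uniform(0, 0.5),
    'learning_rate': uniform(0.03, 0.3),
    'max_depth': randint(2, 6),
    'n_estimators': randint(100, 150),
    'subsample': uniform(0.6, 0.4),
}

# Every sampled candidate now carries the full strings
for candidate in ParameterSampler(params, n_iter=3, random_state=42):
    assert candidate['tree_method'] == 'gpu_hist'
    assert candidate['eval_metric'] == 'rmse'
```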

Upvotes: 2
