wildcat89

Reputation: 1285

How to Extract Parameters from XGBRegressor Function After Grid Search?

I'm new to XGBoost and parameter tuning, but I'm hoping you can help me.

I'm following along this tutorial:

https://www.datacamp.com/community/tutorials/xgboost-in-python

and, near the cross-validation section, it suggests trying a grid search to fine-tune the hyperparameters.

I was able to use GridSearchCV to get a best_estimator_, which prints like this:

XGBRegressor(alpha=5, base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bynode=1, colsample_bytree=0.4, gamma=0,
       importance_type='gain', learning_rate=0.1, max_delta_step=0,
       max_depth=5, min_child_weight=1, missing=None, n_estimators=50,
       n_jobs=1, nthread=None, objective='reg:squarederror',
       random_state=123, reg_alpha=0, reg_lambda=1, scale_pos_weight=1,
       seed=None, silent=None, subsample=1, verbosity=1)

My question is: if I want to pass some of these now-tuned parameters into the next cross-validation step, how can I pull the appropriate values out of the estimator above and into the params dict below?

params = {"objective":"reg:squarederror",'colsample_bytree': **THE COLSAMPLE_BYTREE VALUE FROM ABOVE**,'learning_rate': 0.1,
                'max_depth': **THE MAX_DEPTH VALUE FROM ABOVE**, 'alpha': **THE ALPHA VALUE FROM ABOVE**}

cv_results = xgb.cv(dtrain=data_dmatrix, params=params, nfold=3,
                    num_boost_round=50,early_stopping_rounds=10,metrics="rmse", as_pandas=True, seed=123)

...or is this a silly question? I'm just trying to get hands-on with some of these new skills using the Boston multivariate housing dataset. If you want to see my code, just let me know, but I'm hoping this will be enough. Thanks!

Upvotes: 0

Views: 2924

Answers (1)

s3nh

Reputation: 556

In my opinion, you do not need best_estimator_ for this task. You can use, for example, the best_params_ or best_index_ attribute of the fitted GridSearchCV object to get the parameters you are interested in.


best_params_ : dict 
Parameter setting that gave the best results on the hold out data.

For multi-metric evaluation, this is present only if refit is specified.

best_index_ : int
The index (of the cv_results_ arrays) which corresponds to the best candidate parameter setting.

The dict at search.cv_results_['params'][search.best_index_] gives the parameter setting for the best model, that gives the highest mean score (search.best_score_).

For multi-metric evaluation, this is present only if refit is specified.

Grid Search Docs
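To make the relationship between best_index_ and cv_results_ concrete, here is a minimal sketch. It does not run a real search: the cv_results dict is a hypothetical stand-in shaped like GridSearchCV's cv_results_, and the candidate grid and scores are made up purely for illustration.

```python
# Hypothetical stand-in for search.cv_results_ (values are illustrative, not real results)
cv_results = {
    "params": [
        {"colsample_bytree": 0.4, "max_depth": 3},
        {"colsample_bytree": 0.4, "max_depth": 5},
        {"colsample_bytree": 0.7, "max_depth": 5},
    ],
    "mean_test_score": [-4.1, -3.6, -3.9],  # made-up scores; higher is better
    "rank_test_score": [3, 1, 2],           # rank 1 marks the best candidate
}

# best_index_ is the position of the rank-1 candidate in the cv_results_ arrays
best_index = cv_results["rank_test_score"].index(1)

# Looking up that position in "params" gives the same dict that best_params_ holds
best_params = cv_results["params"][best_index]
best_score = cv_results["mean_test_score"][best_index]
```

With a real fitted search object, search.best_params_ gives you this dict directly, so the lookup through best_index_ is only needed when you also want other columns of cv_results_ for the same candidate.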

Then you can treat best_params_ as a dictionary and put its key/value pairs in the proper places of your code, for example:


your_best_res = cv_.best_params_

'max_depth': your_best_res['max_depth'], ...
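As a rough sketch of the whole step, the tuned values can be merged into the params dict from the question by key. The best_params dict below is a stand-in for cv_.best_params_; its values are copied from the estimator printed in the question, and I am assuming all three keys were part of the grid (best_params_ only contains keys that were actually searched).

```python
# Stand-in for cv_.best_params_; assumes these three keys were in the search grid
best_params = {"alpha": 5, "colsample_bytree": 0.4, "max_depth": 5}

# Fixed settings that were not tuned, as in the question
params = {"objective": "reg:squarederror", "learning_rate": 0.1}

# Copy over only the tuned keys that the next step needs, skipping any not searched
wanted = ("colsample_bytree", "max_depth", "alpha")
params.update({key: best_params[key] for key in wanted if key in best_params})

# params can now be passed to xgb.cv(dtrain=data_dmatrix, params=params, ...)
# exactly as in the question's snippet.
```

Because best_params_ is a plain dict, values are read with square brackets (best_params["max_depth"]), not with attribute access.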

Upvotes: 1
