badavadapav

Reputation: 91

Hyper-parameter Tuning for XGBoost for Multi-class Target Variable

I have a multi-class classification problem (the target is 1, 2, or 3) that I am trying to solve with XGBoost. I am tuning the hyper-parameters with randomized search.

I have tried changing the 'scoring' argument of RandomizedSearchCV from 'roc_auc' to 'precision', 'f1_samples', and 'jaccard' (which threw another error related to the 'average' parameter because my problem is multi-class). Here is my code:

import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV

loss=['hinge','log','modified_huber','squared_hinge','perceptron']
penalty = ['l1','l2','elasticnet']
alpha = [0.0001, 0.001,0.01,0.1,1,10,100,1000]
learnin_rate = ['constant','optimal','invscaling','adaptive']
class_weight = [{0.3,0.5,0.2},{0.3,0.4,0.3}]
eta0 = [1,10,100]

xg_class = xgb.XGBClassifier(objective = "multi:softmax", colsample_bytree = 1,
gamma = 1,subsample = 0.8, learning_rate = 0.01, max_depth = 3,
alpha = 10,n_estimators = 1000, multilabel_ =True, num_classes = 3)

from sklearn.metrics import jaccard_score

param_distributions = dict(loss = loss, penalty=penalty, alpha=alpha, learnin_rate=learnin_rate, class_weight=class_weight, eta0=eta0)
random = RandomizedSearchCV(estimator = xg_class, param_distributions=param_distributions, 
scoring = jaccard_score(y_true=Y_miss_xgb_test, y_pred = preds_miss_xgb, average = 'micro'),
verbose = 1, n_jobs =-1, n_iter = 1000)

random_result = random.fit(X_miss_xgb_train, Y_miss_xgb_train)

The error I get is

ValueError: scoring should either be a single string or callable for single metric evaluation or a list/tuple of strings or a dict of scorer name mapped to the callable for multiple metric evaluation. Got 0.3996569468267582 of type

Upvotes: 0

Views: 4184

Answers (1)

Amine Benatmane

Reputation: 1261

RandomizedSearchCV expects its "scoring" parameter to be a single string or callable (for single-metric evaluation), or a list/tuple of strings or a dict mapping scorer names to callables (for multiple-metric evaluation), but a float was passed. The call jaccard_score(y_true=Y_miss_xgb_test, y_pred = preds_miss_xgb, average = 'micro') is evaluated immediately and returns a float score (exactly 0.3996569468267582), so that number, not a scorer, is what reaches RandomizedSearchCV.

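You can specify Jaccard scoring as a string instead. Note that "jaccard_score" is the name of the metric function, not a scorer name. For a multi-class target, use one of the averaged scorer names, for example "jaccard_micro" to match average = 'micro':

random = RandomizedSearchCV(estimator = xg_class, 
                            param_distributions=param_distributions, 
                            scoring = "jaccard_micro",
                            verbose = 1, 
                            n_jobs =-1, 
                            n_iter = 1000)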
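
If you would rather pass a callable than a string, sklearn.metrics.make_scorer can wrap jaccard_score together with its average argument. Below is a minimal sketch of that approach, not your exact setup: the synthetic data, the GradientBoostingClassifier stand-in for XGBClassifier, and the small search space are all assumptions chosen so the snippet runs with scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import jaccard_score, make_scorer
from sklearn.model_selection import RandomizedSearchCV

# Synthetic 3-class data (assumption: stands in for the asker's training set).
X, y = make_classification(n_samples=150, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Wrap the metric instead of calling it: the search calls this scorer
# on every validation fold with that fold's predictions.
micro_jaccard = make_scorer(jaccard_score, average="micro")

# Hypothetical search space; the keys must be parameters of the estimator
# being tuned (here GradientBoostingClassifier, used as a stand-in).
param_distributions = {
    "max_depth": [2, 3, 4],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
}

search = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(n_estimators=20, random_state=0),
    param_distributions=param_distributions,
    scoring=micro_jaccard,
    n_iter=4,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

The point is that "scoring" receives something the search can call later, rather than the result of a metric computed once on a fixed test set. The same make_scorer object should work with xg_class in place of the stand-in estimator, provided param_distributions only contains parameters that estimator accepts.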

Upvotes: 2
