Reputation: 89
Getting this error while implementing XGBoost for the Titanic problem:
Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
[03:26:03] amalgamation/../src/objective/multiclass_obj.cc:75: Check failed: label_error >= 0 && label_error < nclass SoftmaxMultiClassObj: label must be in [0, num_class), num_class=2 but found 2 in label.
Following is my code:
# Parameter: number of classes
nc <- length(unique(train_label))
nc

xgb_params <- list("objective" = "multi:softprob",
                   "eval_metric" = "mlogloss",
                   "num_class" = nc)

watchlist <- list(train = train_matix, test = test_matix)

# XGB model
bst_model <- xgb.train(params = xgb_params, data = train_matix, nrounds = 100, watchlist = watchlist)
How can I resolve this?
Upvotes: 3
Views: 21137
Reputation: 31
Try xgb.XGBClassifier(use_label_encoder=False) instead of the plain XGBClassifier().
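A minimal sketch of that suggestion, assuming the scikit-learn wrapper from an xgboost release that still accepts use_label_encoder (roughly 1.3 to 1.7) and feature/label arrays X and y whose labels are already integers in [0, num_class):

from xgboost import XGBClassifier

# use_label_encoder=False turns off the deprecated internal label encoder;
# the labels then have to be 0-based integers already.
clf = XGBClassifier(use_label_encoder=False, eval_metric="mlogloss")
clf.fit(X, y)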
Upvotes: 0
Reputation: 61
I solved it like this. My class labels were -1, 0 and 1, so num_class=3. I had to increment the class labels by 1 in order to be compatible with the range [0, 3). Note that 3 is excluded from this range, so the valid labels are 0, 1 and 2. After the shift my class labels were 0, 1 and 2.
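A minimal sketch of that shift, assuming the labels sit in a numpy array named y (the array and its values are made up here):

import numpy as np

y = np.array([-1, 0, 1, -1, 1])   # original labels
y = y + 1                         # now 0, 1, 2: inside [0, num_class) for num_class=3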
In addition, I changed the code for multi-class classification: the objective was changed to 'multi:softmax' and the 'num_class' parameter was added.
xgb1 = XGBClassifier(
    learning_rate=0.1,
    n_estimators=1000,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='multi:softmax',  # multi-class objective
    nthread=4,
    scale_pos_weight=1,
    seed=27,
    num_class=3,                # number of classes after the label shift
)
In the modelfit() function, 'auc' was replaced with 'merror'.
import xgboost as xgb
from sklearn import metrics

def modelfit(alg, dtrain, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):
    if useTrainCV:
        xgb_param = alg.get_xgb_params()
        # Shift the class labels so they start at 0
        dtrain[target] = dtrain[target] + 1
        xgtrain = xgb.DMatrix(dtrain[predictors].values, label=dtrain[target].values)
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=xgb_param['n_estimators'], nfold=cv_folds,
                          metrics='merror', early_stopping_rounds=early_stopping_rounds)
        alg.set_params(n_estimators=cvresult.shape[0])
        print(cvresult.shape[0])

    # Fit the algorithm on the data
    alg.fit(dtrain[predictors], dtrain[target], eval_metric='merror')

    # Predict on the training set
    dtrain_predictions = alg.predict(dtrain[predictors])

    # Print model report
    print("\nModel Report")
    print("Accuracy : %.4g" % metrics.accuracy_score(dtrain[target].values, dtrain_predictions))
Upvotes: 6
Reputation: 3208
I'm guessing you have a multi-class problem.
Since there's no example data, I can only guess that you should pass nc + 1 instead of nc.
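To see why that helps: XGBoost requires every label to lie in [0, num_class), so num_class must be strictly greater than the largest label value. A minimal Python sketch of both ways out (the same rule applies to the R API; the toy arrays X and y are made up here):

import numpy as np
import xgboost as xgb

X = np.random.rand(6, 3)
y = np.array([1, 2, 1, 2, 1, 2])   # labels as they might come out of an R factor

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "multi:softprob", "eval_metric": "mlogloss"}

# num_class=2 fails here, because label 2 is outside [0, 2):
# xgb.train({**params, "num_class": 2}, dtrain, num_boost_round=5)

# Either raise num_class so it covers the largest label ...
bst = xgb.train({**params, "num_class": 3}, dtrain, num_boost_round=5)

# ... or, for a genuinely two-class problem, shift the labels to start at 0.
dtrain0 = xgb.DMatrix(X, label=y - 1)
bst0 = xgb.train({**params, "num_class": 2}, dtrain0, num_boost_round=5)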
Good luck.
Upvotes: 2