Mridul Garg

Reputation: 497

Troubleshooting XGBoost in R

I have a dataset with 25000 rows and 761 columns, one of which is a binary response column with the values '-1' and '1'. When I run xgboost on it with the call below, I keep getting the error shown after it:

xg_base <- xgboost(data = features, label = output, objective = "binary:logistic",
                   eta = 1, nthread = 2, nrounds = 10,
                   verbose = 1, print.every.n = 5)


Error in xgb.iter.update(bst$handle, dtrain, i - 1, obj) : 
label must be in [0,1] for logistic regression

I changed the levels of my response using the following command:

levels(output)[levels(output)=="-1"] <- "0"

I still get the same error and am not sure what the issue is. One important point is that this is a rare-event detection problem: positive cases make up 1% of the total observations. Could that be the reason I'm getting the error?

Upvotes: 2

Views: 7505

Answers (2)

arun

Reputation: 11013

In case this helps someone converting a factor variable with levels 0 and 1 into labels for XGBoost: you need to subtract 1 after converting to integer (or numeric), because as.integer() returns the factor's internal level codes, which start at 1:

> f <- as.factor(c(0, 1, 1, 0))

# XGBoost will not accept this for label
> as.integer(f)
[1] 1 2 2 1

# Correct label
> as.integer(f) - 1
[1] 0 1 1 0
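
One caveat worth sketching: subtracting 1 relies on the levels being stored in the order 0, 1. The example below is a minimal illustration, not part of the original answer; the factor `g` with a reversed level order is a hypothetical case made up to show the difference. Converting through as.character() reads the level names themselves, so it does not depend on the order.

```r
# Factor with the default (sorted) level order: "0", "1"
f <- factor(c(0, 1, 1, 0))
codes_f  <- as.integer(f) - 1            # 0 1 1 0
values_f <- as.numeric(as.character(f))  # 0 1 1 0

# Hypothetical factor whose levels are stored as "1", "0"
g <- factor(c(0, 1, 1, 0), levels = c(1, 0))
codes_g  <- as.integer(g) - 1            # 1 0 0 1 (labels inverted)
values_g <- as.numeric(as.character(g))  # 0 1 1 0

print(codes_g)
print(values_g)
```

If the level order is known to be 0 then 1, both approaches give the same labels.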

Upvotes: 11

David

Reputation: 51

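After you change the -1's to 0's, convert output from a factor to numeric. Renaming the levels alone leaves output as a factor, so the values passed as the label are still the internal codes 1 and 2: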

output <- as.numeric(levels(output))[output]

I don't think the fact that this is a rare event detection problem is related to the error.
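
Putting the steps from the question and this answer together, a minimal sketch might look like the following. The five-element `output` vector is a hypothetical stand-in for the asker's response column, and `label` is a name introduced here for illustration. The check at the end is one way to confirm the values are 0/1 before calling xgboost().

```r
# Hypothetical stand-in for the response: a factor with levels "-1" and "1"
output <- factor(c(-1, 1, -1, -1, 1))

# Rename the level as in the question; output is still a factor afterwards
levels(output)[levels(output) == "-1"] <- "0"

# Convert through the level names rather than the internal integer codes
label <- as.numeric(levels(output))[output]

# Sanity check: the label must be numeric and contain only 0 and 1
stopifnot(is.numeric(label), all(label %in% c(0, 1)))
print(label)  # [1] 0 1 0 0 1
```

The resulting `label` vector is what would be passed as the label argument in place of the factor.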

Upvotes: 5
