Reputation: 296
I want to fit a time series model using xgboost in R, and I want to use only the last observation to test the model (in a rolling-window forecast there will be more test points in total). But when the test data contains only a single value, I get the error: Error in xgb.DMatrix(data = X[n, ], label = y[n]) : xgb.DMatrix does not support construction from double. Is it possible to do this, or do I need a minimum of 2 test points?
Reproducible example:
library(xgboost)
n = 1000
X = cbind(runif(n,0,20), runif(n,0,20))
y = X %*% c(2,3) + rnorm(n,0,0.1)
train = xgb.DMatrix(data = X[-n,],
                    label = y[-n])
test = xgb.DMatrix(data = X[n,],
                   label = y[n]) # error here, y[.] has 1 value
test2 = xgb.DMatrix(data = X[(n-1):n,],
                    label = y[(n-1):n]) # works here, y[.] has 2 values
There's another post here that addresses a similar issue; however, it refers to the predict() function, whereas I am asking about the test data that will later go into the watchlist argument of xgboost and be used, e.g., for early stopping.
Upvotes: 1
Views: 508
Reputation: 6222
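The problem here is the subset operation on the matrix: indexing with a single row drops the matrix dimensions and returns a plain numeric vector, which xgb.DMatrix cannot be constructed from. See: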
class(X[n, ])
# [1] "numeric"
class(X[n,, drop = FALSE])
# [1] "matrix" "array"
Use X[n,, drop = FALSE] to get the test sample as a one-row matrix:
test = xgb.DMatrix(data = X[n,, drop = FALSE], label = y[n])
xgb.model <- xgboost(data = train, nrounds = 15)
predict(xgb.model, test)
# [1] 62.28553
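Since the question is about the watchlist and early stopping rather than predict(), here is a minimal sketch of how the one-row test set might be used in a rolling-window loop. It is not a definitive implementation. It assumes an xgboost release in which xgb.train() accepts the watchlist argument, as in the question; newer releases may name this argument differently. The seed, the three forecast origins, the "reg:squarederror" objective, and the nrounds and early_stopping_rounds values are illustrative choices that do not come from the original post.

```r
library(xgboost)

set.seed(42)  # assumption: seed added only so the sketch is repeatable
n <- 1000
X <- cbind(runif(n, 0, 20), runif(n, 0, 20))
y <- as.numeric(X %*% c(2, 3) + rnorm(n, 0, 0.1))

# Illustrative choice: forecast the last 3 observations, one at a time.
origins <- (n - 2):n
preds <- numeric(length(origins))

for (i in seq_along(origins)) {
  origin <- origins[i]
  train_idx <- seq_len(origin - 1)  # everything before the test point

  dtrain <- xgb.DMatrix(data = X[train_idx, , drop = FALSE],
                        label = y[train_idx])
  # drop = FALSE keeps the single test row as a 1 x 2 matrix
  dtest <- xgb.DMatrix(data = X[origin, , drop = FALSE],
                       label = y[origin])

  fit <- xgb.train(
    params = list(objective = "reg:squarederror"),
    data = dtrain,
    nrounds = 50,
    # the last watchlist entry is the one monitored for early stopping
    watchlist = list(train = dtrain, test = dtest),
    early_stopping_rounds = 5,
    verbose = 0
  )
  preds[i] <- predict(fit, dtest)
}

# One-step-ahead forecasts next to the observed values
cbind(observed = y[origins], predicted = preds)
```

Note that early stopping on a single test observation is a noisy criterion, so the chosen number of rounds can vary a lot from one window to the next.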
Upvotes: 2