PaulG

Reputation: 296

Xgboost: using single test observation?

I want to fit a time series model using xgboost in R, and I want to use only the last observation to test the model (this is for a rolling window forecast, so there will be more test points in total). But when the test data contains only a single observation, I get the error:

Error in xgb.DMatrix(data = X[n, ], label = y[n]) : xgb.DMatrix does not support construction from double

Is it possible to do this, or do I need a minimum of 2 test points?

Reproducible example:

library(xgboost)
n = 1000
X = cbind(runif(n,0,20), runif(n,0,20))
y = X %*% c(2,3) + rnorm(n,0,0.1)

train = xgb.DMatrix(data  = X[-n,],
                    label = y[-n])

test = xgb.DMatrix(data   = X[n,],
                    label = y[n]) # error here, y[.] has 1 value

test2 = xgb.DMatrix(data   = X[(n-1):n,],
                    label = y[(n-1):n]) # works here, y[.] has 2 values

There's another post here that addresses a similar issue; however, it refers to the predict() function, whereas I am asking about the test data that will later go into the watchlist argument of xgboost and be used, e.g., for early stopping.

Upvotes: 1

Views: 508

Answers (1)

kangaroo_cliff

Reputation: 6222

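The problem is the matrix subsetting: indexing a matrix with a single row index drops the dimensions and returns a plain numeric vector, which xgb.DMatrix cannot be constructed from. Compare: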

class(X[n, ])
# [1] "numeric"

class(X[n,, drop = FALSE])
# [1] "matrix" "array"
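The same behaviour can be checked in base R without xgboost. This is a minimal sketch on a toy matrix (X_demo, row_vec, row_mat and rebuilt are illustrative names, not part of the question's code). It also shows that rebuilding the row with matrix(..., nrow = 1) should give an equivalent one-row matrix.

```r
# Toy 3 x 2 matrix standing in for the feature matrix X (illustrative only)
X_demo <- matrix(1:6, nrow = 3, ncol = 2)

# Single-row subset: R drops the dim attribute and returns a plain vector
row_vec <- X_demo[3, ]

# drop = FALSE keeps the dim attribute, so the result stays a 1 x 2 matrix
row_mat <- X_demo[3, , drop = FALSE]

# Alternative: rebuild a one-row matrix from the dropped vector
rebuilt <- matrix(row_vec, nrow = 1)

print(dim(row_vec))  # NULL, the dimensions were dropped
print(dim(row_mat))  # 1 2
print(dim(rebuilt))  # 1 2
```

Either form keeps the single observation as a matrix row; drop = FALSE is usually the shorter choice because it leaves the rest of the indexing expression unchanged.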

Use X[n, , drop = FALSE] to get the test sample as a one-row matrix:

test = xgb.DMatrix(data   = X[n,, drop = FALSE], label = y[n])

xgb.model <- xgboost(data = train, nrounds = 15)
predict(xgb.model, test)
# [1] 62.28553
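Since the question is about passing the single test observation to the watchlist for early stopping, here is a hedged sketch of that step. It assumes an xgboost 1.x release for R, where xgb.train takes a watchlist argument (newer releases appear to call this argument evals, so adjust accordingly). The objective, nrounds and early_stopping_rounds values are arbitrary choices for illustration, not taken from the question.

```r
library(xgboost)

set.seed(1)  # seed added here only to make the sketch repeatable
n <- 1000
X <- cbind(runif(n, 0, 20), runif(n, 0, 20))
y <- as.numeric(X %*% c(2, 3) + rnorm(n, 0, 0.1))

# Training data: everything except the last observation
train <- xgb.DMatrix(data = X[-n, ], label = y[-n])

# Test data: the last observation, kept as a 1 x 2 matrix via drop = FALSE
test <- xgb.DMatrix(data = X[n, , drop = FALSE], label = y[n])

# Assumption: xgboost 1.x argument names (watchlist, early_stopping_rounds)
fit <- xgb.train(
  params = list(objective = "reg:squarederror"),
  data = train,
  nrounds = 50,
  watchlist = list(train = train, test = test),  # last entry drives early stopping
  early_stopping_rounds = 5,
  verbose = 0
)

pred <- predict(fit, test)
print(length(pred))  # one prediction for the single test row
```

Note that with a single test point the early stopping metric is computed on one observation only, so it can be noisy from one window to the next.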

Upvotes: 2
