Reputation: 101
I am using xgboost in R.
I can create the xgb.DMatrix fine using a matrix as input, but when I reduce the number of columns in the matrix data, I receive an error.
This works:
> dim(ctt1)
[1] 6401 5901
> xgbmat1 <- xgb.DMatrix(
Matrix(data.matrix(ctt1)),
label = as.matrix(as.numeric(data$V2)) - 1
)
This does not:
> dim(ctt1[,nr])
[1] 6401 1048
xgbmat1 <- xgb.DMatrix(
Matrix(data.matrix(ctt1[,nr])),
label = as.matrix(as.numeric(data$V2)) - 1)
Error in xgb.setinfo(dmat, names(p), p[[1]]) : The length of labels must equal to the number of rows in the input data
Upvotes: 4
Views: 12809
Reputation: 76
Before splitting your data, you need to turn it into a data frame. For example:
data <- read.csv(...)
data = as.data.frame(data)
Now you can set your train data and test data to use in your "sparse.model.matrix" and "xgb.DMatrix".
Upvotes: 0
Reputation: 47
The proper way to create the DMatrix is like this:
xgtrain <- xgb.DMatrix(data = as.matrix(X_train[,-5]), label = X_train$item_cnt_month)
Drop the label column from the data parameter and take the label from the same data set. In my data the label, item_cnt_month, is column five, so I drop it in the call itself and refer to the same data set for the label. That way the data and the label always have the same number of rows.
Upvotes: 0
Reputation: 3043
In my case, I fixed this error by changing how the labels were assigned:
labels <- df_train$target_feature
Upvotes: 5
Reputation: 793
For sparse matrices, the xgboost R interface uses the CSC format creation method. The problem currently is that this method infers the number of rows from the stored non-zero values, so any completely empty (all-zero) rows at the end are not counted. A similar loss of completely empty columns at the end can happen with the CSR sparse format. For more details, see xgboost issue #1223 and the Wikipedia article on sparse matrix formats.
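To see why trailing empty rows can get lost, here is a minimal sketch using only the Matrix package. The toy matrix and the variable names (sp, last_nonempty_row) are made up for illustration and are not from the question. It compares the declared row count with the row count that can be recovered from the stored values alone.

```r
library(Matrix)

# Hypothetical toy data: 4 rows, 3 columns, last row entirely zero.
# sparseMatrix() builds a dgCMatrix, i.e. CSC storage.
sp <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(1, 2, 3),
                   dims = c(4, 3))

# In CSC form, slot @i holds the 0-based row index of each stored value,
# so the last row that contains any value is:
last_nonempty_row <- max(sp@i) + 1L

nrow(sp)            # declared number of rows
last_nonempty_row   # rows recoverable from the stored values alone
```

If the two numbers differ, the matrix ends in all-zero rows, which is the situation described above. A label vector that covers every declared row would then be longer than the row count inferred from the stored values.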
Upvotes: 2
Reputation: 101
It turns out that after removing some columns, some rows contained only 0s and could not contribute to the model.
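One possible way to handle this, sketched with base R on made-up data, is to drop the all-zero rows and the matching labels together so that both stay the same length. X, y and keep are hypothetical names, not from my original code, and whether dropping those rows is acceptable depends on your data.

```r
# Hypothetical toy data: row 3 is all zeros after column selection.
X <- matrix(c(1, 0,
              0, 2,
              0, 0,
              3, 4), ncol = 2, byrow = TRUE)
y <- c(0, 1, 1, 0)

# TRUE for rows that still have at least one non-zero entry.
keep <- rowSums(X != 0) > 0

# Subset the data and the labels with the same index so they stay aligned.
X_kept <- X[keep, , drop = FALSE]
y_kept <- y[keep]

# Sanity check before handing both to xgb.DMatrix(data = ..., label = ...).
stopifnot(nrow(X_kept) == length(y_kept))
```

Checking nrow() against length() before building the DMatrix makes a mismatch show up early, rather than as the error from xgb.setinfo.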
Upvotes: 2