Ellen Xu

Reputation: 101

xgb.DMatrix Error: The length of labels must equal to the number of rows in the input data

I am using xgboost in R.

I can create the xgb.DMatrix without problems from the full matrix, but when I reduce the number of columns in the matrix, I receive an error.

This works:

> dim(ctt1)

[1] 6401 5901

> xgbmat1 <- xgb.DMatrix(
     Matrix(data.matrix(ctt1)),
     label = as.matrix(as.numeric(data$V2)) - 1
  )

This does not:

> dim(ctt1[,nr])

[1] 6401 1048

xgbmat1 <- xgb.DMatrix(
    Matrix(data.matrix(ctt1[,nr])),
    label = as.matrix(as.numeric(data$V2)) - 1)

Error in xgb.setinfo(dmat, names(p), p[[1]]) : The length of labels must equal to the number of rows in the input data

Upvotes: 4

Views: 12809

Answers (5)

Gabriel Idalino

Reputation: 76

Before splitting your data, you need to turn it into a data frame. For example:

data <- read.csv(...)

data = as.data.frame(data)

Now you can split it into training and test data and pass those to "sparse.model.matrix" and "xgb.DMatrix".

Upvotes: 0

Abdul Haseeb

Reputation: 47

The proper way to create the DMatrix is like this:

    xgtrain <- xgb.DMatrix(data = as.matrix(X_train[,-5]), label = X_train$item_cnt_month)

Drop the label column from the data argument, and take the label from the same data set. In my case item_cnt_month is column 5, so I drop it in the call with X_train[,-5] and refer to the same data set for the label column.
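As a minimal sketch of the same idea, the label column can be dropped by name instead of by position, followed by a check that the row count and the label length agree before anything is passed to xgb.DMatrix. The data frame below is made-up toy data; only the column name item_cnt_month comes from the answer above.

```r
# Toy stand-in for X_train (hypothetical values).
X_train <- data.frame(a = 1:3, b = 4:6, item_cnt_month = c(0, 1, 2))

# Features: every column except the label, selected by name.
feature_cols <- setdiff(names(X_train), "item_cnt_month")
features <- as.matrix(X_train[, feature_cols, drop = FALSE])

# Label: taken from the same data frame, so the rows line up.
label <- X_train$item_cnt_month

# The condition the error message complains about.
stopifnot(nrow(features) == length(label))
```

Selecting by name avoids silently dropping the wrong column if the column order changes.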

Upvotes: 0

Andrii

Reputation: 3043

In my case, I fixed this error by changing the label assignment:

labels <- df_train$target_feature

Upvotes: 5

Vadim Khotilovich

Reputation: 793

For sparse matrices, the xgboost R interface uses the CSC-format creation method. The problem currently is that this method infers the number of rows from the stored (non-zero) entries, so any completely empty rows at the end are not counted. A similar loss of completely empty trailing columns can happen with the CSR sparse format. For more details, see xgboost issue #1223 and the Wikipedia article on sparse matrix formats.
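To illustrate the point about CSC storage, here is a small sketch using the Matrix package. It is not xgboost's actual code; it only shows that the stored row indices of a CSC matrix say nothing about trailing all-zero rows, so a row count inferred from them comes up short. The toy matrix is an assumption made up for the example.

```r
library(Matrix)

# 4 x 3 sparse matrix (CSC) whose last row is entirely zero.
m <- sparseMatrix(i = c(1, 3, 2, 1),
                  j = c(1, 1, 2, 3),
                  x = c(1, 2, 3, 4),
                  dims = c(4, 3))

# CSC stores 0-based row indices of the non-zero entries in slot i.
m@i                                # 0 2 1 0

# Row count as it would be inferred from the stored entries alone.
inferred_rows <- max(m@i) + 1L     # 3

# Row count the matrix actually has.
actual_rows <- nrow(m)             # 4
```

A label vector of length 4 would match actual_rows but not inferred_rows, which is the mismatch reported in the error.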

Upvotes: 2

Ellen Xu

Reputation: 101

It turns out that after removing some columns, some rows contained only 0s, and those rows could not contribute to the model.
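One way to check for this, sketched in base R on made-up toy data: find the rows that are entirely zero after the column subset, then drop them from both the features and the label so the two stay the same length. Dropping the rows is one possible workaround, not necessarily what was done here.

```r
# Toy stand-ins for ctt1[, nr] and the label vector (hypothetical values).
x <- matrix(c(1, 0, 0,
              2, 0, 0), nrow = 3)   # rows: (1, 2), (0, 0), (0, 0)
label <- c(0, 1, 1)

# Rows with no non-zero entry.
zero_rows <- which(rowSums(x != 0) == 0)   # 2 3

# Keep only rows with at least one non-zero value, in data and label alike.
keep <- setdiff(seq_len(nrow(x)), zero_rows)
x_kept <- x[keep, , drop = FALSE]
label_kept <- label[keep]

# Rows and labels must still agree before building the DMatrix.
stopifnot(nrow(x_kept) == length(label_kept))
```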

Upvotes: 2
