exAres

Reputation: 4926

rfe in R's caret package gives error: task 1 failed - "argument 1 is not a vector"

I have a training_predictors set with 56 columns, all of which are numeric. training_labels is a factor vector of 0 and 1.

I am using the following vector of subset sizes to test.

subset_sizes <- c(1:5, 10, 15, 20, 25)

The following is my modified version of the rfFuncs list of functions.

rfRFE <- list(summary = defaultSummary, 
              fit = function(x, y, first, last, ...) {
                  library(randomForest)
                  randomForest(x, y, importance = first, ...)
              }, 
              pred = function(object, x) predict(object, x), 
              rank = function(object, x, y) {
                  vimp <- varImp(object)
                  vimp <- vimp[order(vimp$Overall, decreasing = TRUE),,drop = FALSE]
                  vimp$var <- rownames(vimp)
                  vimp
              }, 
              selectSize = pickSizeBest, 
              selectVar = pickVars)

I have declared the control function as:

rfeCtrl <- rfeControl(functions = rfRFE, 
                      method = "cv", 
                      number = 10, 
                      verbose = TRUE)

But when I run the rfe function as shown below,

rfProfile <- rfe(training_predictors, 
                 training_labels, 
                 sizes = subset_sizes, 
                 rfeControl = rfeCtrl)

I get this error:

Error in { : task 1 failed - "argument 1 is not a vector"

I also tried changing the vector subset_sizes, but still no luck. What am I doing wrong?

Update: I ran these steps one by one, and the problem seems to be in the rank function. But I still cannot figure out what is wrong.

Update: I found the problem. The result of varImp in the rank function does not contain an $Overall column; instead it contains columns named 0 and 1. Why is that? What do 0 and 1 signify (the values in both columns are exactly the same, by the way)? Also, how can I make varImp return an $Overall column? [As a temporary solution, I am creating a new $Overall column and attaching it to vimp in the rank function.]
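For reference, the temporary workaround mentioned above could look roughly like the sketch below. It is only an illustration in base R: add_overall is a hypothetical helper name, the importance values are made up, and it assumes the per-class columns are named after levels(y). It averages those columns into Overall, then sorts and adds the var column the way the rank function does.

```r
# Hypothetical helper (not part of caret): add an Overall column to a
# per-class importance table, then sort it as the rank function expects.
add_overall <- function(vimp, y) {
  # Columns of vimp that are named after the outcome's factor levels.
  class_cols <- intersect(levels(y), colnames(vimp))
  if (!"Overall" %in% colnames(vimp) && length(class_cols) > 0) {
    # Average the per-class importance columns row by row.
    vimp$Overall <- rowMeans(vimp[, class_cols, drop = FALSE])
  }
  vimp <- vimp[order(vimp$Overall, decreasing = TRUE), , drop = FALSE]
  vimp$var <- rownames(vimp)
  vimp
}

# Made-up importance values shaped like the table described above:
# two identical columns named "0" and "1", one row per predictor.
y <- factor(c(0, 1, 0, 1))
vimp <- data.frame(`0` = c(0.2, 0.9, 0.5),
                   `1` = c(0.2, 0.9, 0.5),
                   row.names = c("a", "b", "c"),
                   check.names = FALSE)
ranked <- add_overall(vimp, y)
```

Inside rank, the same helper would be applied to the result of varImp(object) in place of the ordering and var lines.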

Upvotes: 5

Views: 2828

Answers (2)

Md Ismail Hossain

Reputation: 1

I ran into the same issue when fitting a logistic regression model with rfe in caret, and found a solution. It is as follows:

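# glmFuncs is assumed to be a list of rfe helper functions defined beforehand.
glmFuncs$rank <- function(object, x, y) {
  # Attach dplyr so that %>%, mutate() and arrange() are available;
  # loadNamespace() alone does not put them on the search path.
  library(dplyr)

  # Assumes varImp() returns an object with an $importance data frame
  # (as it does for models fit with train()).
  vimp <- varImp(object, scale = FALSE)

  # Keep the variable names in a column, then sort by decreasing importance.
  vimp <- vimp$importance %>%
    mutate(var = row.names(.)) %>%
    arrange(-Overall)

  vimp
}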

Upvotes: 0

topepo

Reputation: 14316

Using 0 and 1 as factor levels is problematic since those are not valid R column names. In your other SO post you probably would have received a message about using these as factor levels for your output.

Try using a factor outcome with some more informative levels that can be translated into valid R column names (for class probabilities).
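As a minimal sketch of that suggestion in base R: the labels "neg" and "pos" are only illustrative choices, and the short training_labels vector is a made-up stand-in for the real outcome. make.names shows whether the levels are already valid R names. Whether recoding alone resolves the rfe error is not confirmed here.

```r
# Made-up stand-in for the 0/1 outcome described in the question.
training_labels <- factor(c(0, 1, 1, 0, 1))

# Levels are valid column names only if make.names() leaves them unchanged.
valid_before <- all(levels(training_labels) ==
                      make.names(levels(training_labels)))

# Recode with descriptive labels ("neg"/"pos" are illustrative choices).
training_labels <- factor(training_labels,
                          levels = c("0", "1"),
                          labels = c("neg", "pos"))

valid_after <- all(levels(training_labels) ==
                     make.names(levels(training_labels)))
```

Here valid_before is FALSE because make.names("0") gives "X0", while the recoded levels pass the check.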

Upvotes: 4
