salix_august

Reputation: 75

If...else error 'the condition has length > 1...' with multiple operations at each statement

I have a series of if statements in a function. It looks like this:

my_func <- function(data, selection) {

 if (selection == 'p+c') {
 predictors = 'chicago'
 preds <- data
}
else if (selection== 'p') { 
  predictors = 'new_york'
  preds <- data %>% dplyr::select(-c(region, sale))
}
else if (selection == 'c') {
  predictors = 'california'
  preds <- data %>% dplyr::select(region, sale)
} 
# then the function does something else with predictors and preds, 
#  and returns a dataframe  
}

my_func(my_data, selection = 'p')

I keep getting the warning that the condition has length > 1 and only the first element will be used. Oddly, nothing actually breaks (everything works as expected), but I would still rather fix the problem.

I read that this is a problem with vectorization, but I don't know how to overcome this.

I already tried replacing the if/else with ifelse (as suggested in other posts) but this did not work, maybe because I do more than one operation at each if statement. I did this:

 ifelse (selection == 'p+c') {
 predictors = 'chicago'
 preds <- data
}
ifelse (selection== 'p') { 
  predictors = 'new_york'
  preds <- data %>% dplyr::select(-c(region, sale))
}
ifelse (selection == 'c') {
  predictors = 'california'
  preds <- data %>% dplyr::select(region, sale)
}

Upvotes: 0

Views: 282

Answers (2)

Tech Commodities

Reputation: 1959

You have two questions here. The message

the condition has length > 1...

arises because if() is not vectorised: it expects a single TRUE or FALSE as its condition. I assume selection holds more than one value when the function is called.
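To see where the message comes from, here is a minimal sketch, assuming selection is a character vector with two elements (the vector and the helper name check_scalar are made up for illustration). The comparison is vectorised, so it hands if() one logical value per element. In R 4.2.0 and later such a condition is an error rather than a warning.

```r
# Hypothetical input: a selection vector with more than one element.
selection <- c("p+c", "c")

# `==` is vectorised, so the condition has one element per input value.
cond <- selection == "p"
length(cond)

# Hypothetical guard: fail early with a clear message when selection
# is not a single value, instead of relying on if() to complain.
check_scalar <- function(selection) {
  if (length(selection) != 1L) {
    stop("`selection` must be a single string, got length ", length(selection))
  }
  invisible(selection)
}

# A scalar passes the guard and is returned invisibly.
ok <- check_scalar("p")
```

Calling check_scalar(selection) with the two-element vector above would stop with the custom message.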

ifelse() is most useful when you have exactly two options. For multiple options, a chain of else if () statements is a decent choice.

I can't check this without the data, but an else if solution would be:

 if (selection == 'p+c') {
   predictors = 'chicago'
   preds <- data
 } else if(selection == 'p') { 
   predictors = 'new_york'
   preds <- data %>% dplyr::select(-c(region, sale))
 } else if (selection == 'c') {
   predictors = 'california'
   preds <- data %>% dplyr::select(region, sale)
 } else { # Good practice to capture errors safely
   stop("Selection not found")
 }

If you need to keep selection as a vector, e.g. selection <- c("p+c", "c"), you can wrap the above statement in a function and pass it to one of the apply() family of functions, here sapply(), e.g.

checkFunction <- function(selection) {
    if(selection == 'p+c') {
      predictors = 'chicago'
      preds <- data
    } else if(selection == 'p') { 
      predictors = 'new_york'
      preds <- data %>% dplyr::select(-c(region, sale))
    } else if (selection == 'c') {
      predictors = 'california'
      preds <- data %>% dplyr::select(region, sale)
    } else { # Good practice to capture errors safely
      stop("Selection not found")
    }
  
  return(list(predictors, preds))
}
   
output <- sapply(selection, checkFunction)

> output
     p+c       c
[1,] "chicago" "california"
[2,] tbl_df,3  tbl_df,2

output[,1]

[[1]]
[1] "chicago"

[[2]]
# A tibble: 5 x 3
  region  sale other
   <int> <int> <int>
1      1     2     3
2      2     3     4
3      3     4     5
4      4     5     6
5      5     6     7
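If the matrix-of-lists layout that sapply() returns is awkward to index, one alternative is lapply() with named list elements. This is only a sketch under assumptions: the sample data frame is invented to match the columns printed above, and base R subsetting stands in for dplyr::select() so that the example runs without extra packages.

```r
# Hypothetical sample data with the same columns as the printed tibble.
data <- data.frame(region = 1:5, sale = 2:6, other = 3:7)

checkFunction <- function(selection) {
  keep <- c("region", "sale")
  if (selection == "p+c") {
    predictors <- "chicago"
    preds <- data
  } else if (selection == "p") {
    predictors <- "new_york"
    preds <- data[, setdiff(names(data), keep), drop = FALSE]
  } else if (selection == "c") {
    predictors <- "california"
    preds <- data[, keep, drop = FALSE]
  } else {
    stop("Selection not found")
  }
  # Named elements make the result easier to index than list(predictors, preds).
  list(predictors = predictors, preds = preds)
}

selection <- c("p+c", "c")

# setNames() labels each result with the selection value it came from.
output <- lapply(setNames(selection, selection), checkFunction)
```

Each result can then be reached by name, e.g. output[["c"]]$predictors or output[["p+c"]]$preds.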

Upvotes: 1

John Coleman

Reputation: 51998

ifelse is a function, so you need to assign its result to your variables (rather than placing the assignments inside the function call itself). Without a reproducible example (which you neglected to provide -- see How to make a great R reproducible example?) it is hard to be sure that the following code does exactly what you want, but something like:

predictors <- ifelse(selection == 'p+c',
    'chicago',
    ifelse(selection == 'p',
        'new_york',   # spelled as in the question
        ifelse(selection == 'c',
            'california',
            NA_character_)))   # a real missing value, not the string 'NA'

(with similar code for preds).
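One caveat worth checking: ifelse() returns a value with the same shape as its test, so it suits character labels like predictors better than whole data frames like preds. If selection is always a single string, a possible sketch uses switch() for both values. The function name pick and the sample data are hypothetical, and base R subsetting replaces dplyr::select() to keep the example self-contained.

```r
# Hypothetical sample data with the column names used in the question.
my_data <- data.frame(region = 1:3, sale = 2:4, other = 3:5)

pick <- function(data, selection) {
  # switch() needs a single string, so fail early otherwise.
  stopifnot(is.character(selection), length(selection) == 1L)

  # The unnamed last argument is the default; it is only evaluated
  # when no name matches, so stop() fires only for unknown selections.
  predictors <- switch(selection,
    "p+c" = "chicago",
    "p"   = "new_york",
    "c"   = "california",
    stop("Selection not found: ", selection)
  )

  keep <- c("region", "sale")
  preds <- switch(selection,
    "p+c" = data,
    "p"   = data[, setdiff(names(data), keep), drop = FALSE],
    "c"   = data[, keep, drop = FALSE]
  )

  list(predictors = predictors, preds = preds)
}

res <- pick(my_data, "p")
```

Here res$predictors holds the label and res$preds the reduced data frame, so the rest of the function can use both.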

Upvotes: 1
