Reputation: 75
I have a series of if statements in a function. It looks like this:
my_func <- function(data, selection) {
  if (selection == 'p+c') {
    predictors = 'chicago'
    preds <- data
  }
  else if (selection == 'p') {
    predictors = 'new_york'
    preds <- data %>% dplyr::select(-c(region, sale))
  }
  else if (selection == 'c') {
    predictors = 'california'
    preds <- data %>% dplyr::select(region, sale)
  }
  # then the function does something else with predictors and preds,
  # and returns a dataframe
}
my_func(my_data, selection = 'p')
I keep getting the warning that the condition has length > 1 and only the first element will be used. Weirdly, it doesn't actually break anything (everything works as expected), but I would still rather fix the problem.
I read that this is a problem with vectorization, but I don't know how to overcome this.
I already tried replacing the if/else with ifelse (as suggested in other posts), but this did not work, maybe because I do more than one operation in each if statement. I did this:
ifelse (selection == 'p+c') {
predictors = 'chicago'
preds <- data
}
ifelse (selection== 'p') {
predictors = 'new_york'
preds <- data %>% dplyr::select(-c(region, sale))
}
ifelse (selection == 'c') {
predictors = 'california'
preds <- data %>% dplyr::select(region, sale)
}
Upvotes: 0
Views: 282
Reputation: 1959
You have two questions here. The message "the condition has length > 1..." arises because if() is not vectorised: it expects a single TRUE or FALSE. I assume selection holds more than one value at the point where the condition is evaluated.
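One way to guard against a multi-valued selection is to validate it before any if() runs. The sketch below assumes the three allowed values from the question; pick_selection is a hypothetical helper name, not part of the original function. Note that from R 4.2.0 onwards a condition of length greater than one is an error rather than a warning, so a guard like this is worth having either way.

```r
# Hypothetical helper: collapse `selection` to exactly one allowed value.
pick_selection <- function(selection = c("p+c", "p", "c")) {
  # match.arg() returns the first choice when the default vector is left
  # untouched, and raises an error for unknown or multi-valued input.
  match.arg(selection)
}

default_choice <- pick_selection()     # untouched default collapses to one value
single_choice  <- pick_selection("c")  # a single valid value passes through

# A length-2 vector is rejected instead of silently using its first element.
res <- tryCatch(pick_selection(c("p", "c")), error = function(e) "error")
```

Calling match.arg() at the top of my_func would turn the silent "first element only" behaviour into an explicit error for bad input.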
ifelse() is most useful when you have exactly two options. For multiple options, nested else if statements are a decent choice. Without the data I can't test this, but a nested else if solution would be:
if (selection == 'p+c') {
  predictors = 'chicago'
  preds <- data
} else if (selection == 'p') {
  predictors = 'new_york'
  preds <- data %>% dplyr::select(-c(region, sale))
} else if (selection == 'c') {
  predictors = 'california'
  preds <- data %>% dplyr::select(region, sale)
} else { # Good practice to capture errors safely
  stop("Selection not found")
}
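The same branching can also be written with switch(), which some people find easier to scan than a chain of else if. This is only a sketch: toy and choose_preds are made-up names, the column names are taken from the question, and base R subsetting stands in for dplyr::select so the example runs without extra packages.

```r
# Toy data standing in for the asker's data frame (column names assumed).
toy <- data.frame(region = 1:3, sale = 2:4, other = 3:5)

choose_preds <- function(data, selection) {
  stopifnot(length(selection) == 1)  # if() and switch() need a single value
  key_cols <- c("region", "sale")
  switch(selection,
    "p+c" = list(predictors = "chicago", preds = data),
    "p"   = list(predictors = "new_york",
                 preds = data[, setdiff(names(data), key_cols), drop = FALSE]),
    "c"   = list(predictors = "california",
                 preds = data[, key_cols, drop = FALSE]),
    # The unnamed last argument is the default branch.
    stop("Selection not found: ", selection)
  )
}

res <- choose_preds(toy, "p")
```

Returning both values in a named list avoids having to assign predictors and preds separately inside each branch.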
If you need to keep selection as a vector, e.g. selection <- c("p+c", "c"), then wrap the statement above in a function and apply it to each element with sapply(), e.g.
# Note: `data` is not an argument here, so it is looked up in the
# calling environment and must already exist there.
checkFunction <- function(selection) {
  if (selection == 'p+c') {
    predictors = 'chicago'
    preds <- data
  } else if (selection == 'p') {
    predictors = 'new_york'
    preds <- data %>% dplyr::select(-c(region, sale))
  } else if (selection == 'c') {
    predictors = 'california'
    preds <- data %>% dplyr::select(region, sale)
  } else { # Good practice to capture errors safely
    stop("Selection not found")
  }
  return(list(predictors, preds))
}
# Applies checkFunction to each element of selection in turn
output <- sapply(selection, checkFunction)
> output
     p+c       c
[1,] "chicago" "california"
[2,] tbl_df,3  tbl_df,2
> output[,1]
[[1]]
[1] "chicago"
[[2]]
# A tibble: 5 x 3
  region  sale other
   <int> <int> <int>
1      1     2     3
2      2     3     4
3      3     4     5
4      4     5     6
5      5     6     7
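Because sapply() simplifies the results into a matrix of list cells, indexing the output can be awkward. A possible alternative is lapply(), which always returns a plain list. The sketch below is an assumption-laden variant, not the code above: pick is a hypothetical stand-in for checkFunction that takes the data as an explicit argument, and toy is made-up data with the column names from the question.

```r
toy <- data.frame(region = 1:3, sale = 2:4, other = 3:5)

# Hypothetical stand-in for checkFunction, taking the data explicitly.
pick <- function(selection, data) {
  if (selection == "p+c") {
    list(predictors = "chicago", preds = data)
  } else if (selection == "c") {
    list(predictors = "california", preds = data[, c("region", "sale")])
  } else {
    stop("Selection not found")
  }
}

selection <- c("p+c", "c")
# lapply() always returns a list; setNames() labels each element
# with the selection value that produced it.
output <- lapply(setNames(selection, selection), pick, data = toy)

output[["c"]]$predictors       # label for the "c" selection
ncol(output[["p+c"]]$preds)    # number of columns kept for "p+c"
```

Each result can then be reached by name, e.g. output[["c"]]$preds, instead of by matrix position.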
Upvotes: 1
Reputation: 51998
ifelse() is a function, so you need to assign its result to your variables (rather than placing the assignments inside the function call itself). Without a reproducible example (which you neglected to provide; see "How to make a great R reproducible example?") it is hard to be sure that the following code does exactly what you want, but something like:
predictors <- ifelse(selection == 'p+c',
                     'chicago',
                     ifelse(selection == 'p',
                            'new_york',  # spelled as in the question
                            ifelse(selection == 'c',
                                   'california',
                                   NA_character_)))  # a real missing value, not the string 'NA'
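(preds needs different handling: ifelse() returns a vector shaped like its test, so it cannot return a whole data frame. Use if/else or switch() for that part.)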
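To illustrate how ifelse() behaves on a vector, here is a small sketch that uses the labels from the question and a made-up selection vector containing one unknown value ("x"). It also shows a named lookup vector as a flatter alternative to nesting; lookup is an illustrative name, not something from the original post.

```r
# Made-up input with one value ("x") that has no matching label.
selection <- c("p+c", "c", "p", "x")

# Nested ifelse(): evaluated element by element, returns a character vector.
predictors <- ifelse(selection == "p+c", "chicago",
              ifelse(selection == "p",   "new_york",
              ifelse(selection == "c",   "california", NA_character_)))

# A named lookup vector gives the same mapping without nesting;
# values with no matching name come back as NA.
lookup <- c("p+c" = "chicago", "p" = "new_york", "c" = "california")
predictors2 <- unname(lookup[selection])
```

Both approaches produce one label per element of selection, which is the situation ifelse() is designed for.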
Upvotes: 1