Reputation: 479
I am trying to write a function in R that takes several variables from a data frame as input and returns a vector of results.
Based on the post "How can create a function using variables in a dataframe", I wrote the function below.
However, I receive this warning message:
the condition has length > 1 and only the first element will be used
I tried to fix it by using sapply inside the function, following the post below, but I did not succeed. https://datascience.stackexchange.com/questions/33351/what-is-the-problem-with-the-condition-has-length-1-and-only-the-first-elemen
# a data frame with columns a, x, y and z:
myData <- data.frame(a = 1:5,
                     x = 2:6,
                     y = 11:15,
                     z = 3:7)
myFun3 <- function(df, col1 = "x", col2 = "y", col3 = "z"){
  result <- 0
  if(df[,col1] == 2){
    result <- result + 10
  }
  if(df[,col2] == 11){
    result <- result + 100
  }
  return(result)
}
myFun3(myData)
> Warning messages:
> 1: In if (df[, col1] == 2) { :
> the condition has length > 1 and only the first element will be used
> 2: In if (df[, col2] == 11) { :
> the condition has length > 1 and only the first element will be used
Can someone explain to me how I can apply the function over all rows of the data frame? Thanks a lot!
Upvotes: 1
Views: 1277
Reputation: 887108
We need ifelse instead of if/else, as if/else is not vectorized
myFun3 <- function(df, col1 = "x", col2 = "y", col3 = "z"){
  result <- numeric(nrow(df))
  ifelse(df[[col1]] == 2, result + 10,
         ifelse(df[[col2]] == 11, result + 100, result))
}
myFun3(myData)
#[1] 10 0 0 0 0
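As a minimal sketch of the difference (the vector v is made up for illustration): ifelse tests every element, while if expects a single TRUE/FALSE. Note that from R 4.2.0 onward, passing a condition of length greater than one to if is an error rather than a warning.

```r
v <- c(2, 3, 2)   # hypothetical input vector, not from the question

# ifelse() tests each element and returns a vector of the same length
res <- ifelse(v == 2, 10, 0)
res
# [1] 10  0 10

# if() needs one TRUE/FALSE; collapse a vector condition with any() or all()
# when a single overall answer is what is wanted
msg <- if (any(v == 2)) "at least one 2" else "no 2"
msg
# [1] "at least one 2"
```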
Or the OP's code can be Vectorized after making some changes, i.e. replacing the second if with an else if ladder
myFun3 <- Vectorize(function(x, y){
  result <- 0
  if(x == 2) {
    result <- result + 10
  } else if(y == 11){
    result <- result + 100
  } else result <- 0
  return(result)
})
myFun3(myData$x, myData$y)
#[1] 10 0 0 0 0
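Both versions above let only the first matching condition contribute. If instead the contributions should accumulate, as the two separate if statements in the question might suggest, one possible sketch is plain vectorized arithmetic on the logical comparisons. The name myFunAdd is hypothetical and not part of the original code.

```r
myData <- data.frame(a = 1:5, x = 2:6, y = 11:15, z = 3:7)

# Hypothetical helper: each condition contributes independently,
# so a row that matches both gets 10 + 100.
# TRUE/FALSE are coerced to 1/0 when multiplied by a number.
myFunAdd <- function(df, col1 = "x", col2 = "y") {
  10 * (df[[col1]] == 2) + 100 * (df[[col2]] == 11)
}

myFunAdd(myData)
# [1] 110   0   0   0   0
```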
Regarding the OP's doubt about what happens when multiple conditions are TRUE and only the first should be executed: both ifelse (nested, if there are more than two conditions) and if/else if/else (an else if ladder or nested if/else) work, because the conditions are checked in the order we specified and the first one that is TRUE determines the result. For example, suppose we have multiple conditions
if(expr1) {
  1
} else if(expr2) {
  2
} else if(expr3) {
  3
} else if(expr4) {
  4
} else {
  5
}
This checks the first expression ('expr1') first, followed by the second, and so on. The moment one of them returns TRUE, it exits, i.e. it is equivalent to a nested condition
if(expr1) {
  1
} else {
  if(expr2) {
    2
  } else {
    if(expr3) {
      3
    } else {
      if(expr4) {
        4
      } else 5
    }
  }
}
There is a cost to this: when most values match the first branch (1), only expr1 is evaluated, which saves time, but when most values fall through to the last branch (5), all of the conditions are checked
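The cost of the ladder can be made visible with a small sketch that counts how many conditions are evaluated for a given value. The names count_checks and check are hypothetical, and the conditions v == 1 to v == 4 stand in for expr1 to expr4.

```r
# Hypothetical helper: returns how many conditions the ladder had to check
count_checks <- function(v) {
  n <- 0
  # wrapper that records each evaluated condition before returning it
  check <- function(cond) {
    n <<- n + 1
    cond
  }
  if (check(v == 1)) {
    1
  } else if (check(v == 2)) {
    2
  } else if (check(v == 3)) {
    3
  } else if (check(v == 4)) {
    4
  } else {
    5
  }
  n
}

checks <- sapply(c(1, 3, 5), count_checks)
checks
# [1] 1 3 4
```

A value matching the first branch needs one check, while a value that falls through to the final else needs all four.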
Upvotes: 3