Reputation: 1073
I created a simple function to compare two variables in a data frame:
detYearDisc <- function(x, y)
{
  if (x < y)
    return("L")
  if (x > y)
    return("G")
  if (x == y)
    return("N")
}
The data frame df can contain NA in x, y, or both. When I run the mapply function
df$DiscInd = mapply(detYearDisc, df$X,df$Y)
I get the following error:
Error in if (x < y) return("L") : missing value where TRUE/FALSE needed
Is this because x or y is NA?
Upvotes: 0
Views: 427
Reputation: 12937
Yes. The error occurs whenever either argument is NA: the comparison x < y then evaluates to NA, and if needs a TRUE or FALSE condition. See the following:
mapply(detYearDisc, 1,2)
#[1] "L"
mapply(detYearDisc, 2,2)
#[1] "N"
mapply(detYearDisc, 2,1)
#[1] "G"
mapply(detYearDisc, 2,NA)
#Error in if (x < y) return("L") : missing value where TRUE/FALSE needed
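A quick check in base R illustrates why the last call fails. This is only a small sketch; the variable name cmp is made up for the example.

```r
# Any comparison involving NA is itself NA (of type logical).
cmp <- 2 < NA
print(cmp)          # [1] NA
print(is.na(cmp))   # [1] TRUE

# if() needs a single TRUE or FALSE, so if (cmp) would stop with
# "missing value where TRUE/FALSE needed".
# isTRUE() is one safe way to test such a value: it is FALSE for NA.
print(isTRUE(cmp))  # [1] FALSE
```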
To handle it, you can add the following as the first statement in your function:
if (is.na(x) || is.na(y))
  return("Not a number!")
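Put together, a guarded version of the function might look like the sketch below. The name detYearDiscSafe, the sample data, and the choice to return NA_character_ instead of a text label are assumptions for illustration, not part of the original function.

```r
# Hypothetical variant of detYearDisc with an NA guard up front.
detYearDiscSafe <- function(x, y) {
  # Returning NA_character_ is an assumption (any label would work);
  # it keeps missing inputs missing in the result.
  if (is.na(x) || is.na(y)) return(NA_character_)
  if (x < y) return("L")
  if (x > y) return("G")
  "N"
}

# Small made-up data frame; the column names X and Y follow the question.
df <- data.frame(X = c(1, 3, 5, NA, 2), Y = c(5, 0, 1, 4, 2))
df$DiscInd <- mapply(detYearDiscSafe, df$X, df$Y)
print(df$DiscInd)
# [1] "L" "G" "G" NA  "N"
```

Because mapply calls the function once per pair of scalars, the scalar operator || is sufficient in the guard.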
However, you can achieve the same in a vectorized manner with a simple nested ifelse:
ifelse(df$x>df$y, "G", ifelse(df$x<df$y, "L", "N"))
If either value is NA, it returns NA. E.g. for:
df
   x y
1  1 5
2  3 0
3  5 1
4 NA 4
Will give you:
[1] "L" "G" "G" NA
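To reproduce this, the sketch below rebuilds the example data frame and stores the result in a new column. The column name DiscInd is borrowed from the question; the rest follows the ifelse call above.

```r
# Rebuild the example data frame shown above.
df <- data.frame(x = c(1, 3, 5, NA), y = c(5, 0, 1, 4))

# Vectorized comparison: no mapply() and no explicit NA check needed,
# because ifelse() propagates NA from the test to the result.
df$DiscInd <- ifelse(df$x > df$y, "G", ifelse(df$x < df$y, "L", "N"))
print(df$DiscInd)
# [1] "L" "G" "G" NA
```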
Alternatively, thanks to @alistaire for pointing out case_when from the dplyr package, you could also do:
library(dplyr)

f <- function(x, y) {
  case_when(
    (is.na(x) | is.na(y)) ~ "NA",  # the string "NA", not a missing value
    x > y ~ "G",
    x < y ~ "L",
    TRUE ~ "N"
  )
}
So, calling f(df$x, df$y) gives the same labels, except that rows with a missing value come back as the string "NA" rather than a real NA.
Upvotes: 2