Reputation: 137
I need to create a variable which is similar to var1 if var2 is missing, similar to var2 if var1 is empty, the mean of var1 and var 2 if the two are not missing ((var1+var2)/2) and finally NA if both var1 and var2 are also missing.
I have data like:
library(tidyverse)
df <- tibble(
var1 = c(1, 2, 3, 4, NA, NA, 3, 2),
var2 = c(3, 5, NA, 2, 3, NA, 4, NA)
)
The result should be:
var1 var2 newvar
1 3 2
2 5 3.5
3 NA 3
4 2 3
NA 3 3
NA NA NA
3 4 3.5
2 NA 2
I have tried using main R recoding functions, also tried using case_when:
df <- df %>% mutate (
newvar = case_when(
var1 == NA ~ var2,
var2 == NA ~ var1,
TRUE ~ (var1+var2)/2
)
)
Not sure whether the last line would be correct but anyway the code didn't work due to missings, it says:
Error in mutate_impl(.data, dots) : Evaluation error: NAs are not allowed in subscripted assignments.
Upvotes: 1
Views: 158
Reputation: 4230
Almost there, just some minor edits and it's working on my end. It's usually better to use is.na(x)
instead of x == NA
. Also, your TRUE
at the end should check what you actually want, the case where none of them are NA
.
df %>% mutate (
newvar = case_when(
is.na(var1) ~ var2,
is.na(var2) ~ var1,
!is.na(var1) && !is.na(var2) ~ (var1+var2)/2
)
)
Produces
# A tibble: 8 x 3
var1 var2 newvar
<dbl> <dbl> <dbl>
1 1 3 2
2 2 5 3.5
3 3 NA 3
4 4 2 3
5 NA 3 3
6 NA NA NA
7 3 4 3.5
8 2 NA 2
Upvotes: 1
Reputation: 32548
df %>% mutate (
newvar = case_when(
xor(is.na(var1), is.na(var2)) ~ pmax(var1, var2, na.rm = TRUE),
!is.na(var1) & !is.na(var2) ~ (var1 + var2)/2,
TRUE ~ NaN
)
)
Upvotes: 2