Reputation: 675
I am trying to coalesce multiple columns using multiple conditional statements. In the example below I want to coalesce with A taking precedence over B followed by C where A is >= 0.1 and < 30, while where A is NA, < 0.1 or > 30 I was to coalesce with B taking precedence over A followed by C.
Below is an example dataset:
df <- data.frame(1:8)
df$A <- c(102, 0.04, 0.1, NA_real_, 0.01, 0.01, 0.2, NA_real_)
df$B <- c(20.2, 50.1, 10.1, 6.1, 7.1, NA_real_, 8.1, NA_real_)
df$C <- c(NA_real_, 4.1, NA_real_, NA_real_, NA_real_, 8.1, NA_real_, 10.1)
A B C
1 102 20.2 NA
2 0.04 50.1 4.1
3 0.1 10.1 NA
4 NA 6.1 NA
5 0.01 7.1 NA
6 0.01 NA 8.1
7 0.2 8.1 NA
8 NA NA 10.1
Below is the desired output:
A B C new_col
1 102 20.2 NA 20.2
2 0.04 50.1 4.1 50.1
3 0.1 10.1 NA 0.1
4 NA 6.1 NA 6.1
5 0.01 7.1 NA 7.1
6 0.01 NA 8.1 8.1
7 0.2 8.1 NA 0.2
8 NA NA 10.1 10.1
I tried to solve this using mutate and the coalesce functions in the code below but did not get the desired output (in many cases this works OK as long as the value in column A is a value where the value is NA, Nulls are produced in the output).
df <- df %>%
mutate(new_col = if_else(A >= 0.1 & A <= 30,
coalesce(A, B, C),
coalesce(B, A, C)))
A B C new_col
1 102 20.2 NA 20.2
2 0.04 50.1 4.1 50.1
3 0.1 10.1 NA 0.1
4 NA 6.1 NA NULL
5 0.01 7.1 NA 7.1
6 0.01 NA 8.1 0.01
7 0.2 8.1 NA 0.2
8 NA NA 10.1 NULL
Upvotes: 5
Views: 433
Reputation: 887048
We can use case_when
library(dplyr)
df %>%
mutate(new_col = case_when(A >= 0.1 & A <= 30 & !is.na(A) ~
coalesce(A, B, C), TRUE ~ coalesce(B, A, C)))
Upvotes: 1
Reputation: 388907
Your attempt is correct but you need to handle NA
's since NA
in if_else
returns NA
.
library(dplyr)
df %>%
mutate(new_col = if_else(A >= 0.1 & A <= 30 & !is.na(A),
coalesce(A, B, C),
coalesce(B, A, C)))
# A B C new_col
#1 102.00 20.2 NA 20.20
#2 0.04 50.1 4.1 50.10
#3 0.10 10.1 NA 0.10
#4 NA 6.1 NA 6.10
#5 0.01 7.1 NA 7.10
#6 0.01 NA 8.1 0.01
#7 0.20 8.1 NA 0.20
#8 NA NA 10.1 10.10
data
df <- data.frame(A = c(102, 0.04, 0.1, NA_real_, 0.01, 0.01, 0.2, NA_real_),
B = c(20.2, 50.1, 10.1, 6.1, 7.1, NA_real_, 8.1, NA_real_),
C = c(NA_real_, 4.1, NA_real_, NA_real_, NA_real_, 8.1, NA_real_, 10.1))
Upvotes: 1