Reputation: 136
I often find myself having to apply the following condition: I have a table with multiple binary columns rated yes/no or 0/1. I have to create a new intermediate column in the calculations with the following rule: if all columns are "no", then the new column is "no", if at least one column has "yes", then the summary column must express "yes". I usually do this with case_when, and it works well (see example).
library(tidyverse)
#create a table for reproducible example
set.seed(001)
carac1 <- round(runif(100),0)
carac2 <- round(runif(100),0)
carac3 <- round(runif(100),0)
data <- data.frame(carac1,carac2,carac3)
#apply case_when with complex condition
data <- data %>%
mutate(carac_all = case_when(
carac1 == 0 & carac2 == 0 & carac3 == 0 ~ "Always no",
carac1 == 1 | carac2 == 1 | carac3 == 1 ~ "yes at least one time",
TRUE ~ NA_character_))
This give me exactly what i want :
carac1 carac2 carac3 carac_all
1 0 1 0 yes at least one time
2 0 0 0 Always no
3 1 0 1 yes at least one time
4 1 1 0 yes at least one time
(This example is with a number 0/1 but sometimes it's with a character yes/no or other categorial variable, such as color ... so the trick to use >0 is not simple to implement in these cases.)
The problem is that this code forces me to enter the name of each column in the code. On my last file, I have 120 successive columns to analyse ... Is there any way to use case_when with this kind of conditions on a range of columns? I tried carac1:carac3 == 0, but it doesn't work, and anyway I don't see how to express the "at least one column says yes".
Thanks for your help.
TLDR: I want to simplify the code I'm currently using so that I don't have to enter the name of each variable in the code, but a range of variables.
Upvotes: 1
Views: 926
Reputation: 136
All the proposed solutions work, but they involve a sum calculation: if sum >= 1, then 1 appears at least once. I finally found the solution that works as I imagined it, i.e. also with categorical variables. It is necessary to go through if_any and if_all :
library(tidyverse)
set.seed(001)
#create a table for reproducible example
carac1 <- round(runif(100),0)
carac2 <- round(runif(100),0)
carac3 <- round(runif(100),0)
data <- data.frame(carac1,carac2,carac3)
data <- data %>%
mutate(carac_all = case_when(
if_any(carac1:carac3, ~.x == "1") == T ~ "yes at least one time",
if_all(carac1:carac3, ~.x == "0") == T ~ "always no",
TRUE ~ NA_character_))
Many thanks to all
Upvotes: 0
Reputation: 6206
A simple way using rowSums
:
data$carac_all = ifelse(rowSums(data) > 0, "yes at least one time", "Always no")
carac1 carac2 carac3 carac_all
1 0 1 0 yes at least one time
2 0 0 0 Always no
3 1 0 1 yes at least one time
4 1 1 0 yes at least one time
5 0 1 0 yes at least one time
6 1 0 1 yes at least one time
7 1 0 1 yes at least one time
8 1 0 0 yes at least one time
9 1 1 0 yes at least one time
10 0 1 1 yes at least one time
11 0 1 1 yes at least one time
12 0 1 0 yes at least one time
13 1 0 1 yes at least one time
14 0 0 1 yes at least one time
15 1 0 1 yes at least one time
16 0 0 0 Always no
17 1 1 1 yes at least one time
18 1 0 1 yes at least one time
19 0 0 1 yes at least one time
20 1 1 0 yes at least one time
21 1 1 0 yes at least one time
22 0 0 0 Always no
23 1 0 0 yes at least one time
24 0 0 1 yes at least one time
25 0 1 1 yes at least one time
Upvotes: 1
Reputation: 16988
You could use a lookup
vector with 1:3
being the column indices of your carac
's.
lookup <- c("TRUE" = "yes at least one time", "FALSE" = "Always no")
data$carac_all <- lookup[paste0(apply(data[, 1:3], 1, sum) > 0)]
Upvotes: 0
Reputation: 79142
3. Update: using dplyr 1.0.4: if_any() and if_all()
data %>%
mutate(
carac_all = case_when(
if_all(contains("carac"), ~. < 1) ~ "Always no",
if_any(contains("carac"), ~. >= 1) ~ "yes at least one time",
TRUE ~ NA_character_))
2. UPdate: Thanks to the valuable comment of Martin Gal:
data %>%
mutate(carac_all = case_when(
rowSums(across(carac1:carac3)) < 1 ~ "Always no",
rowSums(across(carac1:carac3)) >=1 ~ "yes at least one time",
TRUE ~ NA_character_))
Update: after clarification:
data %>%
mutate(sum_carac = carac1+carac2+carac3) %>%
mutate(carac_all = case_when(
sum_carac < 1 ~ "Always no",
sum_carac >=1 ~ "yes at least one time",
TRUE ~ NA_character_)) %>%
select(-sum_carac)
carac1 carac2 carac3 carac_all
1 0 1 0 yes at least one time
2 0 0 0 Always no
3 1 0 1 yes at least one time
4 1 1 0 yes at least one time
5 0 1 0 yes at least one time
6 1 0 1 yes at least one time
7 1 0 1 yes at least one time
8 1 0 0 yes at least one time
9 1 1 0 yes at least one time
10 0 1 1 yes at least one time
11 0 1 1 yes at least one time
12 0 1 0 yes at least one time
13 1 0 1 yes at least one time
14 0 0 1 yes at least one time
15 1 0 1 yes at least one time
16 0 0 0 Always no
17 1 1 1 yes at least one time
18 1 0 1 yes at least one time
19 0 0 1 yes at least one time
20 1 1 0 yes at least one time
21 1 1 0 yes at least one time
22 0 0 0 Always no
23 1 0 0 yes at least one time
24 0 0 1 yes at least one time
25 0 1 1 yes at least one time
26 0 0 1 yes at least one time
27 0 1 0 yes at least one time
28 0 0 0 Always no
29 1 0 0 yes at least one time
30 0 1 1 yes at least one time
31 0 1 0 yes at least one time
32 1 0 0 yes at least one time
33 0 0 0 Always no
34 0 1 1 yes at least one time
35 1 1 0 yes at least one time
36 1 1 1 yes at least one time
37 1 1 1 yes at least one time
38 0 1 1 yes at least one time
39 1 1 0 yes at least one time
40 0 1 0 yes at least one time
41 1 1 0 yes at least one time
42 1 1 1 yes at least one time
43 1 0 1 yes at least one time
44 1 0 0 yes at least one time
45 1 1 0 yes at least one time
46 1 0 0 yes at least one time
47 0 0 0 Always no
48 0 1 0 yes at least one time
49 1 0 0 yes at least one time
50 1 1 1 yes at least one time
51 0 1 1 yes at least one time
52 1 1 1 yes at least one time
53 0 0 0 Always no
54 0 0 1 yes at least one time
55 0 1 0 yes at least one time
56 0 0 0 Always no
57 0 1 0 yes at least one time
58 1 0 0 yes at least one time
59 1 0 0 yes at least one time
60 0 0 1 yes at least one time
61 1 0 1 yes at least one time
62 0 1 0 yes at least one time
63 0 0 0 Always no
64 0 1 1 yes at least one time
65 1 1 1 yes at least one time
66 0 0 0 Always no
67 0 0 0 Always no
68 1 0 1 yes at least one time
69 0 1 1 yes at least one time
70 1 0 0 yes at least one time
71 0 1 0 yes at least one time
72 1 1 0 yes at least one time
73 0 1 0 yes at least one time
74 0 0 0 Always no
75 0 0 0 Always no
76 1 1 0 yes at least one time
77 1 1 0 yes at least one time
78 0 1 0 yes at least one time
79 1 1 0 yes at least one time
80 1 1 1 yes at least one time
81 0 0 0 Always no
82 1 0 1 yes at least one time
83 0 1 1 yes at least one time
84 0 1 0 yes at least one time
85 1 1 0 yes at least one time
86 0 0 0 Always no
87 1 1 0 yes at least one time
88 0 1 0 yes at least one time
89 0 1 0 yes at least one time
90 0 1 0 yes at least one time
91 0 1 0 yes at least one time
92 0 0 0 Always no
93 1 0 1 yes at least one time
94 1 1 0 yes at least one time
95 1 0 1 yes at least one time
96 1 1 1 yes at least one time
97 0 0 0 Always no
98 0 1 0 yes at least one time
99 1 0 0 yes at least one time
100 1 1 1 yes at least one time
First answer:
We could use across
from dplyr
package
library(dplyr)
data %>%
mutate(across(starts_with("carac"), ~case_when(
. == 0 ~ "Always no",
. == 1 ~ "yes at least one time",
TRUE ~ NA_character_), .names ="x_{.col}")) %>%
select(carac1:x_carac1)
carac1 carac2 carac3 carac_all
1 0 1 0 yes at least one time
2 0 0 0 Always no
3 1 0 1 yes at least one time
4 1 1 0 yes at least one time
5 0 1 0 yes at least one time
6 1 0 1 yes at least one time
7 1 0 1 yes at least one time
8 1 0 0 yes at least one time
9 1 1 0 yes at least one time
10 0 1 1 yes at least one time
11 0 1 1 yes at least one time
12 0 1 0 yes at least one time
13 1 0 1 yes at least one time
14 0 0 1 yes at least one time
15 1 0 1 yes at least one time
16 0 0 0 Always no
17 1 1 1 yes at least one time
18 1 0 1 yes at least one time
19 0 0 1 yes at least one time
20 1 1 0 yes at least one time
21 1 1 0 yes at least one time
22 0 0 0 Always no
23 1 0 0 yes at least one time
24 0 0 1 yes at least one time
25 0 1 1 yes at least one time
26 0 0 1 yes at least one time
27 0 1 0 yes at least one time
28 0 0 0 Always no
29 1 0 0 yes at least one time
30 0 1 1 yes at least one time
31 0 1 0 yes at least one time
32 1 0 0 yes at least one time
33 0 0 0 Always no
34 0 1 1 yes at least one time
35 1 1 0 yes at least one time
36 1 1 1 yes at least one time
37 1 1 1 yes at least one time
38 0 1 1 yes at least one time
39 1 1 0 yes at least one time
40 0 1 0 yes at least one time
41 1 1 0 yes at least one time
42 1 1 1 yes at least one time
43 1 0 1 yes at least one time
44 1 0 0 yes at least one time
45 1 1 0 yes at least one time
46 1 0 0 yes at least one time
47 0 0 0 Always no
48 0 1 0 yes at least one time
49 1 0 0 yes at least one time
50 1 1 1 yes at least one time
51 0 1 1 yes at least one time
52 1 1 1 yes at least one time
53 0 0 0 Always no
54 0 0 1 yes at least one time
55 0 1 0 yes at least one time
56 0 0 0 Always no
57 0 1 0 yes at least one time
58 1 0 0 yes at least one time
59 1 0 0 yes at least one time
60 0 0 1 yes at least one time
61 1 0 1 yes at least one time
.........
Upvotes: 2