Aytan
Aytan

Reputation: 136

Create indicator column based on presence of 0/1 in all other columns

I often find myself having to apply the following condition: I have a table with multiple binary columns rated yes/no or 0/1. I have to create a new intermediate column in the calculations with the following rule: if all columns are "no", then the new column is "no", if at least one column has "yes", then the summary column must express "yes". I usually do this with case_when, and it works well (see example).

library(tidyverse)

#create a table for reproducible example
set.seed(001)
carac1 <- round(runif(100),0)
carac2 <- round(runif(100),0)
carac3 <- round(runif(100),0)
data <- data.frame(carac1,carac2,carac3)

#apply case_when with complex condition
data <- data %>%
  mutate(carac_all = case_when(
    carac1 == 0 & carac2 == 0 & carac3 == 0 ~ "Always no",
    carac1 == 1 | carac2 == 1 | carac3 == 1 ~ "yes at least one time",
    TRUE ~ NA_character_))

This give me exactly what i want :

   carac1 carac2 carac3             carac_all
1        0      1      0 yes at least one time
2        0      0      0             Always no
3        1      0      1 yes at least one time
4        1      1      0 yes at least one time

(This example is with a number 0/1 but sometimes it's with a character yes/no or other categorial variable, such as color ... so the trick to use >0 is not simple to implement in these cases.)

The problem is that this code forces me to enter the name of each column in the code. On my last file, I have 120 successive columns to analyse ... Is there any way to use case_when with this kind of conditions on a range of columns? I tried carac1:carac3 == 0, but it doesn't work, and anyway I don't see how to express the "at least one column says yes".

Thanks for your help.

TLDR: I want to simplify the code I'm currently using so that I don't have to enter the name of each variable in the code, but a range of variables.

Upvotes: 1

Views: 926

Answers (4)

Aytan
Aytan

Reputation: 136

All the proposed solutions work, but they involve a sum calculation: if sum >= 1, then 1 appears at least once. I finally found the solution that works as I imagined it, i.e. also with categorical variables. It is necessary to go through if_any and if_all :

library(tidyverse)
set.seed(001)
#create a table for reproducible example
carac1 <- round(runif(100),0)
carac2 <- round(runif(100),0)
carac3 <- round(runif(100),0)
data <- data.frame(carac1,carac2,carac3)

data <- data %>%
  mutate(carac_all = case_when(
    if_any(carac1:carac3, ~.x == "1") == T ~ "yes at least one time",
    if_all(carac1:carac3, ~.x == "0") == T ~ "always no",
    TRUE ~ NA_character_))

Many thanks to all

Upvotes: 0

user438383
user438383

Reputation: 6206

A simple way using rowSums:

data$carac_all = ifelse(rowSums(data) > 0, "yes at least one time", "Always no")
   carac1 carac2 carac3             carac_all
1       0      1      0 yes at least one time
2       0      0      0             Always no
3       1      0      1 yes at least one time
4       1      1      0 yes at least one time
5       0      1      0 yes at least one time
6       1      0      1 yes at least one time
7       1      0      1 yes at least one time
8       1      0      0 yes at least one time
9       1      1      0 yes at least one time
10      0      1      1 yes at least one time
11      0      1      1 yes at least one time
12      0      1      0 yes at least one time
13      1      0      1 yes at least one time
14      0      0      1 yes at least one time
15      1      0      1 yes at least one time
16      0      0      0             Always no
17      1      1      1 yes at least one time
18      1      0      1 yes at least one time
19      0      0      1 yes at least one time
20      1      1      0 yes at least one time
21      1      1      0 yes at least one time
22      0      0      0             Always no
23      1      0      0 yes at least one time
24      0      0      1 yes at least one time
25      0      1      1 yes at least one time

Upvotes: 1

Martin Gal
Martin Gal

Reputation: 16988

You could use a lookup vector with 1:3 being the column indices of your carac's.

lookup <- c("TRUE" = "yes at least one time", "FALSE" = "Always no")
data$carac_all <- lookup[paste0(apply(data[, 1:3], 1, sum) > 0)]

Upvotes: 0

TarJae
TarJae

Reputation: 79142

3. Update: using dplyr 1.0.4: if_any() and if_all()

data %>% 
    mutate(
        carac_all = case_when(
            if_all(contains("carac"), ~. < 1) ~ "Always no",
            if_any(contains("carac"), ~. >= 1) ~ "yes at least one time",
            TRUE ~ NA_character_))

2. UPdate: Thanks to the valuable comment of Martin Gal:

data %>% 
    mutate(carac_all = case_when(
        rowSums(across(carac1:carac3)) < 1 ~ "Always no",
        rowSums(across(carac1:carac3)) >=1 ~ "yes at least one time",
        TRUE ~ NA_character_))

Update: after clarification:

data %>% 
    mutate(sum_carac = carac1+carac2+carac3) %>% 
    mutate(carac_all = case_when(
        sum_carac < 1 ~ "Always no",
        sum_carac >=1 ~ "yes at least one time",
        TRUE ~ NA_character_)) %>% 
    select(-sum_carac)
  carac1 carac2 carac3             carac_all
1        0      1      0 yes at least one time
2        0      0      0             Always no
3        1      0      1 yes at least one time
4        1      1      0 yes at least one time
5        0      1      0 yes at least one time
6        1      0      1 yes at least one time
7        1      0      1 yes at least one time
8        1      0      0 yes at least one time
9        1      1      0 yes at least one time
10       0      1      1 yes at least one time
11       0      1      1 yes at least one time
12       0      1      0 yes at least one time
13       1      0      1 yes at least one time
14       0      0      1 yes at least one time
15       1      0      1 yes at least one time
16       0      0      0             Always no
17       1      1      1 yes at least one time
18       1      0      1 yes at least one time
19       0      0      1 yes at least one time
20       1      1      0 yes at least one time
21       1      1      0 yes at least one time
22       0      0      0             Always no
23       1      0      0 yes at least one time
24       0      0      1 yes at least one time
25       0      1      1 yes at least one time
26       0      0      1 yes at least one time
27       0      1      0 yes at least one time
28       0      0      0             Always no
29       1      0      0 yes at least one time
30       0      1      1 yes at least one time
31       0      1      0 yes at least one time
32       1      0      0 yes at least one time
33       0      0      0             Always no
34       0      1      1 yes at least one time
35       1      1      0 yes at least one time
36       1      1      1 yes at least one time
37       1      1      1 yes at least one time
38       0      1      1 yes at least one time
39       1      1      0 yes at least one time
40       0      1      0 yes at least one time
41       1      1      0 yes at least one time
42       1      1      1 yes at least one time
43       1      0      1 yes at least one time
44       1      0      0 yes at least one time
45       1      1      0 yes at least one time
46       1      0      0 yes at least one time
47       0      0      0             Always no
48       0      1      0 yes at least one time
49       1      0      0 yes at least one time
50       1      1      1 yes at least one time
51       0      1      1 yes at least one time
52       1      1      1 yes at least one time
53       0      0      0             Always no
54       0      0      1 yes at least one time
55       0      1      0 yes at least one time
56       0      0      0             Always no
57       0      1      0 yes at least one time
58       1      0      0 yes at least one time
59       1      0      0 yes at least one time
60       0      0      1 yes at least one time
61       1      0      1 yes at least one time
62       0      1      0 yes at least one time
63       0      0      0             Always no
64       0      1      1 yes at least one time
65       1      1      1 yes at least one time
66       0      0      0             Always no
67       0      0      0             Always no
68       1      0      1 yes at least one time
69       0      1      1 yes at least one time
70       1      0      0 yes at least one time
71       0      1      0 yes at least one time
72       1      1      0 yes at least one time
73       0      1      0 yes at least one time
74       0      0      0             Always no
75       0      0      0             Always no
76       1      1      0 yes at least one time
77       1      1      0 yes at least one time
78       0      1      0 yes at least one time
79       1      1      0 yes at least one time
80       1      1      1 yes at least one time
81       0      0      0             Always no
82       1      0      1 yes at least one time
83       0      1      1 yes at least one time
84       0      1      0 yes at least one time
85       1      1      0 yes at least one time
86       0      0      0             Always no
87       1      1      0 yes at least one time
88       0      1      0 yes at least one time
89       0      1      0 yes at least one time
90       0      1      0 yes at least one time
91       0      1      0 yes at least one time
92       0      0      0             Always no
93       1      0      1 yes at least one time
94       1      1      0 yes at least one time
95       1      0      1 yes at least one time
96       1      1      1 yes at least one time
97       0      0      0             Always no
98       0      1      0 yes at least one time
99       1      0      0 yes at least one time
100      1      1      1 yes at least one time

First answer: We could use across from dplyr package

library(dplyr)
data %>% 
    mutate(across(starts_with("carac"), ~case_when(
        . == 0 ~ "Always no",
        . == 1 ~ "yes at least one time",
        TRUE ~ NA_character_), .names ="x_{.col}")) %>% 
        select(carac1:x_carac1)
 carac1 carac2 carac3             carac_all
1        0      1      0 yes at least one time
2        0      0      0             Always no
3        1      0      1 yes at least one time
4        1      1      0 yes at least one time
5        0      1      0 yes at least one time
6        1      0      1 yes at least one time
7        1      0      1 yes at least one time
8        1      0      0 yes at least one time
9        1      1      0 yes at least one time
10       0      1      1 yes at least one time
11       0      1      1 yes at least one time
12       0      1      0 yes at least one time
13       1      0      1 yes at least one time
14       0      0      1 yes at least one time
15       1      0      1 yes at least one time
16       0      0      0             Always no
17       1      1      1 yes at least one time
18       1      0      1 yes at least one time
19       0      0      1 yes at least one time
20       1      1      0 yes at least one time
21       1      1      0 yes at least one time
22       0      0      0             Always no
23       1      0      0 yes at least one time
24       0      0      1 yes at least one time
25       0      1      1 yes at least one time
26       0      0      1 yes at least one time
27       0      1      0 yes at least one time
28       0      0      0             Always no
29       1      0      0 yes at least one time
30       0      1      1 yes at least one time
31       0      1      0 yes at least one time
32       1      0      0 yes at least one time
33       0      0      0             Always no
34       0      1      1 yes at least one time
35       1      1      0 yes at least one time
36       1      1      1 yes at least one time
37       1      1      1 yes at least one time
38       0      1      1 yes at least one time
39       1      1      0 yes at least one time
40       0      1      0 yes at least one time
41       1      1      0 yes at least one time
42       1      1      1 yes at least one time
43       1      0      1 yes at least one time
44       1      0      0 yes at least one time
45       1      1      0 yes at least one time
46       1      0      0 yes at least one time
47       0      0      0             Always no
48       0      1      0 yes at least one time
49       1      0      0 yes at least one time
50       1      1      1 yes at least one time
51       0      1      1 yes at least one time
52       1      1      1 yes at least one time
53       0      0      0             Always no
54       0      0      1 yes at least one time
55       0      1      0 yes at least one time
56       0      0      0             Always no
57       0      1      0 yes at least one time
58       1      0      0 yes at least one time
59       1      0      0 yes at least one time
60       0      0      1 yes at least one time
61       1      0      1 yes at least one time
.........

Upvotes: 2

Related Questions