Reputation: 5
I want to create a new conditional variable like the one below. Basically, I want a variable that flags any positive result across a list of other variables. I've been trying to use case_when but am having no luck.
variable 1 | variable 2 | variable 3 | New variable |
---|---|---|---|
1 | 0 | 0 | 1 |
0 | 0 | 1 | 1 |
0 | 1 | 1 | 1 |
0 | 0 | 0 | 0 |
Upvotes: 0
Views: 261
Reputation: 887213
We can use Reduce with | in base R. The | operator treats any value not equal to 0 as TRUE and 0 as FALSE, so reducing over the columns does an elementwise OR for each row and returns TRUE if there is at least one non-zero value. We then coerce the logical to binary with + (TRUE -> 1, FALSE -> 0).
df$new_variable <- +(Reduce(`|`, df))
data
df <- structure(list(variable1 = c(1L, 0L, 0L, 0L), variable2 = c(0L,
0L, 1L, 0L), variable3 = c(0L, 1L, 1L, 0L)), row.names = c(NA,
-4L), class = "data.frame")
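To make the expansion concrete, here is a minimal sketch of what the Reduce call does for these three columns. The intermediate name any_nonzero is only for illustration. Note that | treats every non-zero value as TRUE, including negative ones, so this flags "non-zero" rather than strictly "positive".

```r
# Same data as above, rebuilt with data.frame() for readability
df <- data.frame(
  variable1 = c(1L, 0L, 0L, 0L),
  variable2 = c(0L, 0L, 1L, 0L),
  variable3 = c(0L, 1L, 1L, 0L)
)

# Reduce(`|`, df) folds the columns pairwise:
# (variable1 | variable2) | variable3
any_nonzero <- df$variable1 | df$variable2 | df$variable3
same_as_reduce <- identical(any_nonzero, Reduce(`|`, df))

# Unary + coerces the logical vector to integer (TRUE -> 1L, FALSE -> 0L)
df$new_variable <- +any_nonzero
df
```

If negative values can occur and should not count, comparing first (for example `Reduce(`|`, lapply(df, function(x) x > 0))`) is one way to restrict the check to positives.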
Upvotes: 1
Reputation: 26218
Using cur_data() in dplyr:
library(dplyr)
df %>% mutate(new_v = +(rowSums(cur_data()) > 0))
#> variable1 variable2 variable3 new_v
#> 1 1 0 0 1
#> 2 0 0 1 1
#> 3 0 1 1 1
#> 4 0 0 0 0
Created on 2021-06-08 by the reprex package (v2.0.0)
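cur_data() was deprecated in dplyr 1.1.0 in favour of pick(). Below is a sketch of the same idea with pick(), assuming dplyr >= 1.1.0 and the same df as in the other answers. Comparing against 0 before summing is my own variation, not part of the code above: it counts positive cells per row, so a negative value cannot cancel a positive one in the sum.

```r
library(dplyr)

df <- data.frame(
  variable1 = c(1L, 0L, 0L, 0L),
  variable2 = c(0L, 0L, 1L, 0L),
  variable3 = c(0L, 1L, 1L, 0L)
)

result <- df %>%
  # pick(everything()) returns the current columns as a data frame;
  # `> 0` turns it into a logical matrix, rowSums() counts TRUEs per row
  mutate(new_v = +(rowSums(pick(everything()) > 0) > 0))

result
```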
Upvotes: 2
Reputation: 21918
I hope I understood correctly what you were looking for. I created a new_var variable based on the presence of any positive value in a row:
library(dplyr)
df %>%
  rowwise() %>%
  mutate(new_var = +any(c_across(everything()) > 0, na.rm = TRUE))
# A tibble: 4 x 4
# Rowwise:
variable1 variable2 variable3 new_var
<int> <int> <int> <int>
1 1 0 0 1
2 0 0 1 1
3 0 1 1 1
4 0 0 0 0
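rowwise() evaluates the expression once per row, which can be slow on large data. As an alternative sketch (my suggestion, not part of the answer above), if_any() performs the same check in a vectorised way. To my knowledge it is available from dplyr 1.0.4 onward. One difference to be aware of: with missing values, if_any() can return NA where any(..., na.rm = TRUE) returns FALSE.

```r
library(dplyr)

df <- data.frame(
  variable1 = c(1L, 0L, 0L, 0L),
  variable2 = c(0L, 0L, 1L, 0L),
  variable3 = c(0L, 1L, 1L, 0L)
)

result <- df %>%
  # if_any() applies the predicate to each selected column and
  # combines the results with OR into one logical vector
  mutate(new_var = +if_any(everything(), ~ .x > 0))

result
```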
Upvotes: 2
Reputation: 1298
You can use pmap_dbl to apply an if_else statement that checks whether any of the values of var1, var2, or var3 are positive. This solution works no matter what the numeric values of the above variables are.
library(tidyverse)
# reproduce your data
mydata <- tibble(
  var1 = c(1, 0, 0, 0),
  var2 = c(0, 0, 1, 0),
  var3 = c(0, 1, 1, 0)
)
# for each row, return 1 if any of the three values is positive, else 0
mydata %>%
  mutate(
    newvar = pmap_dbl(list(var1, var2, var3), ~ if_else(any(c(..1, ..2, ..3) > 0), 1, 0))
  )
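The ..1, ..2, ..3 placeholders tie the lambda to exactly three columns. As a small variation (a sketch, not the answer's own code), a function taking `...` collects however many values pmap_dbl passes per row, so adding a column to the list needs no change to the function body:

```r
library(dplyr)
library(purrr)

mydata <- tibble(
  var1 = c(1, 0, 0, 0),
  var2 = c(0, 0, 1, 0),
  var3 = c(0, 1, 1, 0)
)

result <- mydata %>%
  mutate(
    # c(...) gathers the row's values; any() checks for a positive one
    newvar = pmap_dbl(list(var1, var2, var3),
                      function(...) as.numeric(any(c(...) > 0)))
  )

result
```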
Upvotes: 0
Reputation: 389047
You can find the max value in each row with pmax. Because the variables are coded 0/1, the row maximum is 1 whenever any of them is 1.
df$new_variable <- do.call(pmax, df)
df
# variable1 variable2 variable3 new_variable
#1 1 0 0 1
#2 0 0 1 1
#3 0 1 1 1
#4 0 0 0 0
data
df <- structure(list(variable1 = c(1L, 0L, 0L, 0L), variable2 = c(0L,
0L, 1L, 0L), variable3 = c(0L, 1L, 1L, 0L), new_variable = c(1L,
1L, 1L, 0L)), row.names = c(NA, -4L), class = "data.frame")
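The row maximum is only a 0/1 indicator when the inputs are themselves 0/1. If other values can occur, one possible adjustment is to compare the maximum against 0. The data frame df2 below is hypothetical and only meant to show the difference:

```r
# Hypothetical data with a value above 1 and a negative value
df2 <- data.frame(a = c(2, 0, -1), b = c(0, 0, 0))

# Row-wise maximum: not a 0/1 indicator here
row_max <- do.call(pmax, df2)

# Flag rows whose maximum is positive, coerced to integer with unary +
df2$any_positive <- +(row_max > 0)
df2
```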
Upvotes: 1