Reputation: 435
This is a question about writing logical conditions efficiently.
Let's say I want to recode a variable if any column in a set equals a particular value.
test <- tibble(
CompanyA = rep(c(0:1),5),
CompanyB = rep(c(0),10),
CompanyC = c(1,1,1,1,0,0,1,1,1,1)
)
test
A basic way would be:
test$newvar <- ifelse(test$CompanyA==1 | test$CompanyB == 1 | test$CompanyC == 1,-99,0)
table(test$newvar)
But what if I have several dozen columns? I don't want to write out CompanyA
, CompanyB
, etc. Is there a way of essentially using an %in
type statement? Here's an obviously wrong method:
condition <- columns %in% c("CompanyA", "CompanyB", "CompanyC") . # obviously doesn't work
test$newvar[condition] <- 1
Or is that an even simpler way to do this - e.g., if CompanyA:CompanyC == 1, then do...
?
Upvotes: 1
Views: 80
Reputation: 2920
By reshaping test
from long to wide, I was able to create a column that tests if any of the values in your CompanyX
columns contain a value of 1.
# load necessary packages ----
library(tidyverse)
# load necessary data ----
test <-
tibble(CompanyA = rep(c(0:1),5),
CompanyB = rep(c(0),10),
CompanyC = c(1,1,1,1,0,0,1,1,1,1)) %>%
# create an 'id' column
mutate(id = 1:n())
# calculations -----
new.var <-
test %>%
# transfrom data from long to wide
gather(key = "company", value = "value", -id) %>%
# for each 'id' value
# test if any 'value' is equal to 1
# if so, return -99; else return 0
group_by(id) %>%
summarize(new_var = if_else(any(value == 1), -99, 0))
# left join new.var onto test ---
test <-
test %>%
left_join(new.var, by = "id")
# view results ---
test
# A tibble: 10 x 5
# CompanyA CompanyB CompanyC id new_var
# <int> <dbl> <dbl> <int> <dbl>
# 1 0 0 1 1 -99
# 2 1 0 1 2 -99
# 3 0 0 1 3 -99
# 4 1 0 1 4 -99
# 5 0 0 0 5 0
# 6 1 0 0 6 -99
# 7 0 0 1 7 -99
# 8 1 0 1 8 -99
# 9 0 0 1 9 -99
# 10 1 0 1 10 -99
# end of script #
Upvotes: 1