Reputation: 119
I have a dataset like the following example:
bleed  breathing  ascites  spleen  Hepato
Yes    Yes        No       Yes     No
No     Yes        No       Yes     No
No     No         Yes      Yes     No
I need to create a new umbrella variable that summarizes the 5 categorical variables. If a patient has a "Yes" (complication) in ANY of the 5 categorical variables, he/she should get a "Yes" in the new umbrella variable (i.e. they belong to the new category). Only if the person has a "No" in ALL 5 variables should they get a "No" in the new umbrella variable.
Thanks in advance.
Upvotes: 1
Views: 928
Reputation: 887048
We can use rowSums on a logical matrix (df1 == "Yes") to count the TRUE values, i.e. the number of "Yes" entries, in each row. Comparing that count with > 0 gives a logical vector; adding 1 to it converts FALSE/TRUE to 1/2, which can then be used as an index into a vector of the new values, c("No", "Yes"):
df1$umbrella <- c("No", "Yes")[(rowSums(df1 == "Yes") > 0) + 1]
df1$umbrella
#[1] "Yes" "Yes" "Yes"
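To see why the + 1 indexing works, the one-liner above can be unpacked into its intermediate steps. This is only an illustrative sketch: the data frame demo and its extra all-"No" row are made up here so that both outcomes appear.

```r
# Hypothetical demo data: the three rows from the question plus one all-"No" row
demo <- data.frame(
  bleed     = c("Yes", "No",  "No",  "No"),
  breathing = c("Yes", "Yes", "No",  "No"),
  ascites   = c("No",  "No",  "Yes", "No"),
  spleen    = c("Yes", "Yes", "Yes", "No"),
  Hepato    = c("No",  "No",  "No",  "No"),
  stringsAsFactors = FALSE
)

is_yes  <- demo == "Yes"     # logical matrix, TRUE where the cell is "Yes"
n_yes   <- rowSums(is_yes)   # number of "Yes" values in each row
any_yes <- n_yes > 0         # TRUE if the row has at least one "Yes"
idx     <- any_yes + 1       # FALSE -> 1, TRUE -> 2

# Position 1 of c("No", "Yes") is "No", position 2 is "Yes"
demo$umbrella <- c("No", "Yes")[idx]
demo$umbrella
```

Only the last row, which has no "Yes" in any column, ends up with "No".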
Another option is Reduce with lapply, which compares each column with "Yes" and combines the resulting logical vectors with |
df1$umbrella <- c("No", "Yes")[(Reduce(`|`, lapply(df1, `==`, "Yes"))) + 1]
Or with apply, testing each row of the logical matrix with any
c("No", "Yes")[1 + apply(df1 == "Yes", 1, FUN = any)]
# Example data used above
df1 <- structure(list(bleed = c("Yes", "No", "No"), breathing = c("Yes",
"Yes", "No"), ascites = c("No", "No", "Yes"), spleen = c("Yes",
"Yes", "Yes"), Hepato = c("No", "No", "No")), class = "data.frame",
row.names = c(NA, -3L))
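If the real data contain other columns or missing values, the rowSums approach can be restricted to the five variables. The following is a sketch under assumptions not in the question: the id column, the NA, and the name dat are hypothetical, and treating a missing value as "not Yes" is a choice that may or may not suit the study.

```r
# Hypothetical data: an extra id column and one missing value
dat <- data.frame(
  id        = c(101, 102, 103),
  bleed     = c("Yes", "No",  "No"),
  breathing = c("No",  NA,    "No"),
  ascites   = c("No",  "No",  "No"),
  spleen    = c("No",  "Yes", "No"),
  Hepato    = c("No",  "No",  "No"),
  stringsAsFactors = FALSE
)

# Only these columns feed into the umbrella variable
cols <- c("bleed", "breathing", "ascites", "spleen", "Hepato")

# na.rm = TRUE makes a missing value count as "not Yes" (an assumption)
any_yes <- rowSums(dat[cols] == "Yes", na.rm = TRUE) > 0
dat$umbrella <- c("No", "Yes")[any_yes + 1]
dat$umbrella
```

Without na.rm = TRUE, any row containing an NA would get an NA row sum and therefore an NA umbrella value.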
Upvotes: 2