Shakil Ahmed Shaon

Reputation: 119

How do I combine 5 categorical (yes/no) variables into 1 umbrella categorical variable (yes/no)

I have a dataset like the following example:

bleed  breathing  ascites   spleen  Hepato
Yes      Yes         No       Yes    No
No       Yes         No       Yes    No
No       No          Yes      Yes    No

I need to create a new umbrella variable that summarizes the 5 categorical variables. If a patient has a "Yes" (complication) in ANY of the 5 categorical variables, they should get a "Yes" in the new umbrella variable (i.e. they belong to the new category). Only if the patient has a "No" in ALL 5 variables should they get a "No" in the new umbrella variable.

Thanks in advance.

Upvotes: 1

Views: 928

Answers (1)

akrun

Reputation: 887048

We can use rowSums on the logical matrix df1 == "Yes" to count the "Yes" values in each row. Comparing that count with 0 gives a logical vector; adding 1 to it converts FALSE/TRUE to 1/2, which can then be used as an index into a vector of new values, c("No", "Yes").

df1$umbrella <- c("No", "Yes")[(rowSums(df1 == "Yes") > 0) + 1]
df1$umbrella
#[1] "Yes" "Yes" "Yes"
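To make the indexing trick easier to follow, here is a sketch that splits the one-liner into its intermediate steps. The data is the same as in the "data" section; the names is_yes, n_yes, any_yes and idx are only illustrative.

```r
# Sample data, same values as in the "data" section of this answer
df1 <- data.frame(
  bleed     = c("Yes", "No", "No"),
  breathing = c("Yes", "Yes", "No"),
  ascites   = c("No", "No", "Yes"),
  spleen    = c("Yes", "Yes", "Yes"),
  Hepato    = c("No", "No", "No")
)

is_yes   <- df1 == "Yes"        # logical matrix, TRUE where the cell is "Yes"
n_yes    <- rowSums(is_yes)     # number of "Yes" values in each row
any_yes  <- n_yes > 0           # TRUE if the row has at least one "Yes"
idx      <- any_yes + 1         # FALSE/TRUE becomes 1/2
umbrella <- c("No", "Yes")[idx] # 1 picks "No", 2 picks "Yes"
```

Every row in this sample has at least one "Yes" (spleen is "Yes" throughout), which is why all three results are "Yes".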

Or another option is Reduce with lapply

df1$umbrella <- c("No", "Yes")[(Reduce(`|`, lapply(df1, `==`, "Yes"))) + 1]
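As a rough illustration of what the Reduce version does: lapply builds one logical vector per column, and Reduce chains them together with `|`. The sketch below checks that this matches the comparison written out by hand (flags, any_yes and manual are illustrative names).

```r
# Sample data, same values as in the "data" section of this answer
df1 <- data.frame(
  bleed     = c("Yes", "No", "No"),
  breathing = c("Yes", "Yes", "No"),
  ascites   = c("No", "No", "Yes"),
  spleen    = c("Yes", "Yes", "Yes"),
  Hepato    = c("No", "No", "No")
)

# One logical vector per column: is this value "Yes"?
flags <- lapply(df1, `==`, "Yes")

# Combine the vectors element-wise with OR
any_yes <- Reduce(`|`, flags)

# The same thing spelled out column by column
manual <- df1$bleed == "Yes" | df1$breathing == "Yes" |
  df1$ascites == "Yes" | df1$spleen == "Yes" | df1$Hepato == "Yes"

identical(any_yes, manual)
```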

Or with apply

c("No", "Yes")[1 + apply(df1 == "Yes", 1, FUN = any)]
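All of the options above compare every column of df1, which works because the example contains only the five complication columns. If the real dataset has other columns or missing values, something like the sketch below may be closer to what is needed. The id column, the fourth all-"No" row and the NA are hypothetical additions for illustration, and na.rm = TRUE is an assumption that a missing value should not count as a "Yes".

```r
# Hypothetical extended data: an id column, one NA, and an all-"No" row
df2 <- data.frame(
  id        = 1:4,
  bleed     = c("Yes", "No", "No", "No"),
  breathing = c("Yes", "Yes", "No", "No"),
  ascites   = c("No", "No", "Yes", "No"),
  spleen    = c("Yes", "Yes", NA, "No"),
  Hepato    = c("No", "No", "No", "No")
)

# Only these columns feed into the umbrella variable
cols <- c("bleed", "breathing", "ascites", "spleen", "Hepato")

# Count "Yes" per row over the chosen columns, ignoring NA cells
any_yes <- rowSums(df2[cols] == "Yes", na.rm = TRUE) > 0
df2$umbrella <- c("No", "Yes")[any_yes + 1]
df2$umbrella
```

Note that with na.rm = TRUE a row whose values are all "No" or NA ends up as "No"; if such rows should be NA instead, that case would need separate handling.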

data

df1 <- structure(list(
  bleed     = c("Yes", "No", "No"),
  breathing = c("Yes", "Yes", "No"),
  ascites   = c("No", "No", "Yes"),
  spleen    = c("Yes", "Yes", "Yes"),
  Hepato    = c("No", "No", "No")),
  class = "data.frame", row.names = c(NA, -3L))

Upvotes: 2
