Is there a way to identify rows that match a condition several times across several columns in R?

Question

I have a dataset of questionnaires filled in by patients. I want to identify patients using diagnostic criteria; the criterion I'm struggling with requires at least 3 answers of >= 3 (the questions are Likert items scored from 1 to 5).

An MWE of the dataset I'm working on is presented below:

data <- structure(list(q1 = c(1, 2, 3, 1, 1, 1, 1, 3, 1, 1), q2 = c(1, 
 1, 3, 1, 1, 1, 1, 3, 1, 1), q3 = c(1, 1, 1, 1, 3, 3, 1, 1, 
 1, 1), q4 = c(1, 2, 2, 1, 1, 3, 1, 3, 1, 1), q5 = c(1, 1, 
 3, 1, 1, 1, 1, 1, 1, 1)), row.names = c(NA, -10L), class = c("tbl_df", 
 "tbl", "data.frame"))

I've figured out how to identify observations with at least one value >= 3 using the code below (I do not use all_vars, as my dataset is larger than the MWE):

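library(dplyr)
# Keep rows where at least one answer is 3, 4, or 5
data.match <- data %>%
  filter_at(vars(q1, q2, q3, q4, q5), any_vars(. %in% 3:5))
# Flag matching patients (relies on an id column that the MWE above omits)
data$diagnostic <- ifelse(data$id %in% data.match$id, 1, 0)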

The last line then flags the matching patients in the original data. However, I have not been able to adapt this strategy to identify patients with a given number of qualifying values across columns. In this specific example, I'd like to identify patients 3 and 8. I've tried using rowSums, but it seems to me that the number of possible combinations is too high.

Ronak Shah · Accepted Answer

With dplyr, you can combine rowwise with c_across:

library(dplyr)

result <- data %>%
  rowwise() %>%
  mutate(diagnostic = as.integer(sum(c_across(starts_with('q')) >= 3) >= 3)) 

result

#      q1    q2    q3    q4    q5 diagnostic
#   <dbl> <dbl> <dbl> <dbl> <dbl>      <int>
# 1     1     1     1     1     1          0
# 2     2     1     1     2     1          0
# 3     3     3     1     2     3          1
# 4     1     1     1     1     1          0
# 5     1     1     3     1     1          0
# 6     1     1     3     3     1          0
# 7     1     1     1     1     1          0
# 8     3     3     1     3     1          1
# 9     1     1     1     1     1          0
#10     1     1     1     1     1          0
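
Since the question mentions rowSums, here is a minimal base R sketch of the same count-then-threshold idea. It is an alternative to the rowwise approach above, not part of the original answer. There is no need to enumerate combinations: comparing the question columns with 3 gives a logical matrix, and rowSums counts the TRUE values in each row. The names q_cols and n_high are chosen here for illustration, and the sketch assumes every question column name starts with "q".

```r
# Same MWE as in the question, rebuilt as a plain data frame
data <- data.frame(
  q1 = c(1, 2, 3, 1, 1, 1, 1, 3, 1, 1),
  q2 = c(1, 1, 3, 1, 1, 1, 1, 3, 1, 1),
  q3 = c(1, 1, 1, 1, 3, 3, 1, 1, 1, 1),
  q4 = c(1, 2, 2, 1, 1, 3, 1, 3, 1, 1),
  q5 = c(1, 1, 3, 1, 1, 1, 1, 1, 1, 1)
)

# Assumption: all question columns start with "q"
q_cols <- grep("^q", names(data), value = TRUE)

# Count, per row, how many answers are >= 3
n_high <- rowSums(data[q_cols] >= 3)

# Flag rows with at least 3 such answers
data$diagnostic <- as.integer(n_high >= 3)

which(data$diagnostic == 1)
# [1] 3 8
```

If the real data contain missing answers, passing na.rm = TRUE to rowSums counts only the answered questions; whether that suits the diagnostic criterion is a judgment call.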
