rowsums with multiple conditions

Question

I'm trying to calculate cumulative sum in rows with several variables.

This is my data as example. I have 5 patients ID, and 4 condition variables. If there is value between '1 to 3' in conditions, cumsum will be added 1.

ID<-c("a","b","c","d","e")
cond1<-as.factor(sample(x=1:7,size=5,replace=TRUE))
cond2<-as.factor(sample(x=1:7,size=5,replace=TRUE))
cond3<-as.factor(sample(x=1:7,size=5,replace=TRUE))
cond4<-as.factor(sample(x=1:7,size=5,replace=TRUE))
df<-data.frame(ID,cond1,cond2,cond3,cond4)
df

  ID cond1 cond2 cond3 cond4
1  a     2     7     6     6
2  b     7     2     3     6
3  c     4     3     1     4
4  d     7     3     3     6
5  e     6     7     7     3

I use rowSums code with following statement. However, as 2nd row, though cond2 is 2 and cond3 is 3, the cumsum was not '2', '1'. 4nd row has same problem.

df$cumsum<-rowSums(df[,2:5]==c(1,2,3),na.rm=TRUE)
df
  ID cond1 cond2 cond3 cond4 cumsum
1  a     2     7     6     6      0
2  b     7     2     3     6      1
3  c     4     3     1     4      1
4  d     7     3     3     6      1
5  e     6     7     7     3      0

How to make it cumulative? I would really appreciate all your help.

akrun · Accepted Answer

For more than 1 element comparison, use %in%, but %in% works on a vector. So, we loop through the columns with lapply/sapply and then do the rowSums on the logical matrix

df$RSum <- rowSums(sapply(df[,2:5], `%in%`, 1:3))
df$RSum
#[1] 1 2 2 2 1

If the values were numeric, then we could also make use of > or <

df$RSum <- rowSums(df[, 2:5] >=1 & df[, 2:5] <=3)

data

df <- structure(list(ID = c("a", "b", "c", "d", "e"), cond1 = c(2L, 
7L, 4L, 7L, 6L), cond2 = c(7L, 2L, 3L, 3L, 7L), cond3 = c(6L, 
3L, 1L, 3L, 7L), cond4 = c(6L, 6L, 4L, 6L, 3L)), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))

rowsums with multiple conditions

Answers (2)

data

Related Questions