SMM

Reputation: 193

Row-wise count of values that fulfill a condition

I want to generate a new variable that counts, for each row, how many columns satisfy a criterion (such as ==, < or >). The function needs to handle NA values.

Sample data with some missing values:

x <- seq(10, 20)
y <- seq(12, 22)
y[4] <- NA
z <- seq(14, 24)
z[c(4,5)] <- NA
data <- cbind(x, y, z)
#        x  y  z
# [1,]  10 12 14
# [2,]  11 13 15
# [3,]  12 14 16
# [4,]  13 NA NA
# [5,]  14 16 NA
# [6,]  15 17 19
# [7,]  16 18 20
# [8,]  17 19 21
# [9,]  18 20 22
# [10,] 19 21 23
# [11,] 20 22 24

In this example, I want a variable, "less16", that counts the values in each row that are < 16 across columns "x", "y" and "z". Desired result for the first few rows:

 x   y   z  less16
10  12  14       3
11  13  15       3
12  14  16       2
13  NA  NA       1
14  16  NA       1
etc

I've tried rowSum, sum, which, for loops using if and else, all to no avail so far. Any advice would be greatly appreciated. Thanks in advance.

Upvotes: 4

Views: 8812

Answers (2)

Jesse Anderson

Reputation: 4603

rowSums has the argument na.rm:

# data is a matrix here, so append the count with cbind() rather than `$`
data <- cbind(data, less16 = rowSums(data < 16, na.rm = TRUE))
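
As a minimal sketch on the sample data from the question: the comparison data < 16 produces a logical matrix in which missing inputs stay NA, and rowSums with na.rm = TRUE then counts only the TRUE entries in each row. The standalone vector name less16 is just taken from the question for illustration.

```r
# Sample data from the question: a numeric matrix with some NAs
x <- seq(10, 20)
y <- seq(12, 22); y[4] <- NA
z <- seq(14, 24); z[c(4, 5)] <- NA
data <- cbind(x, y, z)

# data < 16 is a logical matrix; NA entries stay NA and are
# skipped by na.rm = TRUE, so only TRUE values are counted
less16 <- rowSums(data < 16, na.rm = TRUE)
less16
# [1] 3 3 2 1 1 1 0 0 0 0 0
```

The first five values match the desired result in the question (3, 3, 2, 1, 1).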

Upvotes: 8

joran

Reputation: 173657

Many of these functions, including sum, have an na.rm argument for excluding NA values:

apply(data, 1, function(x) sum(x < 16, na.rm = TRUE))
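
A rough sketch of how this compares with rowSums on the question's sample data, and how the same pattern might extend to other criteria or to a subset of columns. The names by_apply, by_rowsums, cols and eq16 are hypothetical and chosen only for this example.

```r
# Sample data from the question
x <- seq(10, 20)
y <- seq(12, 22); y[4] <- NA
z <- seq(14, 24); z[c(4, 5)] <- NA
data <- cbind(x, y, z)

# Row-by-row count with apply() versus the vectorised rowSums()
by_apply   <- apply(data, 1, function(row) sum(row < 16, na.rm = TRUE))
by_rowsums <- rowSums(data < 16, na.rm = TRUE)
all(by_apply == by_rowsums)
# [1] TRUE

# Same idea with a different criterion, restricted to some columns.
# drop = FALSE keeps the matrix shape even if only one column is selected.
cols <- c("x", "y")
eq16 <- rowSums(data[, cols, drop = FALSE] == 16, na.rm = TRUE)
```

Both approaches give the same counts here; rowSums avoids an explicit per-row function call, which tends to matter more on large inputs.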

Upvotes: 6
