reviewer3

Reputation: 253

How to generate a vector or factor from a numeric matrix based on logic (no for loop)

I have a numeric matrix of 30,000 rows and 3 columns. I would like to generate a simple PASS/FAIL vector (or factor) based on the 3 values in each row of the matrix. I would like to apply the following logic:

If all 3 values in a row are > 3, enter PASS; otherwise enter FAIL.

I know how to do this with a for loop, but how could I do it faster? I have dozens of these matrices... Thank you!

as.matrix(rbind(c(129,129,120),c(135,97,96),c(0,0,0),c(39,4,2)))

desired output: PASS, PASS, FAIL, FAIL

Upvotes: 3

Views: 211

Answers (5)

John

Reputation: 23758

Unlike the other answers here, this one uses rowSums, which does not loop at the R level and can outrun multiple subsets and logical comparisons. It's probably the fastest route.

mat <- as.matrix(rbind(c(129,129,120),c(135,97,96),c(0,0,0),c(39,4,2)))

vec <- ifelse(rowSums(mat > 3) == 3, TRUE, FALSE)

We could also bypass ifelse and make it even faster.

vec <- rowSums(mat > 3) == 3

If you time these, the second version will probably be the winner. On my system, using 30,000-row matrices, my first version comes out about twice as fast as gung's answer, and the second comes out about 10x as fast; it can run on 1000 matrices of 30,000 rows each in about 2 seconds. Codoremifa's answer is the fastest data.table-based answer here, and it takes 20 s (similar to gung's answer).
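For anyone who wants to repeat the comparison, here is a rough timing sketch in base R. The random test matrix and the helper names (big, via_rowsums, via_columns, via_apply) are made up for illustration, and the timings will differ from machine to machine, so treat it as a starting point rather than a rigorous benchmark.

```r
# Hypothetical test data: 30,000 rows x 3 columns of random integers in 0..10
set.seed(1)
big <- matrix(sample(0:10, 30000 * 3, replace = TRUE), ncol = 3)

# Three ways to compute the same row-wise test (names are illustrative)
via_rowsums <- function(m) rowSums(m > 3) == 3
via_columns <- function(m) m[, 1] > 3 & m[, 2] > 3 & m[, 3] > 3
via_apply   <- function(m) apply(m, 1, function(x) all(x > 3))

# Check that the approaches agree before comparing their speed
stopifnot(identical(via_rowsums(big), via_columns(big)),
          identical(via_rowsums(big), via_apply(big)))

# Elapsed time for 100 repetitions of each approach
system.time(for (i in 1:100) via_rowsums(big))
system.time(for (i in 1:100) via_columns(big))
system.time(for (i in 1:100) via_apply(big))
```

A dedicated benchmarking package would give more reliable numbers, but system.time is enough to see the order-of-magnitude gaps.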

NOTE: I kind of ignored your request for a "PASS", "FAIL" vector since you seemed to indicate speed was of paramount importance and it's a trivial semantic distinction. Furthermore, the logical vector is already prepared to subset the matrices if necessary.
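If the labels are needed after all, the logical vector converts cheaply. The following is a minimal sketch using the example matrix from the question; the names status, status_chr, and passing are illustrative only.

```r
mat <- rbind(c(129, 129, 120), c(135, 97, 96), c(0, 0, 0), c(39, 4, 2))
vec <- rowSums(mat > 3) == 3

# Map FALSE -> "FAIL" and TRUE -> "PASS"; giving the levels explicitly keeps
# the mapping correct even when only one of the two outcomes occurs
status <- factor(vec, levels = c(FALSE, TRUE), labels = c("FAIL", "PASS"))
status
# [1] PASS PASS FAIL FAIL
# Levels: FAIL PASS

# A plain character vector, via integer indexing (FALSE + 1 = 1, TRUE + 1 = 2)
status_chr <- c("FAIL", "PASS")[vec + 1L]

# The logical vector itself can subset the matrix to the passing rows
passing <- mat[vec, , drop = FALSE]
```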

Upvotes: 4

gung - Reinstate Monica

Reputation: 11893

For problems like this, my first inclination is to combine ?all, ?apply, & ?ifelse, perhaps like the solution @Ananda provides. As he mentions, apply() is using a loop. If you want a completely vectorized solution, you could try:

newVector <- ifelse((xMatrix[,1]>3 & xMatrix[,2]>3 & xMatrix[,3]>3), 
                    "PASS", "FAIL")

Vectorization is a handy feature of R, and it is much faster than loops. You can read about vectorization here.
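Since the question mentions dozens of matrices, one way to reuse the vectorized test is to wrap it in a function and run it over a list. This is only a sketch: the function name pass_fail, its cutoff argument, and the second example matrix are assumptions made for illustration.

```r
# Hypothetical helper: the same column-wise comparison, with the threshold
# exposed as an argument (defaulting to the 3 used in the question)
pass_fail <- function(m, cutoff = 3) {
  ifelse(m[, 1] > cutoff & m[, 2] > cutoff & m[, 3] > cutoff, "PASS", "FAIL")
}

# A named list standing in for the "dozens of matrices"
mats <- list(
  a = rbind(c(129, 129, 120), c(135, 97, 96), c(0, 0, 0), c(39, 4, 2)),
  b = rbind(c(1, 5, 9), c(4, 4, 4))
)

# One character vector of labels per matrix
results <- lapply(mats, pass_fail)
results$a
# [1] "PASS" "PASS" "FAIL" "FAIL"
results$b
# [1] "FAIL" "PASS"
```

The lapply call does loop over the list, but only once per matrix; the work inside each matrix stays vectorized.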

Upvotes: 1

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

Use all and apply (though apply uses its own loops internally).

m <- as.matrix(rbind(c(129,129,120),c(135,97,96),c(0,0,0),c(39,4,2)))

apply(m, 1, function(x) all(x > 3))
# [1]  TRUE  TRUE FALSE FALSE

If you really want "PASS" and "FAIL" instead, you can factor the result of the apply step.

factor(apply(m, 1, function(x) all(x > 3)), 
       levels = c(FALSE, TRUE), 
       labels = c("FAIL", "PASS"))
# [1] PASS PASS FAIL FAIL
# Levels: FAIL PASS

Extending Codoremifa's answer a little, a similar approach works with data.table, especially since you specify that you want a vector or factor as the output.

library(data.table)
DT <- data.table(m)
DT[, all(.SD > 3), by = 1:nrow(DT)][, factor(V1, levels = c(FALSE, TRUE), labels = c("FAIL", "PASS"))]
# [1] PASS PASS FAIL FAIL
# Levels: FAIL PASS

Upvotes: 5

alexis_laz

Reputation: 13122

Also, mapply:

mat <- as.matrix(rbind(c(129,129,120),c(135,97,96),c(0,0,0),c(39,4,2)))

fun <- function(x, y, z) { ifelse(x > 3 & y > 3 & z > 3, "PASS", "FAIL") } 
mapply(fun, mat[,1], mat[,2], mat[,3])
#[1] "PASS" "PASS" "FAIL" "FAIL"

Upvotes: 2

TheComeOnMan

Reputation: 12875

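library(data.table)
dt <- as.matrix(rbind(c(129,129,120),c(135,97,96),c(0,0,0),c(39,4,2)))

# Convert to a data.table; the unnamed columns get the default names V1, V2, V3
dt <- data.table(dt)
# Default every row to "FAIL", then overwrite the rows where all three values exceed 3
dt[, Indicator := "FAIL"]
dt[V1 > 3 & V2 > 3 & V3 > 3, Indicator := "PASS"]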

Upvotes: 2
