user2806363

Reputation: 2593

How to filter out matrix rows with entries less than a specific value

I'm dealing with a very big matrix. I want to keep only the rows in which 90% of the entries are bigger than 10. Since I'm not very familiar with R, could someone help me implement this?

Upvotes: 1

Views: 2504

Answers (2)

nico

Reputation: 51640

You can use apply with all to check which rows have all elements > 10. For the 90% case, compute the fraction of elements > 10 in each row instead:

big.mat <- matrix(rnorm(1000000, 20, 8), 1000, 1000)
# Apply a function to each row of the matrix
# (the 1 passed to apply means rows; 2 would mean columns).
# all() returns TRUE only if every element of the vector
# passed to it is TRUE.
good.lines <- apply(big.mat, 1, function(x) all(x > 10))
# Rows in which more than 90% of the elements are > 10
good.lines.90 <- apply(big.mat, 1, function(x) sum(x > 10) / length(x) > 0.9)

filtered.mat <- big.mat[good.lines,]
filtered.mat.90 <- big.mat[good.lines.90,]
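
One detail worth checking is whether "90%" should include rows that sit exactly at 90%. The comparison above is strict, so such a row would be dropped. Here is a minimal sketch on a small hand-made matrix (the names m, n.good, strict and inclusive are only for illustration) that shows the difference by counting matches per row:

```r
# Two rows of 10 entries each, built so the counts are easy to verify by eye
m <- rbind(
  c(5, rep(20, 9)),  # 9 of 10 entries > 10 (exactly 90%)
  rep(20, 10)        # 10 of 10 entries > 10
)

# Number of entries > 10 in each row
n.good <- apply(m, 1, function(x) sum(x > 10))

strict    <- n.good >  0.9 * ncol(m)  # drops the row at exactly 90%
inclusive <- n.good >= 0.9 * ncol(m)  # keeps it

m[inclusive, ]
```

If the intent is "at least 90%", the inclusive form is probably the one you want.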

Upvotes: 3

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

I would just use rowSums and common comparison operators.

Here's a minimal example:

set.seed(1); m <- matrix(sample(50, 100, TRUE), ncol = 10)
rowSums(m > 10) == ncol(m)
#  [1]  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
m[rowSums(m > 10) == ncol(m), ]
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]   14   11   47   25   42   24   46   17   22    12
# [2,]   29   35   33   25   40   22   23   18   20    33

To keep the rows in which at least a given fraction of the entries (here 90%) exceed 10, try something like:

m[rowSums(m > 10) >= (.9 * ncol(m)), ]
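
If this is needed more than once, the same idea could be wrapped in a small function. The sketch below is only one possible way to do it: keep_rows and its arguments are hypothetical names, not part of base R. It assumes that NA entries should count as failing the test, and it uses drop = FALSE so the result stays a matrix even when a single row survives.

```r
# Hypothetical helper: keep rows where at least `frac` of the
# entries are greater than `threshold`.
keep_rows <- function(m, threshold = 10, frac = 0.9) {
  # Count entries above the threshold; NA comparisons are ignored,
  # so NA entries are treated as not exceeding the threshold (assumption)
  hits <- rowSums(m > threshold, na.rm = TRUE)
  # drop = FALSE prevents a one-row result from collapsing to a vector
  m[hits >= frac * ncol(m), , drop = FALSE]
}

m <- rbind(
  c(11, 12, 13, 14),  # 4 of 4 entries > 10
  c( 1, 12, 13, 14),  # 3 of 4 entries > 10
  c(11, NA, 13, 14)   # 3 of 4 entries > 10, the NA counts as a miss
)

kept <- keep_rows(m, frac = 0.75)  # all three rows reach 75%
one  <- keep_rows(m, frac = 1)     # only the first row; still a matrix
```

Without drop = FALSE, a filter that matches exactly one row returns a plain vector, which tends to break later code that expects matrix dimensions.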

Upvotes: 3
