pingu87

Reputation: 113

Fastest way to filter rows and columns in R

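I have a big matrix (x) with about 1'500'000 rows and 7000 columns. I would like to filter it in R, keeping only the rows with more than 5 values greater than 1 and the columns with more than 100 values greater than 1.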

I tried the following code but it takes too long to run (it is stuck on the first command):

x_rows <- apply(x, 1, function(y) sum(y>1))
x_column <- apply(x, 2, function(y) sum(y>1))
x_f <- x[x_rows>5, x_column>100]

Thank you in advance

Upvotes: 1

Views: 180

Answers (1)

GKi

Reputation: 39647

Try rowSums and colSums. They are vectorized, so they avoid calling an R function once per row or column as apply does:

set.seed(42)
x <- matrix(rnorm(150000*7000), 150000, 7000)

system.time({
  x_rows <- rowSums(x>1)
  x_column <- colSums(x>1)
  x_f <- x[x_rows>5, x_column>100]
})
#       user      system     elapsed
#     10.534       1.620      12.158
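
One thing to watch at the size in the question: x>1 builds a logical matrix with the same dimensions as x, and R stores logicals in 4 bytes each, so that temporary can itself be very large. Below is a minimal sketch that accumulates the same counts over blocks of columns instead. The helper name count_above, its block argument, the small test matrix, and the column threshold of 30 are all assumptions chosen for illustration, not part of the answer above.

```r
# Small stand-in matrix; the sizes and thresholds here are illustrative only.
set.seed(1)
m <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Hypothetical helper: count values > threshold per row and per column,
# handling `block` columns at a time so that the logical temporary stays small.
# Assumes m has at least one column.
count_above <- function(m, threshold = 1, block = 10L) {
  row_counts <- numeric(nrow(m))
  col_counts <- numeric(ncol(m))
  for (start in seq(1L, ncol(m), by = block)) {
    idx <- start:min(start + block - 1L, ncol(m))
    hit <- m[, idx, drop = FALSE] > threshold  # logical block, not full matrix
    row_counts <- row_counts + rowSums(hit)    # accumulate across blocks
    col_counts[idx] <- colSums(hit)            # each column is seen only once
  }
  list(rows = row_counts, cols = col_counts)
}

counts <- count_above(m)

# The blockwise counts agree with the apply() version from the question.
stopifnot(all(counts$rows == apply(m, 1, function(y) sum(y > 1))))
stopifnot(all(counts$cols == apply(m, 2, function(y) sum(y > 1))))

# drop = FALSE keeps the result a matrix even if one row or column survives.
m_f <- m[counts$rows > 5, counts$cols > 30, drop = FALSE]
```

A larger block means fewer loop iterations but a bigger temporary, so the value to use depends on the available memory.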

Upvotes: 4
