persephone
persephone

Reputation: 420

R memory-efficient squaring of a sparse matrix

Suppose I have a large sparse matrix and want to square it.

However, m %*% m takes up alot of memory, e.g. 20-40 Gb.

Are there more memory efficient ways to do it? E.g. packages, etc.?

Upvotes: 1

Views: 140

Answers (1)

GKi
GKi

Reputation: 39737

Maybe sparseMatrix from Matrix will be memory efficient.

library(Matrix)
m <- sparseMatrix(1:5, 1:5, x=1:5)
m %*% m
#m^2 #Alternative
#5 x 5 sparse Matrix of class "dgCMatrix"
#                
#[1,] 1 . .  .  .
#[2,] . 4 .  .  .
#[3,] . . 9  .  .
#[4,] . . . 16  .
#[5,] . . .  . 25

Another option could be the slam package:

library(slam)
m <- simple_triplet_matrix(1:5, 1:5, 1:5)
m^2

or spray:

library(spray)
m <- spray(cbind(1:5, 1:5), 1:5)
m$value <- m$value^2

Comparing the methods:

n <- 1e3
bench::mark(check = FALSE
          , base = {m <- diag(1:n); m %*% m}
          , base2 = {m <- diag(1:n); m^2}
          , Matrix = {m <- sparseMatrix(1:n, 1:n, x=1:n); m %*% m}
          , Matrix2 = {m <- sparseMatrix(1:n, 1:n, x=1:n); m^2}
          , slam = {m <- simple_triplet_matrix(1:n, 1:n, 1:n); m^2}
          , spray = {m <- spray(cbind(1:n, 1:n), 1:n); m$value <- m$value^2; m}
            )
#  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
#1 base       711.65ms 711.65ms      1.41    26.7MB     1.41     1     1
#2 base2        2.48ms   2.66ms    236.      11.4MB    76.7    123    40
#3 Matrix     687.17µs 712.63µs   1379.      63.1KB     4.00   690     2
#4 Matrix2    610.89µs 637.48µs   1541.      55.2KB     4.00   771     2
#5 slam         36.1µs  39.18µs  15309.      19.7KB     4.00  7652     2
#6 spray        2.49ms   2.76ms    338.      81.3KB     5.97   170     3

It looks like that slam is most efficient in memory usage and also in speed.

Upvotes: 2

Related Questions