user7105520
user7105520

Reputation: 43

speed up vector multiplication

I would like to speed up the below R function. For each column in matrix 'A', find the index (not itself) for which its product with another element of the vector and the respective element in symmetric correlation matrix R is maximised.

Currently, there is some redundancy in calculating the outer product as it unnecessarily generates the full matrix. Also, the loop (i.e. 'apply') should ideally be vectorised.

Example data below.

    set.seed(123)
    A <- matrix(rexp(30000, rate=.1), nrow=3000, ncol=2000)/100
    R <- matrix( runif(10000), 3000 , 3000 )
    diag(R) <- 1
    R[upper.tri(R)] <- R[lower.tri(R)]

    function_which_is_too_slow <- function(index){
        aar <- outer(A[,index], A[,index]) * R
        diag(aar) <- 0
        return(max.col(aar, 'first'))
    }

    out <- apply(matrix(1:dim(A)[2]), 1, function_which_is_too_slow)

Upvotes: 1

Views: 217

Answers (1)

Ralf Stubner
Ralf Stubner

Reputation: 26833

Here your code as base line with a smaller problem size:

set.seed(123)
A <- matrix(rexp(30000, rate=.1), nrow=3000, ncol=40)/100
R <- matrix( runif(10000), 3000 , 3000 )
diag(R) <- 1
R[upper.tri(R)] <- R[lower.tri(R)]

function_which_is_too_slow <- function(index){
  aar <- outer(A[,index], A[,index]) * R
  diag(aar) <- 0
  return(max.col(aar, 'first'))
}

system.time(apply(matrix(1:dim(A)[2]), 1, function_which_is_too_slow))
#>        User      System verstrichen 
#>      12.001      11.388      10.348

Setting the diagonal to zero every time is unnecessary, if we use a copy of the correlation matrix with the diagonal set to zero. Using lapply instead of apply just looks nicer:

Rp <- R
diag(Rp) <- 0

faster_function <- function(index){
  aar <- outer(A[,index], A[,index]) * Rp
  return(max.col(aar, 'first'))
}
system.time(lapply(1:ncol(A), faster_function))
#>        User      System verstrichen 
#>      11.156      10.306       8.334

We can also use RcppArmadillo to do the same computations in C++

Rcpp::cppFunction(code = "
arma::uvec arma_function(const arma::mat &A, const arma::mat &Rp, int index) {
  arma::mat aar = A.col(index-1) * A.col(index-1).t() % Rp;
  return index_max(aar, 1) + 1;
} 
", depends ="RcppArmadillo")
system.time(lapply(1:ncol(A), arma_function, A = A, Rp = Rp))
#>        User      System verstrichen 
#>       7.382      10.578       4.874

And we can parallelize the computations, though RcppArmadillo already uses OpenMP if available:

system.time(parallel::mclapply(1:ncol(A), arma_function, A = A, Rp = Rp))
#>        User      System verstrichen 
#>       0.008       0.010       3.903

Overall, about 3 times faster, which isn't a lot.

Upvotes: 2

Related Questions