Khashaa
Khashaa

Reputation: 7373

Speeding up the evaluation of quadratic form

My question is yet another "Vectorize this!". Similar question appeared elsewhere (Efficient way of calculating quadratic forms: avoid for loops?), but somehow I can't seem to make it work for my case.

I want to calculate quadratic form x'Sxfor every p-dimensional observation x in the sample of size n. I couldn't figure out a nice, vectorized code, so my final resort is to do it by for loop. Following toy example is for p=2, n=100.

set.seed(123)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x <- cbind(x1,x2)
Sigma <- matrix(c(1, 2, 3, 4), ncol = 2)
z  <- rep(0, n)
for (i in 1:n) {
   z[i]  <- x[i, ] %*% solve(Sigma, x[i, ]) #quadratic form of x'S^{-1}x
}

Like many other R users who worship vectorized codes, the use of for loop caused emotional pain. In order to ease the pain, I modified my code using a couple of common vectorization technique.

ap <- function(Sigma, x) apply(x, 1, function(x) x %*% solve(Sigma, x))
lap <- function(Sigma, x) unlist(lapply(1:n, function(i) x[i, ] %*% solve(Sigma, x[i, ])))
loop <- function(Sigma, x){
  z  <- rep(0, n)
  for (i in 1:n) {
    z[i]  <- x[i, ] %*% solve(Sigma, x[i, ])
  }
  z
}

But the speed comparison shows nothing much is gained.

library(microbenchmark)
microbenchmark(lap(Sigma, x), ap(Sigma, x), loop(Sigma, x))

# Unit: milliseconds
#           expr      min       lq     mean   median       uq       max neval
#  lap(Sigma, x) 4.207434 4.444895 5.092389 4.616912 5.283504  8.440802   100
#   ap(Sigma, x) 4.360204 4.523306 5.317304 4.685396 5.412771 10.168674   100
# loop(Sigma, x) 4.518645 4.679317 6.204626 4.827831 5.438908 94.115144   100

Is there any room for improvement, or should I go to Rcpp for freeing myself from sin of using for loops?

Upvotes: 4

Views: 662

Answers (3)

Benjamin Christoffersen
Benjamin Christoffersen

Reputation: 4841

The following is not the case in your example here but might be in many other cases. Presume that your matrix is positive definite. E.g. let Sigma <- matrix(c(1, 1, 1, 4), ncol = 2). Then you can back solve after having computed a Cholesky decomposition which is numerically a better idea and also faster in this case:

lap <- function(Sigma, x) unlist(lapply(1:n, function(i) x[i, ] %*% solve(Sigma, x[i, ])))

bench::mark(lap = lap(Sigma, x),
            product = diag(x %*% solve(Sigma) %*%t(x)),
            `product smarter` = diag(x %*% tcrossprod(solve(Sigma), x)),
            colsum = colSums(t(x) * (solve(Sigma) %*% t(x))),
            `colsum (smarter)` = rowSums(x %*% solve(Sigma) * x),
            backsolve = {
              ch <- chol(Sigma)
              colSums(backsolve(ch, t(x), transpose = TRUE)^2)
            })
#R> # A tibble: 6 x 13
#R>   expression            min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result  memory     time     gc       
#R>   <bch:expr>       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>  <list>     <list>   <list>   
#R> 1 lap               800.1µs   878µs     1113.    1.66KB     12.7   527     6      474ms <dbl [… <Rprofmem… <bch:tm… <tibble …
#R> 2 product            30.4µs    32µs    29895.   83.92KB     53.9  9982    18      334ms <dbl [… <Rprofmem… <bch:tm… <tibble …
#R> 3 product smarter    28.2µs    30µs    32300.   82.31KB     61.5  9981    19      309ms <dbl [… <Rprofmem… <bch:tm… <tibble …
#R> 4 colsum             20.5µs  22.3µs    44005.    5.66KB     13.2  9997     3      227ms <dbl [… <Rprofmem… <bch:tm… <tibble …
#R> 5 colsum (smarter)   15.1µs  16.2µs    60101.    4.05KB     12.0  9998     2      166ms <dbl [… <Rprofmem… <bch:tm… <tibble …
#R> 6 backsolve            15µs  16.3µs    59162.    5.66KB     11.8  9998     2      169ms <dbl [… <Rprofmem… <bch:tm… <tibble …

Notice also the faster version of the colSums version posted in this answer where one avoid the transpose call.

Upvotes: 0

Khashaa
Khashaa

Reputation: 7373

Thank you very much for the concern, @konvas, @beginneR.

I was mistakenly calculating x'S^{-1}x whereas the answer given by @Ben Bolker in the above link calculates x'Sx. colSums(t(x) * (solve(Sigma) %*% t(x))) solved my problem altogether.

I am now even more devoted worshipper of the vectorized code.

microbenchmark(lap(Sigma, x), 
               product = diag(x %*% solve(Sigma) %*%t(x)), 
               colsum = colSums(t(x) * (solve(Sigma) %*% t(x)))
               )
#Unit: microseconds
#          expr      min        lq      mean    median        uq      max neval
# lap(Sigma, x) 4283.616 4384.9215 4961.9761 4475.6920 4700.3885 9472.096   100
#       product  126.835  134.3315  165.3030  144.0575  199.5725  306.349   100
#        colsum   92.391  102.9265  130.0202  110.8285  146.2855  346.061   100

Upvotes: 0

konvas
konvas

Reputation: 14346

If you store the rows of x in a list and use vapply instead of lapply, you can speed this up a bit as follows

# First, make a list of the rows of x
xl <- vector("list",nrow(x))
for (i in seq_along(xl)) xl[[i]] <- x[i, ] 

# Apply solve
solve.mat <- vapply(xl, solve, numeric(2), a = Sigma)
# Take the dot product of each pair of elements
result <- colSums(solve.mat * t(x))
all(result == lap(Sigma, x))
# [1] TRUE

Writing it in one step and comparing

library(microbenchmark)
microbenchmark(lap = lap(Sigma, x),
    csums = colSums(vapply(xl, solve, numeric(2), a = Sigma) * t(x)))
# Unit: milliseconds
#   expr      min       lq     mean   median       uq      max neval
#    lap 3.013343 3.050855 3.164558 3.097901 3.136355 4.206923   100
#  csums 2.224350 2.263772 2.354349 2.289751 2.317672 3.660294   100

Upvotes: 1

Related Questions