MOHAMMED
MOHAMMED

Reputation: 400

R: Writing double summations in matrix form

enter image description here

I would like to get rid of the summation over $t$ and implement that in R. So far, I have this

Z_diff = mapply(FUN = function(i) {Z-Z[i]}, i=1:n, SIMPLIFY = TRUE)
rss = 0
for(t in 1:n){
    rss = rss + t(Y - X %*% B[,t]) %*% (diag(c(Z_diff[,t])) %*% (Y - X %*% B[,t]))
  }

Upvotes: 1

Views: 483

Answers (1)

G. Grothendieck
G. Grothendieck

Reputation: 269734

Below we define a test input and compute the rss using the original formula in the question. Then we show a matrix formula that gives the same result. If you write out the formulas for the components of the matrix in the sum of the last line the reason for the result should be clear.

# input
set.seed(123)
n <- 4
Z <- rnorm(n)
Y <- rnorm(n)
B <- matrix(rnorm(n*n), n)
X <- matrix(rnorm(n*n), n)

# scalar sum
rss <- 0
for(i in 1:n) 
  for(t in 1:n) 
    rss <- rss + (Y[i] - c( crossprod(X[i, ], B[, t])))^2 * (Z[t] - Z[i])
rss
## [1] 43.39269

# matrix sum
ZZ <- t(outer(Z, Z, "-"))
sum(ZZ * (Y - X%*%B)^2)
## [1] 43.39269

Performance

With the data above the matrix formulation runs about 300x faster on my machine.

library(microbenchmark)

microbenchmark(
scalar = {
  rss <- 0
  for(i in 1:n) 
    for(t in 1:n) 
      rss <- rss + (Y[i] - c( crossprod(X[i, ], B[, t])))^2 * (Z[t] - Z[i])
  rss
},
matrix = {
  ZZ <- t(outer(Z, Z, "-"))
  sum(ZZ * (Y - X%*%B)^2)
})
+ })
## Unit: microseconds
##    expr     min       lq      mean   median      uq      max neval cld
##  scalar 34428.2 36966.35 44244.423 38257.80 40968.6 407679.6   100   b
##  matrix    80.8    82.45   148.764   156.45   177.8    721.7   100  a 

Upvotes: 3

Related Questions