MLE
MLE

Reputation: 1043

add a vector to all rows of a matrix

I am maximizing a likelihood function and trying to reduce the loop. I want to add the vector(parameters to be estimated) to all rows of a matrix (data). The length of vector equals to the column of matrix. a+b would give the wrong results because the recycle rule of R is by column not row.

a<-c(1,2,0,0,0)  # parameters to be optimized
b<-matrix(1,ncol=5,nrow=6) # data
t(a+t(b)) # my code would work, anything more intuitive?

Desired output

        [,1] [,2] [,3] [,4] [,5]
    [1,]    2    3    1    1    1
    [2,]    2    3    1    1    1
    [3,]    2    3    1    1    1
    [4,]    2    3    1    1    1
    [5,]    2    3    1    1    1
    [6,]    2    3    1    1    1

The wrong output

a+b
    [,1] [,2] [,3] [,4] [,5]
[1,]    2    3    1    1    1
[2,]    3    1    1    1    2
[3,]    1    1    1    2    3
[4,]    1    1    2    3    1
[5,]    1    2    3    1    1
[6,]    2    3    1    1    1

Upvotes: 13

Views: 12848

Answers (3)

nisetama
nisetama

Reputation: 8863

Another benchmark:

es=2:7

r=sapply(es,function(e){
  m=matrix(rnorm(10*10^e),ncol=10)
  v=rnorm(10)
  b=microbenchmark(times=10,
    t(t(m)+v),
    v[col(m)]+m,
    m+rep(v,each=nrow(m)),
    sweep(m,2,v,"+"),
    m+outer(rep(1,nrow(m)),v),
    collapse::TRA(m,v,"+"),
    Rfast::eachrow(m,v,"+")
  )
  a=aggregate(b$time,list(b$expr),median)
  setNames(a[,2],gsub(" ","",a[,1]))/1e6
})

r2=apply(r,2,function(x)formatC(x,max(0,2-ceiling(log10(min(x,na.rm=T)))),format="f"))
r3=apply(rbind(paste0("1e",es),r2),2,function(x)formatC(x,max(nchar(x)),format="s"))
writeLines(apply(cbind(r3,c("",rownames(r))),1,paste,collapse=" "))

Output (median time in ms, 1e7 means 1e7 rows):

   1e2   1e3  1e4  1e5 1e6  1e7 
0.0102 0.116 1.10 13.4  96 1478 t(t(m)+v)
0.0053 0.105 0.98  9.3  65 1225 v[col(m)]+m
0.0182 0.197 1.93 19.7 173 2044 m+rep(v,each=nrow(m))
0.0397 0.151 1.16 12.9  93 1431 sweep(m,2,v,"+")
0.0088 0.053 0.46  3.1  41  610 m+outer(rep(1,nrow(m)),v)
0.0049 0.037 0.32  3.1  16  402 collapse::TRA(m,v,"+")
0.0043 0.036 0.31  3.3  13  382 Rfast::eachrow(m,v,"+")

Upvotes: 1

Sebastian
Sebastian

Reputation: 1369

These solutions using outer() or collapse::TRA() are is significantly faster than using rep or sweep.

Upvotes: 3

akrun
akrun

Reputation: 887108

We can use col to replicate the 'a' elements

b + a[col(b)]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    2    3    1    1    1
#[2,]    2    3    1    1    1
#[3,]    2    3    1    1    1
#[4,]    2    3    1    1    1
#[5,]    2    3    1    1    1
#[6,]    2    3    1    1    1

Or a faster option would be to use rep

b + rep(a, each = nrow(b))

Or use sweep

sweep(b, 2, a, "+")

Benchmarks

set.seed(24)
b <- matrix(sample(0:9, 8000*5000, replace=TRUE), ncol=5000)
a <- sample(0:3, 5000, replace=TRUE)
system.time(b + a[col(b)])
#  user  system elapsed 
#  1.08    0.06    1.14 
system.time(b + rep(a, each = nrow(b)))
#   user  system elapsed 
#   0.83    0.03    0.86 

system.time(t(a+t(b)))
#   user  system elapsed 
#   1.14    0.03    1.17 

system.time(sweep(b, 2, a, "+"))
#  user  system elapsed 
#  0.62    0.06    0.69 

Upvotes: 18

Related Questions