Hasse1987
Hasse1987

Reputation: 333

why is filling a row of a matrix much slower than filling vector of the same size

Why is assigning to a row/column of a matrix much slower than filling vector of the same size?

N <- 1e8
u <- numeric(N)
system.time(u <- 1:N)
##user  system elapsed 
##0.091   0.100   0.191 
u <- matrix(nrow=2,ncol=N)
system.time(u[1,] <- 1:N)
##user  system elapsed 
##2.772   0.436   3.212 

My main concern is with arrays, but I imagine the answer for matrices will explain arrays as well.

N <- 1e3
uu <- matrix(N,N)
system.time({uu <- 1:(N*N)})
##user  system elapsed 
##0.068   0.120   0.188 
uu <- array(dim=c(2,N,N))
system.time({uu[1,,] <- 1:(N*N)})
##user  system elapsed 
##4.409   0.241   4.657 

(I could not tell if assigning to an element of a large pre-allocated vector is slower than assigning to a variable, because the former always takes 0 time for any vector my system is capable of allocating.)

Upvotes: 1

Views: 52

Answers (1)

Zheyuan Li
Zheyuan Li

Reputation: 73385

Because you are filling in a row not a column. You get more cache misses, as well as address arithmetic overhead (locating base address for each column).

In R, a matrix, or a general array is essentially a long vector; dimension is just an attribute. The fastest index is the rightmost one, so elements are contiguous when you scan the matrix by its column, or an array by its last margin. This means, on a typical machine with 64 bytes L1 cache line size which can hold 8 double precision numeric, you have 1 cache miss per scanning 8 elements. But if you access a matrix of two rows by row for example, you get 1 cache miss per 4 elements.

Another issue, giving you an unfair comparison, is you are comparing overwriting with assignment, not overwriting with overwriting. Using u[] <- 1:N will force an overwriting to the preallocated vector u.

Consider the following benchmarking:

library(microbenchmark)
n <- 10000
u <- 1:n
m1 <- matrix(0, n, 2)
m2 <- matrix(0, 2, n)
x <- numeric(n)
microbenchmark (m1[,1] <- u, m2[1,] <- u, x[] <- u)

Upvotes: 2

Related Questions