Reputation: 490
I have a project where I will have a large matrix where I randomly select cells to modify based on the surrounding cells. I'm trying to figure out how to optimize this as much as possible before moving to RCpp (or in addition to). There's other code I have for dealing with the edges of the matrix and whatnot, but this minimal example gives you the idea of what I'm going for. In this example, I randomly pick either the original content of the cell or one of the neighbors to take on the value of the original cell. I've come up with three ways of doing this - pick_random1
works with apply
, pick_random2
works with mapply
, and pick_random3
works with a for
loop. To get the *apply
functions to work, I need to use a global variable, which I know is not ideal. Surprisingly, those also turn out to be the slowest option.
pick_random1 <- function(row_column){
row <- row_column[1]
column <- row_column[2]
neighbors <- my_mat[(row-1):(row+1), (column-1):(column+1)]
my_mat[row,column] <<- sample(neighbors, 1)
}
pick_random2 <- function(row, column){
neighbors <- my_mat[(row-1):(row+1), (column-1):(column+1)]
my_mat[row,column] <<- sample(neighbors, 1)
}
pick_random3 <- function(row, column, l_matrix){
neighbors <- l_matrix[(row-1):(row+1), (column-1):(column+1)]
sample(neighbors, 1)
}
my_mat <- matrix(1:25, nrow=5)
my_mat
rows <- c(2,3,4)
cols <- c(2,3,4)
benchmark(apply(cbind(rows, cols), 1, pick_random1), replications=100000)$elapsed
#[1] 5.346
benchmark(mapply(pick_random2, rows, cols), replications=100000)$elapsed
#[1] 3.551
benchmark(
for(i in 1:length(rows)){
my_mat[rows[i],cols[i]] <- pick_random3(rows[i], cols[i], l_matrix=my_mat)
}, replications=100000)$elapsed
#[1] 2.419
Any suggestions on what I can do to speed this up further or how to do this without using a for
loop? I should add that it is important to be able to update the matrix sequentially rather than all at once.
Upvotes: 1
Views: 428
Reputation: 132706
I assume you want to pick the neighbors from the initial state and not iteratively (because then you couldn't avoid a loop).
my_mat <- matrix(1:25, nrow=5)
rows <- c(2,3,4)
cols <- c(2,3,4)
set.seed(42)
#find neighbor
neighbor <- arrayInd(sample(1:9, length(rows), replace = TRUE), c(3L,3L)) - 2
# [,1] [,2]
#[1,] 1 1
#[2,] 1 1
#[3,] 1 -1
#replace
my_mat[cbind(rows, cols)] <- my_mat[cbind(rows, cols) + neighbor]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 6 11 16 21
#[2,] 2 13 12 17 22
#[3,] 3 8 19 18 23
#[4,] 4 9 14 15 24
#[5,] 5 10 15 20 25
If you need this iteratively, you need a loop for, and only for, the last step:
for (i in seq_along(rows))
my_mat[rows[i], cols[i]] <- my_mat[rows[i] + neighbor[i, 1],
cols[i] + neighbor[i, 2]]
Upvotes: 1