AntonyDW
AntonyDW

Reputation: 349

How to speed up R code to move matrix elements in a specific direction - Rccp?

I have developed code that moves elements around a large matrix to simulate hydrology conditions. This part of the code is a function that is called several times for an hour's simulation, and as I am trying to model 30 years worth of data ends up being time expensive.

Within the original matrix called flowingrunoff (2710 rows by 7153 columns) some of the cells move East, some move SouthEast, some move South etc etc (i.e. to one of the 8 compass points). This has been determined earlier and saved in 8 matrices of the same size consisting of 1 if the cell moves in that direction and 0 if not (these are called fdrecells, fdrsecells, fdrscells etc etc). Also created earlier is a matrix consisting of all 0s that is two rows and two columns larger than the other matrices (called emptymovematrix).

Now to the function that moves the cells that I have called route(). Here is the code, and it employs set.submatrix to move matrices into a specific part of emptymovematrix as follows:

route <- function() {    

flowingrunoff[is.na(flowingrunoff)] <- 0
flowingbaseflow[is.na(flowingbaseflow)] <- 0

fdrerunoff <- fdrecells*flowingrunoff
movefdrerunoff <- set.submatrix(emptymovematrix,fdrerunoff,2,3)

fdrserunoff <- fdrsecells*flowingrunoff
movefdrserunoff <- set.submatrix(emptymovematrix,fdrserunoff,3,3)

fdrsrunoff <- fdrscells*flowingrunoff
movefdrsrunoff <- set.submatrix(emptymovematrix,fdrsrunoff,3,2)

fdrswrunoff <- fdrswcells*flowingrunoff
movefdrswrunoff <- set.submatrix(emptymovematrix,fdrswrunoff,3,1)

fdrwrunoff <- fdrwcells*flowingrunoff
movefdrwrunoff <- set.submatrix(emptymovematrix,fdrwrunoff,2,1)

fdrnwrunoff <- fdrnwcells*flowingrunoff
movefdrnwrunoff <- set.submatrix(emptymovematrix,fdrnwrunoff,1,1)

fdrnrunoff <- fdrncells*flowingrunoff
movefdrnrunoff <- set.submatrix(emptymovematrix,fdrnrunoff,1,2)

fdrnerunoff <- fdrnecells*flowingrunoff
movefdrnerunoff <- set.submatrix(emptymovematrix,fdrnerunoff,1,3)

Therefore 8 new matrices have been created. These can be combined now to find the sum of the flowing runoff after the cells have been moved around and then trimmed down to the size of the original matrix.

movedrunoff <- Reduce(function(x,y){x + y}, list(movefdrerunoff,movefdrserunoff,movefdrsrunoff,movefdrswrunoff,movefdrwrunoff,movefdrnwrunoff,movefdrnrunoff,movefdrnerunoff))

movedrunoff <- movedrunoff[2:(nrow(movedrunoff)-1),2:(ncol(movedrunoff)-1)]

}

This works and is significantly (3 to 4 times) faster than my original method which employed rbind and cbind like so:

fdrerunoff <- fdrecells*flowingrunoff
movefdrerunoff <- cbind(0,0,fdrerunoff)
movefdrerunoff <- rbind(0,movefdrerunoff,0)

However, although this code (together with some similar code that moves around baseflow) only takes between 6 and 10 seconds per cycle, there are between 5 and 10 cycles per hour simulated - you can see that this adds up significantly in time. The whole code takes approximately 70 seconds per hour which over 30 years would take approx 212 days to complete. For info the route() function is called from another function called routing() like so:

routing <- function () {

final <- replicate(replication,route())

}

So, any ideas how to improve this code or the strategy. I have tried shift.right (in Matrix), I have tried converting to raster and shifting the extent, I have tried rbind and cbind but all are much slower than the above method. I believe the real key to speeding this up would be employing calling code in C++ possibly using Rccp, but would not know where to start. So there are any slicker R techniques to move matrices around, or if you know how to code this in C++ then please let me know.

I'm still surprised that multiplying matrices is quite quick in R, but moving or copying them or using anything like matrix[] is relatively slow. Note that I am using Revolutionary R with the Intel Math Kernel Library installed.

Many thanks

Antony

Upvotes: 0

Views: 165

Answers (1)

Shape
Shape

Reputation: 2952

I would recommend an adjacency matrix / transition matrix approach. That way you get the power out of %*%

create two transition matrixes transition1 <- diag(7153) transition2 <- diag(2710)

the first governs transitions in one axis, the second in the other axis

then edit the rows/columns of the transitions you want to make then for each transition your_matrix %*% transition1 and transition2 %*% your_matrix should give you the moved values.

EDIT: Order is important if you do it this way, if you want to do both at the same time, then you would need to grab reshape2, melt your original data.frame into long format, and make a square matrix of (2710*7153) which might be too big for your needs. But if you're modeling flow, you may be able to get away with this sort of two-step approximation

EDIT2: You can get around the order actually if you just do transition2 %*% your_matrix + your_matrix %*% transition1 as long as you make sure that you don't lose any flow

EDIT3: So, here's a small 6x8 example (I assume you have already broken it up into your 8 directions, I'm assuming you're not allowing for fluid to remain where it is)

basemat <- matrix(rep(c(1,1,0,0,0,0,0,0),6), nrow = 6)

transition_east <- diag(8)
transition_east <- transition_east[c(2:8,1),]

So the transition matrix here is set up to do something very simple, it's moving all columns one to the east. (it wraps at the edges)

Then try: basemat

basemat %*% transition_east

basemat %*% transition_east %*% transition_east

etc...

Upvotes: 2

Related Questions