Reputation: 1519
I am trying to speed up a Monte Carlo simulation of a discrete time-inhomogeneous Markov chain using data.table or some form of parallelisation. Using random dummy transition matrices TM, I am simulating nSteps time steps in each of N simulations and starting from a an initial state vector initialState record the next updated state in currentState. At each time step, I matrix multiply the current state with the transition matrix TM.
Code 1 with loop
nStates <- 5 #number of states
initialState <- c(rep(1/nStates, nStates)) #vector with uniform initial states
nSteps <- 10 #number of time steps
N <- 10000 #number of simulations
ind.arr <- matrix(1:(N*nSteps),ncol=nSteps, byrow=TRUE)
currentState <- vector("list",(N*(nSteps))) #collects the nSteps state vectors for each simulation
system.time(
for (i in 1:N) {
TM <- matrix(runif(nStates^2), ncol=nStates) #random transition matrix for each time step and each simulation
currentState[[(ind.arr[i,1])]] <- initialState %*% (TM / rowSums(TM)) #/rowSums(TM) ensures that TM is a transition matrix
for (t in 2:nSteps){
TM <- matrix(runif(nStates^2), ncol=nStates)
currentState[[(ind.arr[i,t])]] <- currentState[[(ind.arr[i,t-1])]] %*% (TM / rowSums(TM))
}
})
The code is not super slow, but I am wondering whether avoiding the N-loop can accelerate the code. If I put the body of the N-loop in a function
statefun <- function(initialState, nSteps, nStates){
TM <- matrix(runif(nStates^2), ncol=nStates) #random transition matrix for each time step and each simulation
currentState <- matrix(rep(NA, nSteps*nStates), ncol=nStates)
currentState[1,] <- initialState %*% (TM / rowSums(TM)) #/rowSums(TM) ensures that TM is a transition matrix
for (t in 2:nSteps){
TM <- matrix(runif(nStates^2), ncol=nStates)
currentState[t,] <- currentState[t-1,] %*% (TM / rowSums(TM))
}
return(currentState)
}
and use data.table, I get an error and not the desired result
library(data.table)
system.time(dt <- data.table(i=1:N)[, c("s1", "s2", "s3", "s4", "s5") := list(statefun(initialState, nSteps, nStates)), by=i])
#As each simulation run is independent and the call of statefun is expensive, I was hoping that parallelisation helps to accelerate the code, but trying foreach is actually slower than where I started.
library(foreach)
system.time(res <- foreach(i=1:N, .combine='c') %do% statefun(initialState, nSteps, nStates))
I appreciate any comment on how to make data.table work or using parallelisation in this case. Many thanks, Tim
@ EDIT:This one doesn't pick up the ten row output of the function call...
system.time( #does not work
dt <- data.table(i=1:N)[,c("s1", "s2", "s3", "s4", "s5"):=as.list(statefun(initialState, nSteps, nStates)),by=i]
)
Upvotes: 1
Views: 512
Reputation: 19677
If you convert the outer for loop into a foreach loop with 10,000 tasks, the performance isn't good because the tasks are too small. It's often better to make the number of tasks equal to the number of workers. Here's a simple way to do that using the idiv
function from the iterators
package:
library(doParallel)
nw <- 4
cl <- makePSOCKcluster(nw)
registerDoParallel(cl)
nStates <- 5
initialState <- c(rep(1/nStates, nStates))
nSteps <- 10
N <- 10000
currentState <- foreach(n=idiv(N, chunks=nw), .combine='c') %dopar% {
ind.arr <- matrix(1:(n * nSteps), ncol=nSteps, byrow=TRUE)
cur <- vector("list", n * nSteps)
for (i in 1:n) {
TM <- matrix(runif(nStates^2), ncol=nStates)
cur[[ind.arr[i,1]]] <- initialState %*% (TM / rowSums(TM))
for (t in 2:nSteps) {
TM <- matrix(runif(nStates^2), ncol=nStates)
cur[[(ind.arr[i,t])]] <-
cur[[(ind.arr[i,t-1])]] %*% (TM / rowSums(TM))
}
}
cur
}
Instead of simply parallelizing the outer for loop, this adds a foreach loop around a smaller version of the sequential code. So if you figure out a way to improve the sequential code you can easily use that in the parallel version. You may also get better performance by not returning all of the intermediate states.
Upvotes: 2
Reputation: 3278
There is an example in this thread that could suit your needs. You would need to use replicate, from the lapply function in base.
replicate(N, statefun(initialState, nSteps, nStates))
Upvotes: 0