Reputation: 79
I have a function with two nested loops, and the result is two lists of data.
The structure is:
function() {
  for (i in 1:50) {
    for (j in 1:100) {
      "Do something"
      "get results a and b"
      a list
      b list
    }
    "use the series of a and b to calculate two parameters, A and B"
    "put A and B into their lists"
    list A = append(list A, A)
    'or' list B = cbind(list B, B)  # I don't know which one is better
  }
  plot the figure using list A and list B
  "saving the results"
  dataframe = df(list A, list B)
  dataframe to csv
}
The code needs to run the simulation 5000 times, and each step takes at least 1 minute. I tried lapply, but it only works well with one loop. When I use it, the results are not consistent and the plot does not work, i.e. I cannot get the results. I also find that some parallel code does not work on Windows and some does not work on Mac, and I am confused by this.
Each step in the loop is independent, so one alternative I thought of is to divide the jobs and run them simultaneously, but I need the results in a consistent order (the order they should be in).
The way I save the results looks like a mess. For example, what I want is:
A B
0 0
0.1 1
1.2 4
3 9
6 12
... ...
but what I got is:
V1
0 0 0.1 1 1.2 4 3 9 6 12 ... ...
I don't know how to save two columns of data from parallel code.
Upvotes: 1
Views: 435
Reputation: 2956
I like using the foreach package for tasks like this (check the documentation). foreach works like a for loop, but it runs on a cluster: each iteration is executed separately and the results are combined afterwards. I made a small example with the structure you are using. You can modify it for your task.
library(foreach)
library(doParallel)

# number of worker processes in the cluster; I chose 4
cl <- makeCluster(4)
registerDoParallel(cl)

# replace z = 1:10 with your own range; .combine declares how to combine the results afterwards,
# .inorder makes sure the values come back in iteration order (TRUE is the default)
df <- foreach(z = 1:10, .combine = rbind, .inorder = TRUE) %dopar% {
  list_a <- numeric(50)
  list_b <- numeric(50)
  for (i in 1:50) {
    for (j in 1:100) {
      # some placeholder task standing in for yours
      a <- i
      b <- 50 - i
    }
    # store the values for this i
    list_a[i] <- a
    list_b[i] <- b
  }
  # the last expression is the value foreach collects for this iteration,
  # so make sure it is the two-column data frame
  data.frame(A = list_a, B = list_b)
}

# with .combine = rbind, df is already a single data frame with columns A and B
View(df)
stopCluster(cl)
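To map this back onto the structure in the question, a minimal sketch might parallelise the outer loop and return one (A, B) row per iteration. The inner computation and the mean summaries are placeholders I made up, and the output file name is hypothetical. The default cluster type created by makeCluster is a socket cluster, which should work on both Windows and Mac.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)  # default socket (PSOCK) cluster
registerDoParallel(cl)

# Parallelise the outer loop: each i yields one row with columns A and B.
results <- foreach(i = 1:50, .combine = rbind, .inorder = TRUE) %dopar% {
  a_series <- numeric(100)
  b_series <- numeric(100)
  for (j in 1:100) {
    # placeholder for "do something" and "get results a and b"
    a_series[j] <- i * j
    b_series[j] <- i + j
  }
  # placeholder summaries; replace with your own formulas for A and B
  data.frame(A = mean(a_series), B = mean(b_series))
}

stopCluster(cl)

# Save in the main process, after all workers have finished.
# The file name is hypothetical; tempdir() is used only to keep the sketch self-contained.
out_file <- file.path(tempdir(), "results.csv")
write.csv(results, out_file, row.names = FALSE)
```

Because results is an ordinary data frame at this point, plotting (for example plot(results$A, results$B)) can also be done in the main process rather than inside the parallel loop, and the CSV gets the two columns A and B instead of a single V1 column.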
Upvotes: 2