Reputation: 59
I have some R code that can be reduced to the simplified version shown below.
cl <- parallel::makeCluster(2, type="SOCK")
b<-data.frame(c(1,1,2,2,3,3,4,4,7,7,9,9,11,11,12,12,13,13,14,14))
colnames(b)<-c("col1")
b_uni<-unique(b)
clusterExport(cl,"b_uni")
bbb <- parallel::parLapply(cl,1:nrow(b_uni), fun=function(i,b) {
e<-b[b$col2==b_uni[i,1],]
a<-e+10
return(a)
}b=b)
c <- na.omit(do.call(rbind, bbb))
To minimize the number of loop iterations, I am running only the unique values in b. But the variables bbb and c are not getting populated.
Upvotes: 1
Views: 278
Reputation: 6813
You haven't passed the object b to your parLapply() call. With lapply you can access objects in the global environment; with parLapply() you have to pass them to the workers. So if you change your code to this:
bbb <- parallel::parLapply(cl,1:nrow(b_uni), fun=function(i,b) {
e<-b[b$col2==b_uni[i,1],]
a<-e+10
return(a)
}, b = b)
it will work.
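To illustrate the difference, here is a minimal sketch (the variable offset and the result names are made up for this example, not taken from the question). It shows the two usual ways of getting an object to the workers: passing it as an extra argument, or copying it over with clusterExport().

```r
library(parallel)

cl <- makeCluster(2)  # each worker is a separate, fresh R session

offset <- 10  # hypothetical object that exists only in the master session

# lapply runs in the current session, so offset is found by normal scoping
local_res <- lapply(1:3, function(i) i + offset)

# Option 1: pass the object to the worker function as an extra named argument
arg_res <- parLapply(cl, 1:3, function(i, offset) i + offset, offset = offset)

# Option 2: copy the object into each worker's global environment first
clusterExport(cl, "offset", envir = environment())
exp_res <- parLapply(cl, 1:3, function(i) i + offset)

stopCluster(cl)
```

All three calls should give the same list. Passing the object as an argument keeps the worker function self-contained, while clusterExport() is handy when many calls reuse the same object.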
EDIT:
The reason bbb is empty is that b does not have a column called col2.
bbb <- parallel::parLapply(cl,1:nrow(b_uni), fun=function(i,b) {
e<-b[b$col1==b_uni[i,1],]
a<-e+10
return(a)
}, b = b)
If you change it to col1, it will return a list of vectors of length 2:
lengths(bbb)
[1] 2 2 2 2 2 2 2 2 2 2
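For completeness, a self-contained sketch of the whole workflow might look like the following. A few things here are my own choices rather than part of the original code: b_uni is passed as an argument instead of being exported, drop = FALSE is added so each piece stays a one-column data frame, the combined result is named res to avoid masking the base function c, and the cluster is stopped at the end.

```r
library(parallel)

cl <- makeCluster(2)

b <- data.frame(col1 = c(1,1,2,2,3,3,4,4,7,7,9,9,11,11,12,12,13,13,14,14))
b_uni <- unique(b)

# Pass both data frames as arguments so the workers need no exported globals
bbb <- parLapply(cl, seq_len(nrow(b_uni)), function(i, b, b_uni) {
  # drop = FALSE keeps the one-column data frame from collapsing to a vector
  e <- b[b$col1 == b_uni[i, 1], , drop = FALSE]
  e + 10
}, b = b, b_uni = b_uni)

stopCluster(cl)

# Stack the per-value pieces back into a single data frame
res <- do.call(rbind, bbb)
```

With this variant, bbb holds one small data frame per unique value, and res has one row for every row of b.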
Upvotes: 3