Reputation: 179
I've seen a few other posts on this topic, and none seemed to be quite the same as the problem I'm having. But here goes:
I'm running a function in parallel using
library(doParallel)  # also attaches foreach, iterators, and parallel

cores <- detectCores()
cl <- makeCluster(8L, outfile = "output.txt")
registerDoParallel(cl)
x <- foreach(i = 1:length(y), .combine = 'list', .packages = c('httr', 'jsonlite'),
             .multicombine = TRUE, .verbose = FALSE, .inorder = FALSE) %dopar% {
  my_function(y[i])  # my_function stands in for the actual worker function
}
This often works fine, but is now throwing the error:
Error in serialize(data, node$con) : error writing to connection
Upon examination of the output.txt file I see:
starting worker pid=11112 on localhost:11828 at 12:38:32.867
starting worker pid=10468 on localhost:11828 at 12:38:33.389
starting worker pid=4996 on localhost:11828 at 12:38:33.912
starting worker pid=3300 on localhost:11828 at 12:38:34.422
starting worker pid=10808 on localhost:11828 at 12:38:34.937
starting worker pid=5840 on localhost:11828 at 12:38:35.435
starting worker pid=8764 on localhost:11828 at 12:38:35.940
starting worker pid=7384 on localhost:11828 at 12:38:36.448
Error in unserialize(node$con) : embedded nul in string: '\0\0\0\006SYMBOL\0\004\0\t\0\0\0\003')'\0\004\0\t\0\0\0\004expr\0\004\0\t\0\0\0\004expr\0\004\0\t\0\0\0\003','\0\004\0\t\0\0\0\024SYMBOL_FUN'
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted
This error is intermittent. Memory is plentiful (32GB), and no other large R objects are in memory. The function in the parallel code retrieves a number of small JSON data objects from the cloud and puts them into an R object, so there are no large data files. I don't know why it occasionally sees an embedded nul and stops.
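For context, here's a simplified sketch of what each worker does (get_json and the URL handling are illustrative, not my exact code):

```r
library(httr)
library(jsonlite)

# Illustrative only: fetch one small JSON object and parse it into an R object
get_json <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}
```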
I have a similar problem with a function that pulls csv files from the cloud as well. Both functions worked fine under R 3.3.0 and R 3.4.0 until now.
I'm using R 3.4.1 and RStudio 1.0.143 on Windows.
Here's my sessionInfo
sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] RJSONIO_1.3-0 RcppBDT_0.2.3 zoo_1.8-0 data.table_1.10.4 doParallel_1.0.10 iterators_1.0.8
[7] RQuantLib_0.4.2 foreach_1.4.3 httr_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 lattice_0.20-35 codetools_0.2-15 grid_3.4.1 R6_2.2.2 jsonlite_1.5 tools_3.4.1
[8] compiler_3.4.1
UPDATE
Now I get another similar error:
Error in unserialize(node$con) : ReadItem: unknown type 100, perhaps written by later version of R
The embedded-nul error seems to have vanished. I've also tried deleting .Rhistory and .RData, and deleting my packages subfolder and reinstalling all packages. At least this new error is consistent. I can't find what "unknown type 100" refers to.
Upvotes: 6
Views: 8196
Reputation: 71
I also noticed that the worker R sessions don't go away in Task Manager.
Switching from stopCluster(cl) to stopImplicitCluster() worked for me. From my reading, stopImplicitCluster() is meant to be used with the "one-line" form
registerDoParallel(cores = x)
rather than
cl <- makeCluster(x)
registerDoParallel(cl)
My gut feeling is that the way Windows handles the clusters requires stopImplicitCluster(), but your mileage may vary.
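To illustrate, a minimal sketch of the implicit-backend pattern I mean (the doubling task is just a stand-in):

```r
library(doParallel)

# Implicit backend: no cluster object to manage by hand
registerDoParallel(cores = 4)
res <- foreach(i = 1:8, .combine = c) %dopar% { i * 2 }
stopImplicitCluster()  # shuts down the implicit workers
```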
I would have commented but this is (cue band) MY FIRST STACKOVERFLOW POST!!!
Upvotes: 7
Reputation: 447
I get a similar error... it usually happens on a subsequent script run after a previous script errored out or I stopped it early. That may line up with the part where you mention "I don't know why it occasionally sees an embedded nul and stops".
This answer has some good info, especially the advice to leave one core free for regular Windows processes. It also mentions "If you get an error from either of those functions, it usually means that at least one of the workers has died", which could back up my theory about crashing after an error:
doParallel error in R: Error in serialize(data, node$con) : error writing to connection
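The leave-one-core-free advice can be sketched like this (detectCores() and makeCluster() come from the parallel package, which doParallel loads):

```r
library(doParallel)

n_workers <- max(1L, detectCores() - 1L)  # keep one core for the OS
cl <- makeCluster(n_workers)
registerDoParallel(cl)
# ... foreach(...) %dopar% { ... } work goes here ...
stopCluster(cl)
```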
So far, my solution has been to re-initialize the parallel backend by running this again:
registerDoParallel(cl)
It usually works afterwards, but I notice that the previous worker sessions in Task Manager do not go away, even after:
stopCluster(cl)
This is why I sometimes restart R.
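One way to avoid orphaned workers after a crash (a sketch of a general R idiom, not something I've tested against this exact error) is to guarantee the cluster is torn down even when the loop fails:

```r
library(doParallel)

run_parallel <- function(n_workers) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)  # runs even if the foreach loop errors out
  registerDoParallel(cl)
  foreach(i = 1:10, .combine = c) %dopar% { sqrt(i) }
}
```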
Upvotes: 6