Reputation: 44648
library(doParallel)
library(RMySQL)
no_cores <- as.integer(system('getconf _NPROCESSORS_ONLN', intern = TRUE)) - 1
cluster <- makeCluster(no_cores)
registerDoParallel(cluster)
clusterEvalQ(cluster, {
  mysql <- RMySQL::dbConnect(...)
})
r <- foreach(i = 1:50, .verbose = TRUE) %dopar% { dbGetQuery(mysql, 'show tables;')}
no variables are automatically exported
There's no error, no complaint. Nothing, it just freezes. I can start and use a cluster without database connections.
Thoughts?
Upvotes: 1
Views: 587
Reputation: 19677
When does it hang? When calling clusterEvalQ or in the foreach loop?
I have a few suggestions:
- Use outfile="" when creating the cluster to get debug output;
- Load RMySQL when initializing the cluster;
- Return NULL from clusterEvalQ to avoid serializing connection objects;
- Call registerDoParallel so the tasks aren't executed locally.
Here's a test that uses these suggestions:
library(doParallel)
cl <- makePSOCKcluster(3, outfile="")
registerDoParallel(cl)
clusterEvalQ(cl, {
  library(RMySQL)
  mysql <- dbConnect(MySQL(), user='root',
                     password='notmypasswd', dbname='mysql')
  NULL
})
r <- foreach(i=1:50, .verbose=TRUE) %dopar% {
  dbGetQuery(mysql, 'show tables;')
}
This test works for me. When I run it, I see messages like:
no variables are automatically exported
numValues: 50, numResults: 0, stopped: TRUE
got results for task 1
numValues: 50, numResults: 1, stopped: TRUE
returning status FALSE
got results for task 2
If you only see:
no variables are automatically exported
and then it hangs, then the workers are presumably hanging trying to perform the query using the database connection. That sounds like a MySQL problem to me, but I'm not a MySQL expert.
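To check the cluster mechanics separately from MySQL, here is a minimal sketch of the same pattern that needs no database. It uses only the base parallel package, with parLapply standing in for foreach/%dopar%, and fake_conn is a hypothetical placeholder for the real connection object. If this runs but the database version hangs, that would point at the connection or query rather than at the cluster setup.

```r
library(parallel)

# Two local PSOCK workers; no database is needed for this sketch.
cl <- makePSOCKcluster(2)

# Create a per-worker object in each worker's global environment.
# `fake_conn` is a hypothetical stand-in for a real connection object.
setup <- clusterEvalQ(cl, {
  fake_conn <- list(pid = Sys.getpid())
  NULL  # return NULL so the object itself is never sent back to the master
})

# Later tasks look the object up on the worker instead of receiving it
# from the master.
pids <- parLapply(cl, 1:4, function(i) {
  get("fake_conn", envir = .GlobalEnv)$pid
})

stopCluster(cl)
```

Each element of setup is NULL, so nothing worker-specific is serialized back, while pids holds the process IDs of the workers that ran each task.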
Upvotes: 5