Reputation: 319
I was using foreach to do parallel computation in R:
library(doParallel)  # also loads foreach

no_cores <- detectCores()
registerDoParallel(no_cores)
temp <- foreach(i = 320:530,
                .combine = rbind) %dopar% {
  track(data = data[i, ], para = currenttime)
}
but I realised that some CPU cores were not being utilised at all, let alone fully used.
Is there some setting I missed? Is there anything I can do to improve the CPU usage rate and speed up the run?
Upvotes: 1
Views: 1874
Reputation: 19667
Some thoughts on this:
You may only have 4 physical cores but 8 logical cores because hyperthreading is enabled on your computer. Your problem may only be able to make good use of 4 cores. If so, you might be getting worse performance by starting 8 workers. In that case, it may be better to use:
no_cores <- detectCores(logical=FALSE)
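For example, assuming the doParallel backend from your question, you would then register only the physical cores:
no_cores <- detectCores(logical = FALSE)  # count physical cores, not hyperthreads
registerDoParallel(no_cores)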
track may not be very compute intensive, possibly due to excessive I/O or memory operations, causing it to not use much CPU time.
If track is CPU intensive but doesn't take much time to execute (less than a millisecond, for example), the master process may become a bottleneck, especially if track returns a lot of data.
Possible solutions:
Verify that your computer has enough memory to support the workers that you start by using your computer's process monitoring tools. If necessary, reduce the number of workers to stay within your resources.
You might get better results by using chunking techniques so there is only one task per worker. This makes the workers more efficient and reduces the post-processing done by the master (see the sketch after this list).
Try experimenting with foreach options such as .maxcombine. Setting it to be greater than the number of tasks may help.
Combining the results by row isn't as efficient as combining by column, but this may not be a problem if you're chunking.
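Here is a minimal sketch of the chunking idea, assuming the doParallel backend and using the itertools package to hand each worker one block of indices. track, data and currenttime are taken from your question, so treat this as an outline rather than a drop-in replacement:

library(doParallel)
library(itertools)

no_cores <- detectCores(logical = FALSE)
registerDoParallel(no_cores)

# One task per worker: each worker receives a block of row indices, loops
# over them locally and returns a single matrix, so the master only has to
# rbind no_cores results instead of 211.
temp <- foreach(idx = isplitVector(320:530, chunks = no_cores),
                .combine = rbind) %dopar% {
  do.call(rbind, lapply(idx, function(i) track(data = data[i, ], para = currenttime)))
}

Without chunking, you can get some of the same benefit by raising .maxcombine to at least the number of tasks (for example foreach(i = 320:530, .combine = rbind, .maxcombine = 250)), so rbind is called once on all of the results rather than in batches of 100, which is the default limit if I remember correctly.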
Upvotes: 2