Carrol

Reputation: 1285

R: doParallel foreach with several data frames outputs

I have a function that needs to manipulate three data frames, all with different structure:

To try out parallel processing, I set up a minimal example (following this question and this blog) in which I only generate b:

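library(doParallel)  # also attaches foreach and parallel

# Set up the parallel backend
cl <- makeCluster(3L)
registerDoParallel(cl)

b <- foreach(i = seq_len(nrow(f)), .combine = rbind) %dopar% {
  tempB <- do_something_function()

  tempB
}

# Release the workers when done
stopCluster(cl)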

That example works perfectly, but I'm missing two data frames. I found other answers, but I do believe my case is different:

I could turn a into a data frame of the rows to be removed later, but I need each tempA to be combined only with the other tempA results, if that makes any sense. The answers to the questions I linked mix all of the outputs together.

Upvotes: 2

Views: 3529

Answers (2)

F. Privé

Reputation: 11738

It seems that your problem has nothing to do with parallelism, but rather with combining the results.

Here is an example of how I would do it (which I think is the most efficient way):

library(foreach)
tmp <- foreach(i = seq_len(32)) %do% {
  list(iris[i, ], mtcars[i, ], iris[i, ])
}

lapply(purrr::transpose(tmp), function(l) do.call(rbind, l))
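For readers without purrr, the same transpose-then-rbind step can be sketched in base R. This is only an illustration under a few assumptions: lapply stands in for foreach so that the snippet runs without extra packages, only 3 iterations are used for brevity, and the name combined is a placeholder.

```r
# Each "iteration" returns a list of three single-row data frames,
# mirroring the body of the foreach loop above (3 iterations for brevity).
tmp <- lapply(seq_len(3), function(i) {
  list(iris[i, ], mtcars[i, ], iris[i, ])
})

# Map() walks the per-iteration lists in parallel, so rbind receives
# all first elements together, then all second elements, and so on.
combined <- do.call(Map, c(list(f = rbind), tmp))

length(combined)     # one data frame per output slot
dim(combined[[2]])   # rows gathered from mtcars
```

The result has the same shape as the purrr version: a list with one combined data frame per position in the returned list.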

Upvotes: 2

Carrol

Reputation: 1285

This is the solution I have found so far. Instead of removing rows from a, I build a data frame containing the rows that will be deleted. I wrote a combine function:

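# Row-bind the k-th element of every result list with the k-th
# elements of all the other result lists
combine <- function(x, ...) {
  mapply(rbind, x, ..., SIMPLIFY = FALSE)
}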
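To see what this function does outside of foreach, here is a small sketch with toy data. The helper make_result and its columns are hypothetical stand-ins for the real per-iteration output. The second call illustrates that, as far as I understand foreach's accumulation, the already combined result can be passed back in as the first argument.

```r
# Same combine function as in the answer
combine <- function(x, ...) {
  mapply(rbind, x, ..., SIMPLIFY = FALSE)
}

# Hypothetical per-iteration result: three data frames with different columns
make_result <- function(i) {
  list(
    data.frame(id = i),
    data.frame(name = letters[i], value = i * 10),
    data.frame(flag = i %% 2 == 0)
  )
}

# Several result lists combined in one call, as with .multicombine = TRUE
out <- combine(make_result(1), make_result(2), make_result(3))

# The accumulated list has the same three-slot shape as a single result,
# so it can be combined again with further results
out <- combine(out, make_result(4))

sapply(out, nrow)   # row count of each combined data frame
```

Each slot is only ever bound to the same slot of the other results, so the three outputs never get mixed.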

And my loop is something like this:

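library(doParallel)  # also attaches foreach and parallel

# Set up the parallel backend
cl <- makeCluster(3L)
registerDoParallel(cl)

# Loop: each iteration returns a list of three data frames
output <- foreach(i = seq_len(nrow(f)), .combine = combine, .multicombine = TRUE) %dopar% {
  tempA <- get_this_value()
  tempB <- do_something_function()
  tempC <- get_this_other_frame()

  # Return the values
  list(tempA, tempB, tempC)
}

# Release the workers when done
stopCluster(cl)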

Then I access the data using output[[1]] and so on. With this solution, however, I still have to run a setdiff or anti_join after the loop to remove the "undesired" rows from a.
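That final clean-up step could look roughly like the base-R sketch below. The data frame contents and the key column id are assumptions made for illustration, and to_remove stands in for output[[1]].

```r
# Hypothetical original data frame; `id` is an assumed key column
a <- data.frame(id = 1:6, value = c(2.5, 3.1, 4.7, 1.2, 9.9, 6.0))

# Stand-in for output[[1]]: the rows collected for removal during the loop
to_remove <- a[c(2, 5), ]

# Base-R anti-join: keep the rows of `a` whose key is not in `to_remove`
a_clean <- a[!(a$id %in% to_remove$id), , drop = FALSE]

a_clean$id   # keys of the rows that remain
```

With dplyr loaded, dplyr::anti_join(a, to_remove, by = "id") should give the same rows.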

Upvotes: 0
