Unai Vicente

Reputation: 379

for inside foreach parallel not populating a dataframe in R

I am having trouble populating a data frame from inside a foreach loop. Suppose I have the following data frame, whose structure is exactly what my real one looks like:

Elec2 <- rep(rep(rep(27:1, each = 81), each = 18), times = 100)
Ind <- rep(1:18, times = 218700)
Cond <- rep(1:3, times = 1312200)
Trial <- rep(rep(1:100, each = 2187), each = 18)
DVAR <- rbeta(3936600, 0.7, 1.5)

data <- data.frame(DVAR, Ind, Cond, Trial, Elec1, Elec2)

I am trying to parallelise the analysis with the following code:

distinct_pairs <- 
  data %>% 
  select(Elec1, Elec2) %>%
  distinct()

  cl <- makeCluster(2) #values here are adjusted to cores, used 2 for the example
  registerDoParallel(cl)

output <- foreach (i = 1:nrow(distinct_pairs), .packages='glmmTMB', 
                    .combine = rbind,
                    .errorhandling="pass",
                    .verbose = T) %dopar% {
  dep <- distinct_pairs[i,]
  dat1 <- subset(data, dep$Elec1 == data$Elec1 & dep$Elec2 == data$Elec2)
  
  df[i,]$Elec1 <- dep[i,]$Elec1
  df[i,]$Elec2 <- dep[i,]$Elec2

  for (j in 1:18) { #By individual
    
    dat2 <- subset(dat1, dat1$Ind==j)
    model <- glmmTMB(DVAR ~ Cond, family=beta_family('logit'), data=dat2)
    results <- summary(model)
    
    est <-  results$coefficients$cond[2,1]
    ste <-  results$coefficients$cond[2,2]
    df[j,] <- c(est,ste)
    
  }
  return(df)
  }
  
  output <- as.data.frame(output, row.names = FALSE)

As you can see, I am expecting a data frame with the results of the iterations (est and ste) plus the identification of the electrodes (Elec1 and Elec2). If I run the lines one by one, everything seems to work, but I cannot make the whole loop work the way I expect.

The outer loop takes one pair of electrodes per iteration. Every row in distinct_pairs is a pair of electrodes, numbered from 1 to 27 for both Elec1 and Elec2.

The problem is that I cannot get the results of the inner for loop written into the final output data frame.

I am sure the problem is pretty basic, but I appreciate any insight as I seem to be missing something.

Thanks!

[[UPDATE WITH SOLUTION]]

In case anyone is interested, here is the solution.

output <- foreach (i = 1:10, .packages='glmmTMB', 
         .combine = rbind,
         .errorhandling="pass",
         .inorder = TRUE,
         .verbose = T) %dopar% {
           
  dat1 <- subset(data, distinct_pairs[i,]$Elec1 == data$Elec1 & distinct_pairs[i,]$Elec2 == data$Elec2)
  df <- data.frame('Elec1'=rep(distinct_pairs[i,]$Elec1,18),'Elec2'=rep(distinct_pairs[i,]$Elec2,18),'est'=rep(NA,18),'ste'=rep(NA,18))
  
  for (j in 1:18) {
    
    dat2 <- subset(dat1, dat1$Ind==j)
    model <- glmmTMB(DVAR ~ Cond, family=beta_family('logit'), data=dat2)
    results <- summary(model)
    
    est <-  results$coefficients$cond[2,1]
    ste <-  results$coefficients$cond[2,2]
    df[j,c('est','ste')] <- c(est,ste)
    
   }
  return(df)
  }

Which returns exactly what I was looking for:

> head(output)
  Elec1 Elec2          est        ste
1     1     1  0.034798615 0.03530296
2     1     1 -0.005363760 0.03392442
3     1     1 -0.017349123 0.03404430
4     1     1 -0.034819068 0.03196078
5     1     1  0.002301062 0.03163825
6     1     1  0.003575131 0.03452420

Upvotes: 0

Views: 81

Answers (1)

chrizzle

Reputation: 36

I am definitely not sure I understood the problem. Could you also provide an Elec1 in your data example?

An idea: foreach might not find df. You could create the data frame at the beginning of your loop with something like

df <- data.frame('Elec1'=rep(NA,18),'Elec2'=rep(NA,18),'est'=rep(NA,18),'ste'=rep(NA,18))

and then, further down in the for loop, fill it with: df[j,c('est','ste')] <- c(est,ste)
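
A minimal sketch of that idea, in case it helps. Everything here is an assumption for illustration: `toy` and `pairs` are hypothetical stand-ins for your data and distinct_pairs, `lm()` is only a lightweight placeholder for the glmmTMB beta regression, and the extra `Ind` column is my own addition. It uses `%do%` so it runs without a cluster; with a registered backend you would swap in `%dopar%`.

```r
library(foreach)

# Small toy data set (hypothetical; far smaller than the real one).
set.seed(1)
toy <- expand.grid(Elec1 = 1:2, Elec2 = 1:2, Ind = 1:3, Cond = 1:3, Trial = 1:5)
toy$DVAR <- rbeta(nrow(toy), 0.7, 1.5)

# One row per electrode pair.
pairs <- unique(toy[, c("Elec1", "Elec2")])

# %do% runs sequentially; use %dopar% once a parallel backend is registered.
output <- foreach(i = seq_len(nrow(pairs)), .combine = rbind) %do% {
  pair <- pairs[i, ]
  dat1 <- toy[toy$Elec1 == pair$Elec1 & toy$Elec2 == pair$Elec2, ]

  # Create the result frame inside the body, so each iteration owns its copy.
  inds <- sort(unique(dat1$Ind))
  df <- data.frame(Elec1 = pair$Elec1, Elec2 = pair$Elec2, Ind = inds,
                   est = NA_real_, ste = NA_real_)

  for (j in seq_along(inds)) {
    dat2 <- dat1[dat1$Ind == inds[j], ]
    # lm() is a placeholder for the real model; row 2 is the Cond slope.
    coefs <- summary(lm(DVAR ~ Cond, data = dat2))$coefficients
    df[j, c("est", "ste")] <- coefs[2, 1:2]
  }

  df  # the value of the last expression is what .combine = rbind receives
}
```

The point is only that df is built and returned within each iteration, so nothing depends on an object that exists outside the foreach body.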

Upvotes: 1
