Connor Uhl

Reputation: 75

Skip iteration and return NA in nested for loop in R

Given the data frame:

test <- structure(list(IDcount = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), year = c(1, 
2, 3, 4, 5, 1, 2, 3, 4, 5), Otminus1 = c(-0.28, -0.28, -0.44, 
-0.27, 0.23, -0.03, -0.06, -0.04, 0, 0.02), N.1 = c(NA, -0.1, 
0.01, 0.1, -0.04, -0.04, -0.04, -0.04, -0.05, -0.05), N.2 = c(NA, 
NA, -0.09, 0.11, 0.06, NA, -0.08, -0.08, -0.09, -0.09), N.3 = c(NA, 
NA, NA, 0.01, 0.07, NA, NA, -0.12, -0.13, -0.13), N.4 = c(NA, 
NA, NA, NA, -0.04, NA, NA, NA, -0.05, -0.05), N.5 = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, -0.13)), row.names = c(NA, -10L), groups = structure(list(
    IDcount = c(1, 2), .rows = structure(list(1:5, 6:10), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

and a results data frame:

results <- structure(list(IDcount = c(1, 2), N.1 = c(NA, NA), N.2 = c(NA, 
NA), N.3 = c(NA, NA), N.4 = c(NA, NA), N.5 = c(NA, NA)), row.names = c(NA, 
-2L), class = "data.frame")

I would like to perform a nested for loop like the following:

index <- colnames(test) %>% str_which("N.")

betas <- matrix(nrow=length(unique(test$IDcount)), ncol=2)
colnames(betas) <- c("Intercept", "beta")

for (j in colnames(test)[index]) {
  
  for (i in 1:2) {
    
    betas[i,] <- coef(lm(Otminus1~., test[test$IDcount==i, c("Otminus1", j)]))
  }
  
  betas <- data.frame(betas)
  
  results[[j]] <- betas$beta
}

The loop is supposed to run the regression for each "N." column and each ID and write the slope coefficients into the data frame "results". This works as long as each ID has at least one non-NA value in each column. Unfortunately, in my data frame "test" the column "N.5" contains only NA values for ID 1. lm() therefore fails for that ID, and the loop stops with an error.
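
A minimal sketch of that failure, using a made-up toy data frame `d` (not the original `test`) in which the predictor is entirely NA. It only relies on base R, and the error is caught so the snippet runs to the end:

```r
# Toy data (hypothetical values): the predictor x is NA in every row
d <- data.frame(y = c(1, 2, 3), x = c(NA_real_, NA_real_, NA_real_))

# lm() drops incomplete rows before fitting (default na.action is na.omit),
# which leaves no rows at all here, so the call signals an error
fit <- tryCatch(lm(y ~ ., d), error = function(e) e)

# TRUE if the fit failed with an error instead of returning a model
failed <- inherits(fit, "error")
print(failed)
```

Inside a for loop, an uncaught error like this aborts the whole loop, which is why the remaining IDs and columns never get processed.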

I now would like to adapt my loop so that iterations are only performed if there is at least one non-NA value for a certain ID in the specific column.

Following the explanation in "R for loop skip to next iteration ifelse", I tried to implement the following:

for (j in colnames(test)[index]) {
  
  for (i in 1:2) {
    
    if(sum(is.na(test[which(test[,1]==i),.]))==length(unique(test$year))) next
    
    betas[i,] <- coef(lm(Otminus1~., test[test$IDcount==i, c("Otminus1", j)]))
  }
  
  betas <- data.frame(betas)
  
  results[[j]] <- betas$beta
}

But this doesn't work.

I would like to receive a data frame "results" looking something like this:

IDcount   N.1    N.2    N.3    N.4    N.5
 1        0.1    0.2    0.5    0.3    NA
 2       -5.3   -0.8   -0.4   -0.1   -0.1

Any help would be greatly appreciated!!

Upvotes: 1

Views: 587

Answers (1)

Ronak Shah

Reputation: 389175

You can use colSums to check whether either column of the subset has no non-NA values, and skip that iteration with next:

index <- colnames(test) %>% str_which("N.")

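for (j in colnames(test)[index]) {
  # Re-create betas as an all-NA matrix for every column; otherwise a skipped
  # ID would keep the coefficients from the previous column
  betas <- matrix(NA_real_, nrow=length(unique(test$IDcount)), ncol=2)
  colnames(betas) <- c("Intercept", "beta")

  for (i in 1:2) {
    tmp <- test[test$IDcount==i, c("Otminus1", j)]
    # Skip this ID if either column has no non-NA value
    if(any(colSums(!is.na(tmp)) == 0)) next
    betas[i,] <- coef(lm(Otminus1~., tmp))
  }
  results[[j]] <- betas[, "beta"]
}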
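
To see what the check evaluates to, here is a small sketch on two made-up subsets. The data frames `all_na` and `some_na` and their values are hypothetical; only the column names follow `test`:

```r
# Toy subsets (hypothetical values, column names as in `test`)
all_na  <- data.frame(Otminus1 = c(-0.28, -0.44), N.5 = c(NA_real_, NA_real_))
some_na <- data.frame(Otminus1 = c(-0.03, 0.02),  N.5 = c(NA, -0.13))

# colSums(!is.na(x)) counts the non-NA values in each column;
# a count of 0 in any column means lm() has nothing to fit
skip_all  <- any(colSums(!is.na(all_na)) == 0)
skip_some <- any(colSums(!is.na(some_na)) == 0)

print(c(skip_all = skip_all, skip_some = skip_some))
```

The check only guards against columns with no usable values at all. With a single complete row, lm() still runs, but it returns NA for the slope because one observation cannot determine it.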

Upvotes: 2
