Reputation: 75
Given the data frame:
test <- structure(list(IDcount = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), year = c(1,
2, 3, 4, 5, 1, 2, 3, 4, 5), Otminus1 = c(-0.28, -0.28, -0.44,
-0.27, 0.23, -0.03, -0.06, -0.04, 0, 0.02), N.1 = c(NA, -0.1,
0.01, 0.1, -0.04, -0.04, -0.04, -0.04, -0.05, -0.05), N.2 = c(NA,
NA, -0.09, 0.11, 0.06, NA, -0.08, -0.08, -0.09, -0.09), N.3 = c(NA,
NA, NA, 0.01, 0.07, NA, NA, -0.12, -0.13, -0.13), N.4 = c(NA,
NA, NA, NA, -0.04, NA, NA, NA, -0.05, -0.05), N.5 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, -0.13)), row.names = c(NA, -10L), groups = structure(list(
IDcount = c(1, 2), .rows = structure(list(1:5, 6:10), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
and a results data frame:
results <- structure(list(IDcount = c(1, 2), N.1 = c(NA, NA), N.2 = c(NA,
NA), N.3 = c(NA, NA), N.4 = c(NA, NA), N.5 = c(NA, NA)), row.names = c(NA,
-2L), class = "data.frame")
I would like to perform a nested for loop like the following:
index <- colnames(test) %>% str_which("N.")
betas <- matrix(nrow=length(unique(test$IDcount)), ncol=2)
colnames(betas) <- c("Intercept", "beta")
for (j in colnames(test)[index]) {
for (i in 1:2) {
betas[i,] <- coef(lm(Otminus1~., test[test$IDcount==i, c("Otminus1", j)]))
}
betas <- data.frame(betas)
results[[j]] <- betas$beta
}
The loop is supposed to run the regression for each column and each ID and write the slope coefficients into the data frame "results". This works as long as each ID has at least one non-NA value in each column. Unfortunately, in my data frame "test", ID 1 has no values in column "N.5". Because all values for this ID are NA, lm() fails and the loop stops.
I would now like to adapt my loop so that an iteration is only performed if there is at least one non-NA value for the given ID in the given column.
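For reference, the failure described above can be reproduced in isolation. This is only a minimal sketch with made-up values: the names `toy` and `err` are illustrative and not part of the data above.

```r
# Toy data: the predictor column is entirely NA, as N.5 is for ID 1
toy <- data.frame(Otminus1 = c(-0.28, -0.44, 0.23),
                  N.5      = c(NA_real_, NA_real_, NA_real_))

# lm() drops every row with an NA, leaving nothing to fit, and raises an error.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(
  lm(Otminus1 ~ ., toy),
  error = function(e) conditionMessage(e)
)

print(err)
```

If `err` is a character string rather than a fitted model, the regression could not be run for that ID and column, which is the case the loop has to skip.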
Following the explanation in "R for loop skip to next iteration ifelse", I tried to implement the following:
for (j in colnames(test)[index]) {
for (i in 1:2) {
if(sum(is.na(test[which(test[,1]==i),.]))==length(unique(test$year))) next
betas[i,] <- coef(lm(Otminus1~., test[test$IDcount==i, c("Otminus1", j)]))
}
betas <- data.frame(betas)
results[[j]] <- betas$beta
}
But this doesn't work.
I would like to receive a data frame "results" looking something like this:
IDcount  N.1   N.2   N.3   N.4   N.5
1        0.1   0.2   0.5   0.3   NA
2       -5.3  -0.8  -0.4  -0.1  -0.1
Any help would be greatly appreciated!!
Upvotes: 1
Views: 587
Reputation: 389175
You can use colSums() to check whether either column has no non-NA values for the current ID, and skip the iteration if so:
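library(dplyr)    # provides %>%
library(stringr)  # provides str_which()
# Positions of the predictor columns N.1 ... N.5 ("\\." matches a literal dot)
index <- colnames(test) %>% str_which("^N\\.")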
betas <- matrix(nrow=length(unique(test$IDcount)), ncol=2)
colnames(betas) <- c("Intercept", "beta")
for (j in colnames(test)[index]) {
for (i in 1:2) {
tmp <- test[test$IDcount==i, c("Otminus1", j)]
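# Skip this ID if a column has no data; reset the row so it does not keep the previous column's coefficients
if(any(colSums(!is.na(tmp)) == 0)) { betas[i,] <- NA; next }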
betas[i,] <- coef(lm(Otminus1~., tmp))
}
betas <- data.frame(betas)
results[[j]] <- betas$beta
}
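To see what the check does on its own, here is a minimal sketch with a made-up subset for a single ID. The names `toy`, `non_na` and `skip` are illustrative only and do not appear in the question.

```r
# Toy subset for one ID: the predictor column is entirely NA
toy <- data.frame(Otminus1 = c(-0.28, -0.44, 0.23),
                  N.5      = c(NA_real_, NA_real_, NA_real_))

# !is.na(toy) is a logical matrix; colSums() counts the non-NA values per column
non_na <- colSums(!is.na(toy))

# TRUE if at least one column has no usable values, i.e. the iteration should be skipped
skip <- any(non_na == 0)

print(non_na)
print(skip)
```

Here `non_na` is 3 for Otminus1 and 0 for N.5, so `skip` is TRUE and the loop would move on to the next ID without calling lm().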
Upvotes: 2