Reputation: 519
I would like to write a function that uses a for loop to create multiple datasets. These datasets should be combined into a single dataset, which will be the output of my function.
I wrote the following code. It works when the for loop runs outside a function, but not when the loop is inside a function. The problem is that my function only returns the first dataset (i = 1).
library(broom)
library(dplyr)
# My function
validation <- function(x, y) {
  df <- NULL
  for (i in 1:ncol(x)) {
    coln <- colnames(x)[i]
    covariate <- as.vector(x[,i])
    models <- (tidy(glm(y ~ covariate, data = x, family = binomial)))
    df <- (rbind(df, cbind(models, coln))) %>% filter( term != "(Intercept)")
    return(df)
  }
}
# Test function
validation(mtcars, mtcars$am)
term estimate std.error statistic p.value coln
covariate 0.3070282 0.1148416 2.673493 0.007506579 mpg
This function should give me the following output:
term estimate std.error statistic p.value coln
1 covariate 0.307028190 1.148416e-01 2.6734932353 0.007506579 mpg
2 covariate -0.691175096 2.536145e-01 -2.7252982408 0.006424343 cyl
3 covariate -0.014604292 5.167837e-03 -2.8259972293 0.004713367 disp
4 covariate -0.008117121 6.074337e-03 -1.3362973916 0.181452089 hp
5 covariate 5.577358500 2.062575e+00 2.7040753425 0.006849476 drat
6 covariate -4.023969940 1.436416e+00 -2.8013963535 0.005088198 wt
7 covariate -0.288189820 2.278968e-01 -1.2645629995 0.206028024 qsec
8 covariate 0.693147181 7.319250e-01 0.9470194188 0.343628884 vs
9 covariate 51.132135568 7.774641e+04 0.0006576784 0.999475249 am
10 covariate 21.006490452 3.876257e+03 0.0054192724 0.995676067 gear
11 covariate 0.073173343 2.254018e-01 0.3246350695 0.745457282 carb
Upvotes: 2
Views: 894
Reputation: 887048
If we move return(df) from inside the loop to after it, the function works. A return() inside the loop exits the function at the end of the first iteration, so 'df' only ever contains the result for the first column.
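The effect of where return() sits can be seen in a stripped-down sketch. The two function names below (first_only, all_values) are made up for illustration and are not part of the question's code; they only accumulate integers instead of model output.

```r
# Hypothetical toy functions, for illustration only.

# return() inside the loop: the function exits during the first iteration
first_only <- function(n) {
  out <- NULL
  for (i in seq_len(n)) {
    out <- c(out, i)
    return(out)
  }
}

# result returned after the loop: every iteration contributes
all_values <- function(n) {
  out <- NULL
  for (i in seq_len(n)) {
    out <- c(out, i)
  }
  out
}

first_only(3)
all_values(3)
```

Here first_only(3) gives back only the value from the first iteration, while all_values(3) gives back all three. Applied to the function from the question, the same change looks like this: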
validation <- function(x, y) {
  df <- NULL
  for (i in 1:ncol(x)) {
    coln <- colnames(x)[i]
    covariate <- as.vector(x[,i])
    models <- (tidy(glm(y ~ covariate, data = x, family = binomial)))
    df <- (rbind(df, cbind(models, coln))) %>% filter( term != "(Intercept)")
    # optional: print statements to trace what happens in each iteration
    print(sprintf("column index : %d", i))
    print('-----------------')
    print('df in each loop')
    print(df)
    print(sprintf("%dth loop ends", i))
  }
  # return the accumulated result only after the loop has finished
  df
}
Checking:
validation(mtcars, mtcars$am)
# term estimate std.error statistic p.value coln
#1 covariate 0.307028190 1.148416e-01 2.6734932353 0.007506579 mpg
#2 covariate -0.691175096 2.536145e-01 -2.7252982408 0.006424343 cyl
#3 covariate -0.014604292 5.167837e-03 -2.8259972293 0.004713367 disp
#4 covariate -0.008117121 6.074337e-03 -1.3362973916 0.181452089 hp
#5 covariate 5.577358500 2.062575e+00 2.7040753425 0.006849476 drat
#6 covariate -4.023969940 1.436416e+00 -2.8013963535 0.005088198 wt
#7 covariate -0.288189820 2.278968e-01 -1.2645629995 0.206028024 qsec
#8 covariate 0.693147181 7.319250e-01 0.9470194188 0.343628884 vs
#9 covariate 51.132135568 7.774641e+04 0.0006576784 0.999475249 am
#10 covariate 21.006490452 3.876257e+03 0.0054192724 0.995676067 gear
#11 covariate 0.073173343 2.254018e-01 0.3246350695 0.745457282 carb
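As a possible alternative, the results can be collected in a list and bound together once at the end, which avoids growing 'df' inside the loop. This is only a sketch: the name validation_list is hypothetical, and it assumes the same broom setup as the question.

```r
library(broom)

# Hypothetical variant: one logistic regression per column, bound once at the end
validation_list <- function(x, y) {
  fits <- lapply(colnames(x), function(coln) {
    covariate <- x[[coln]]
    model <- tidy(glm(y ~ covariate, family = binomial))
    # drop the intercept row, keep the covariate row
    model <- model[model$term != "(Intercept)", ]
    cbind(model, coln = coln)
  })
  do.call(rbind, fits)
}

res <- validation_list(mtcars, mtcars$am)
res
```

As with the loop version, glm() may warn about fitted probabilities of 0 or 1 for columns such as am and gear, but the function should still return one row per column.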
Upvotes: 2