Reputation: 631
Suppose I have a dataset like (forget about the distribution):
modData <- data.frame("A" = rnorm(20, 15, 3),
"B" = rnorm(20, 20, 3),
"C" = rnorm(20, 25, 3),
"X" = rnorm(20, 5, 1)
)
If I use X
as a predictor, A
, B
and C
as responses, respectively:
md1 <- lm(A ~ X, data = modData)
md2 <- lm(B ~ X, data = modData)
md3 <- lm(C ~ X, data = modData)
Then do a Shapiro test and a boxcox test to every model, e.g. :
shapiro.test(residuals(md1))
boxcox(md1, plotit = T)
Is there a convenient way to build and test multiple models without manually typing each of them?
Upvotes: 2
Views: 1372
Reputation: 2859
data("mtcars")
formulas <- list(
mpg ~ disp,
mpg ~ disp + wt
)
res <- vector("list", length = length(formulas))
for(i in seq_along(formulas)){
res[[i]] <- lm(formulas[[i]], data = mtcars)}
res
lapply(formulas, lm, data = mtcars)
Upvotes: 1
Reputation: 70633
In case you wouldn't want to introduce dozens of dependencies, you can do it with a simple sapply
. Notice that I don't provide a boxcox
part because I don't know where it comes from (car, MASS?).
modData <- data.frame("A" = rnorm(20, 15, 3),
"B" = rnorm(20, 20, 3),
"C" = rnorm(20, 25, 3),
"X" = rnorm(20, 5, 1))
deps <- c("A", "B", "C")
indeps <- c("X")
result <- sapply(deps, FUN = function(x, indeps, mydata) {
myformula <- formula(sprintf("%s ~ %s", x, indeps))
model <- lm(myformula, data = mydata)
out.shapiro <- shapiro.test(residuals(model))
return(list(model = model, shapiro = out.shapiro))
}, indeps = indeps, mydata = modData, simplify = FALSE)
Upvotes: 2
Reputation: 16121
Here's an alternative approach using tidyverse
:
modData <- data.frame("A" = rnorm(20, 15, 3),
"B" = rnorm(20, 20, 3),
"C" = rnorm(20, 25, 3),
"X" = rnorm(20, 5, 1))
library(tidyverse)
library(broom)
# specify predictor and target variables
x = "X"
y = names(modData)[names(modData)!= x]
expand.grid(y,x) %>% # create combinations
mutate(model_id = row_number(), # create model id
frml = paste0(Var1, "~", Var2)) %>% # create model formula
group_by(model_id, Var1, Var2) %>% # group by the above
nest() %>% # nest data
mutate(m = map(data, ~lm(.$frml, data = modData)), # create models
m_table = map(m, ~tidy(.)), # tidy model output
st = map(m, ~shapiro.test(residuals(.)))) -> dt_model_info # shapiro test
# access model info
dt_model_info
dt_model_info$m
dt_model_info$m_table
dt_model_info$st
# another way to access info
dt_model_info %>% unnest(m_table)
Upvotes: 5
Reputation: 3043
Here is an approach by using simple "lapply":
# 1. Data set
df <- data.frame(
a = rnorm(20, 15, 3),
b = rnorm(20, 20, 3),
c = rnorm(20, 25, 3),
x = rnorm(20, 5, 1))
# 2. Models
fit_lm_a <- lm(a ~ x, df)
fit_lm_b <- lm(b ~ x, df)
fit_lm_c <- lm(c ~ x, df)
# 3. List of models
list_fit_lm <- list(fit_lm_a, fit_lm_b,fit_lm_c)
# 3. Shapiro test
lapply(
list_fit_lm, function(x) {
shapiro.test(residuals(x))
})
# 4. Box-Cox Transformations
lapply(
list_fit_lm, function(x) {
boxcox(x, plotit = TRUE, data = df)
}
)
Upvotes: 3