anrisakaki96

Reputation: 313

How to run the same regression on several dataframes in R?

I have 3 dataframes (df1, df2, df3) with the same variable names, and I would like to perform essentially the same regressions on all 3 dataframes. My regressions currently look like this:

m1 <- lm(y ~ x1 + x2, df1)

m2 <- lm(y ~ x1 + x2, df2)

m3 <- lm(y ~ x1 + x2, df3)

Is there a way to use a for-loop to run these regressions by just swapping out the dataframe used?

Thank you

Upvotes: 1

Views: 28

Answers (2)

jay.sf

Reputation: 72893

Using update, which refits an existing model with a changed argument (here, data):

(fit <- lm(Y ~ X1 + X2 + X3, df1))
# Call:
# lm(formula = Y ~ X1 + X2 + X3, data = df1)
# 
# Coefficients:
# (Intercept)           X1           X2           X3  
#      0.9416      -0.2400       0.6481       0.9357  

update(fit, data=df2)
# Call:
# lm(formula = Y ~ X1 + X2 + X3, data = df2)
# 
# Coefficients:
#   (Intercept)           X1           X2           X3  
#        0.6948       0.3199       0.6255       0.9588  
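Since the question asks about a for-loop, the update idea can be sketched as a loop over a named list of data frames. This is only a minimal sketch on toy data: make_df, dfs, base_fit and fits are names made up for the illustration, not part of the original code.

```r
# Toy data for illustration only (hypothetical, not the asker's data)
set.seed(1)
make_df <- function() {
  d <- data.frame(X1 = rnorm(30), X2 = rnorm(30), X3 = rnorm(30))
  d$Y <- 1 + 0.2 * d$X1 + 0.4 * d$X2 + 0.8 * d$X3 + rnorm(30)
  d
}
dfs <- list(df1 = make_df(), df2 = make_df(), df3 = make_df())

# Fit the model once on the first data frame
base_fit <- lm(Y ~ X1 + X2 + X3, data = dfs$df1)

# Refit the same model on each remaining data frame, swapping only `data`
fits <- list(df1 = base_fit)
for (nm in names(dfs)[-1]) {
  fits[[nm]] <- update(base_fit, data = dfs[[nm]])
}

# One fitted model per data frame, accessible by name
names(fits)
```

Keeping the models in a list (rather than m1, m2, m3) means later steps such as summary or coef can be applied to all of them with lapply.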

Or lapply over all objects in the workspace whose names match df followed by a digit:

lapply(mget(ls(pattern='^df\\d$')), lm, formula=Y ~ X1 + X2 + X3)
# $df1
# 
# Call:
#   FUN(formula = ..1, data = X[[i]])
# 
# Coefficients:
#   (Intercept)           X1           X2           X3  
#        0.9416      -0.2400       0.6481       0.9357  
# 
# 
# $df2
# 
# Call:
#   FUN(formula = ..1, data = X[[i]])
# 
# Coefficients:
#   (Intercept)           X1           X2           X3  
#        0.6948       0.3199       0.6255       0.9588  
# 
# 
# $df3
# 
# Call:
#   FUN(formula = ..1, data = X[[i]])
# 
# Coefficients:
#   (Intercept)           X1           X2           X3  
#        0.5720       0.6106      -0.1576       1.1391  
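Note that mget(ls(pattern=...)) picks up every workspace object whose name matches the pattern, and the printed Call shows FUN(...) rather than lm(...). A possible variant, sketched here on toy data, builds the list explicitly and wraps lm in an anonymous function, then collects the coefficients into one matrix. The names make_df, dfs, fits and coef_table are assumptions for this example only.

```r
# Toy data for illustration only (hypothetical, not the data used above)
set.seed(1)
make_df <- function() {
  d <- data.frame(X1 = rnorm(30), X2 = rnorm(30), X3 = rnorm(30))
  d$Y <- 1 + 0.2 * d$X1 + 0.4 * d$X2 + 0.8 * d$X3 + rnorm(30)
  d
}

# Explicit named list, so only the intended data frames are used
dfs <- list(df1 = make_df(), df2 = make_df(), df3 = make_df())

# Same formula for every data frame; the list names carry over to the result
fits <- lapply(dfs, function(d) lm(Y ~ X1 + X2 + X3, data = d))

# One row per data frame, one column per coefficient
coef_table <- t(sapply(fits, coef))
coef_table
```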

Data:

set.seed(42)
f <- \() transform(data.frame(X1=rnorm(10), X2=rnorm(10), X3=rnorm(10)), 
                   Y=1 + .2*X1 + .4*X2 + .8*X3 + rnorm(10))
set.seed(42); df1 <- f(); df2 <- f(); df3 <- f()

Upvotes: 2

mkpt_uk

Reputation: 256

Alternatively, add the dataframes to a list and map the lm function over the list.

library(tidyverse)

df1 <- tibble(x = 1:20, y = 3*x + rnorm(20, sd = 5))
df2 <- tibble(x = 1:20, y = 3*x + rnorm(20, sd = 5))
df3 <- tibble(x = 1:20, y = 3*x + rnorm(20, sd = 5))

df_list <- list(df1, df2, df3)

m <- map(df_list, ~lm(y ~ x, data = .))

Upvotes: 2
