igm13

Reputation: 21

How do I apply an lm function over a list of dataframes referencing the columns by index?

I have 73 equally sized data frames, each with 5 columns and 120 rows, stored in a list. I want to fit an lm model to each of them using lapply, regressing the first column on the next 3 adjacent columns of each data frame in the list. I am having trouble figuring out how to reference columns by index inside the lapply function.

I can't use names because the column names differ across the data frames. This is what I tried:


my_lms_ESW <- lapply(listDF_ESW, function(x) {
  lm(x[, 1] ~ x[, 2] + x[, 3] + x[, 4], x)
})

Upvotes: 2

Views: 111

Answers (2)

lroha

Reputation: 34291

When lm() is passed a data frame in place of a formula, it builds the formula via DF2formula(), which uses the first column as the response and all remaining columns as predictors. So in this case you can drop the fifth column and simply do:

lapply(listDF_ESW, \(x) lm(x[-5]))
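
To check this on something small, here is a minimal sketch with made-up data. The names make_df, toy_list and fits are hypothetical stand-ins (toy_list plays the role of listDF_ESW), and the random values are only there so the code runs on its own.

```r
# Toy stand-in for listDF_ESW: three data frames, each 120 x 5,
# with column names that differ from one data frame to the next.
set.seed(1)
make_df <- function(prefix) {
  df <- as.data.frame(matrix(rnorm(120 * 5), nrow = 120, ncol = 5))
  names(df) <- paste0(prefix, "_", 1:5)
  df
}
toy_list <- lapply(c("a", "b", "c"), make_df)

# Drop column 5, then let lm() derive the formula from the data frame:
# column 1 becomes the response, columns 2 to 4 the predictors.
fits <- lapply(toy_list, function(x) lm(x[-5]))

# Each fit has an intercept plus one slope per predictor column.
names(coef(fits[[1]]))
```

Because the formula is built from the actual column names, the coefficients keep readable labels rather than x[, 2], x[, 3] and so on.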

Upvotes: 2

PGSA

Reputation: 3071

In your function(x), build the formula from the column names:

lm(
    as.formula(paste(colnames(x)[1], "~",
        paste(colnames(x)[c(2, 3, 4)], collapse = "+"),
        sep = ""
    )),
    data=x
)

This assembles a formula by pasting the column names together with ~ and +; as.formula() then converts the resulting string into a formula object.
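
A possible variation, not part of the answer above, is to let reformulate() from the stats package do the pasting. The sketch below assumes the column names are syntactically valid R names; toy_df, toy_list and fit_by_index are hypothetical names used only for illustration.

```r
# Toy data: two data frames with the same shape but different column names.
set.seed(42)
toy_df <- data.frame(
  y_col = rnorm(120), p = rnorm(120), q = rnorm(120),
  r = rnorm(120), unused = rnorm(120)
)
toy_list <- list(
  first  = toy_df,
  second = setNames(toy_df, c("resp", "u", "v", "w", "skip"))
)

# Build "col1 ~ col2 + col3 + col4" from column positions, then fit.
fit_by_index <- function(x) {
  f <- reformulate(names(x)[2:4], response = names(x)[1])
  lm(f, data = x)
}
fits <- lapply(toy_list, fit_by_index)

# Inspect the formula that was generated for the second data frame.
formula(fits$second)
```

Either way, the formula refers to real column names, so the fitted models can be used with predict() on new data that has the same columns.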

Upvotes: 2
