Reputation: 2280
I would like to run a linear regression on panel data. Below is my code so far, but I don't understand why it is not returning the fit and rsq. Any suggestions?
Sample code:
for (i in names(df))
{
  if (is.numeric(df[3, i])) ## if row 3 is numeric, the entire column is
  {
    fit <- lm(df[3, i] ~ Gender, data = df) # does a regression for each column in my csv file against my independent variable 'Gender'
    rsq <- summary(fit)$r.squared
  }
}
Data structure
Sample data:
df<-structure(list(id = c(1, 1, 2, 2, 2), id1 = c(1, 2, 1, 2, 3),
a1 = c(5, 8, 7, 6, 3), a2 = c(1, 4, 3, 10, 5), a3 = c(2,
34, 3, 12, 6), a4 = c(9, 2, 3, 12, 7), a5 = c(0, 0, 0, 7,
8), a6 = c(7, 7, 0, 0, 9), a7 = c(5, 8, 7, 6, 0), a8 = c(1,
4, 3, 10, 3), a9 = c(2, 34, 3, 12, 3), a10 = c(9, 2, 3, 12,
3), Gender = c(1, 2, 1, 1, 2)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), spec = structure(list(
cols = list(id = structure(list(), class = c("collector_double",
"collector")), id1 = structure(list(), class = c("collector_double",
"collector")), a1 = structure(list(), class = c("collector_double",
"collector")), a2 = structure(list(), class = c("collector_double",
"collector")), a3 = structure(list(), class = c("collector_double",
"collector")), a4 = structure(list(), class = c("collector_double",
"collector")), a5 = structure(list(), class = c("collector_double",
"collector")), a6 = structure(list(), class = c("collector_double",
"collector")), a7 = structure(list(), class = c("collector_double",
"collector")), a8 = structure(list(), class = c("collector_double",
"collector")), a9 = structure(list(), class = c("collector_double",
"collector")), a10 = structure(list(), class = c("collector_double",
"collector")), Gender = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
Upvotes: 0
Views: 110
Reputation: 388807
To fit a linear regression on each column, you can use lapply. We use reformulate to create a formula object from each column name and pass it to lm inside lapply. The R-squared value can then be extracted from the summary of each model.
cols <- grep('a\\d+', names(df), value = TRUE)
cols
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8" "a9" "a10"
# Fit one model per response column: a1 ~ Gender, a2 ~ Gender, ...
fit <- lapply(cols, function(x) {
  lm(reformulate('Gender', response = x), data = df)
})
r.squared <- sapply(fit, function(x) summary(x)$r.squared)
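As a possible variation on the approach above, the list of models can be named by response column so that each R-squared value stays labelled. This is only a sketch: the small data frame below is a stand-in that reuses the a1, a2 and Gender values from the question, and the anchored pattern "^a\\d+$" and the use of setNames and vapply are choices made here for illustration, not part of the original answer.

```r
# Stand-in for the question's data (plain data.frame, values copied from a1, a2, Gender)
df <- data.frame(
  a1 = c(5, 8, 7, 6, 3),
  a2 = c(1, 4, 3, 10, 5),
  Gender = c(1, 2, 1, 1, 2)
)

# Anchored pattern (an assumption): match only names that are exactly "a" plus digits
cols <- grep("^a\\d+$", names(df), value = TRUE)

# Name the input vector so the resulting list of models is named by column
fit <- lapply(setNames(cols, cols), function(x) {
  lm(reformulate("Gender", response = x), data = df)
})

# vapply checks that each model yields exactly one numeric value
r.squared <- vapply(fit, function(m) summary(m)$r.squared, numeric(1))
r.squared
```

With names attached, a single model can be pulled out as fit[["a1"]] and its R-squared as r.squared[["a1"]]. For the loop in the question, a column-wise check such as is.numeric(df[[i]]) is probably closer to the intent than is.numeric(df[3, i]): on a tibble, df[3, i] is a one-cell tibble rather than a number, so is.numeric returns FALSE and the body of the if never runs.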
Upvotes: 1