Reputation: 409
Hi all,
I am trying to fit linear models for several variables and report all the R-squared values.
Is there a way of doing this in one go rather than pair by pair?
For example, I know how to do it with 2 variables:
data(mtcars)
fit<-lm(formula = mtcars[,1] ~ mtcars[,2])
summary(fit)$r.squared
mtcars has 11 numeric variables. Is there a way of doing this for all of them? Since there are 11 variables, I want to record the R-squared value for every pair, as an 11 by 11 symmetric matrix with 0s on the diagonal.
Upvotes: 1
Views: 1892
Reputation: 93851
Because these are single-variable regression models, the r-squared is just the square of the correlation coefficient between each pair of variables, so you can do this:
rsq = cor(mtcars)^2
diag(rsq) = 0 # To get zeros on the diagonals
Here are the first 3 rows and columns:
> rsq[1:3, 1:3]
           mpg       cyl      disp
mpg  0.0000000 0.7261800 0.7183433
cyl  0.7261800 0.0000000 0.8136633
disp 0.7183433 0.8136633 0.0000000
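As a quick sanity check of the claim that the r-squared of a single-predictor model equals the squared correlation, here is a small sketch comparing the two for one pair (mpg and cyl). The names r2_cor and r2_lm are just illustrative.

```r
# Squared Pearson correlation between mpg and cyl (built-in mtcars data)
r2_cor <- cor(mtcars$mpg, mtcars$cyl)^2

# R-squared from the corresponding single-predictor linear model
r2_lm <- summary(lm(mpg ~ cyl, data = mtcars))$r.squared

# all.equal() tolerates tiny floating-point differences between the two
isTRUE(all.equal(r2_cor, r2_lm))
```

Both values should agree with the 0.7261800 entry in the matrix above, up to rounding.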
By the way, you might find the corrplot package useful for visualizing the r-squared values. The package is really intended for correlations, rather than the squares of the correlations, but it's an easy way to quickly get an idea of which pairs of variables have the strongest relationships. You can use a more general heatmap as well, but corrplot provides some more focused tools for correlations.
library(corrplot)
corrplot.mixed(cor(mtcars)^2)
# Or, to sort the column order by clustering
corrplot.mixed(cor(mtcars)^2, order="hclust")
See the vignette for more info.
Upvotes: 2
Reputation: 887501
You can use outer:
# outer() hands FUN whole vectors of names, so loop over the pasted formulas
res1 <- outer(colnames(mtcars), colnames(mtcars), FUN = function(x, y) {
  sapply(as.list(paste(x, y, sep = "~")), function(z) {
    form1 <- as.formula(z)
    fit <- lm(form1, data = mtcars)
    summary(fit)$r.squared
  })
})
or expand.grid:
# One row per ordered pair of variable names
indx <- expand.grid(colnames(mtcars), colnames(mtcars), stringsAsFactors = FALSE)
res2 <- sapply(seq_len(nrow(indx)), function(i) {
  i1 <- indx[i, ]
  form1 <- as.formula(paste(i1[, 1], i1[, 2], sep = "~"))
  fit <- lm(formula = form1, data = mtcars)
  summary(fit)$r.squared
})
# Reshape the 121 values into an 11 x 11 matrix
dim(res2) <- c(11, 11)
res2[1:3,1:3]
#          [,1]      [,2]      [,3]
#[1,] 0.0000000 0.7261800 0.7183433
#[2,] 0.7261800 0.0000000 0.8136633
#[3,] 0.7183433 0.8136633 0.0000000
identical(res1,res2)
#[1] TRUE
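The matrices above carry no row or column names. If labelled output is wanted, one possible sketch is to build the same kind of matrix with nested vapply calls, which keep the variable names. The helper pair_rsq and the names vars and rsq_named are introduced here purely for illustration, and setting the diagonal to 0 directly (rather than fitting a variable against itself) is an assumption about the desired output.

```r
vars <- colnames(mtcars)

# R-squared of the single-predictor model response ~ predictor.
# The diagonal is set to 0 directly instead of fitting a variable on itself.
pair_rsq <- function(response, predictor) {
  if (response == predictor) return(0)
  form <- reformulate(predictor, response = response)
  summary(lm(form, data = mtcars))$r.squared
}

# Outer vapply runs over predictors (columns), inner over responses (rows);
# vapply keeps the character names, so the result is a labelled 11 x 11 matrix.
rsq_named <- vapply(
  vars,
  function(p) vapply(vars, pair_rsq, numeric(1), predictor = p),
  numeric(length(vars))
)

rsq_named[1:3, 1:3]
```

Individual entries can then be looked up by name, for example rsq_named["mpg", "cyl"].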
Upvotes: 2