Reputation: 181
I am looking for a way to calculate the multiple correlation coefficient (http://en.wikipedia.org/wiki/Multiple_correlation) in R. Is there a built-in function for it? I have one dependent variable and three independent ones. I have not been able to find anything online. Any ideas?
Upvotes: 5
Views: 24690
Reputation: 1633
By 'the multiple correlation coefficient' you seem to mean the correlation between two or more independent variables on the one hand, and one dependent variable on the other. The easiest way to calculate it is to fit a multiple linear regression (predicting the values of one variable treated as dependent from the values of two or more variables treated as independent) and then calculate the coefficient of correlation between the predicted and observed values of the dependent variable.
Here, for example, we create a linear model called mpg.model, with mpg as the dependent variable and wt and cyl as the independent variables, using the built-in mtcars dataset:
> mpg.model <- lm(mpg ~ wt + cyl, data = mtcars)
Having created the above model, we correlate the observed values of mpg (which are embedded in the object, within the model data frame) with the predicted values for the same variable (also embedded):
> cor(mpg.model$model$mpg, mpg.model$fitted.values)
[1] 0.9111681
R will in fact do this calculation for you, but without telling you so, when you ask it to create the summary of a model (as in Brian's answer): the summary of an lm object contains R-squared, which is the square of the coefficient of correlation between the predicted and observed values of the dependent variable. So an alternative way to get the same result is to extract R-squared from the summary.lm object and take the square root of it, thus:
> sqrt(summary(mpg.model)$r.squared)
[1] 0.9111681
I feel that I should point out, however, that the term 'multiple correlation coefficient' is ambiguous.
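For the case in the question (one dependent variable and three independent ones), the same idea can be wrapped in a small helper. This is only a sketch: multiple_r is a hypothetical name, not a built-in function, and the choice of wt, cyl and hp from mtcars as the three predictors is an assumption made purely for illustration.

```r
# Hypothetical helper (not part of base R): multiple correlation coefficient
# of a linear model with an intercept, computed as the correlation between
# the observed and fitted values of the dependent variable.
multiple_r <- function(formula, data) {
  fit <- lm(formula, data = data)
  cor(model.response(model.frame(fit)), fitted(fit))
}

# One dependent variable (mpg) and three independent ones
# (wt, cyl, hp are assumed here for illustration only).
r3 <- multiple_r(mpg ~ wt + cyl + hp, data = mtcars)
r3

# Cross-check against the square root of R-squared from the model summary.
all.equal(r3, sqrt(summary(lm(mpg ~ wt + cyl + hp, data = mtcars))$r.squared))
```

The cross-check relies on the model containing an intercept, which lm includes by default.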
Upvotes: 7
Reputation: 3805
Try this:
# load sample data
data(mtcars)
# calculate the correlation coefficients between all pairs of variables in
# `mtcars` using the built-in function
M <- cor(mtcars)
# M is a matrix of correlation coefficients, which you can display just by
# running
print(M)
# If you want to plot the correlation coefficients
library(corrplot)
corrplot(M, method = "number", type = "lower", insig = "blank", number.cex = 0.6)
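Note that cor() on its own returns pairwise coefficients, not a single multiple correlation coefficient. A pairwise matrix such as M can still be turned into one using the matrix formula from the Wikipedia page linked in the question, R^2 = c' Rxx^-1 c, where c holds the correlations between each predictor and the dependent variable and Rxx holds the correlations among the predictors. The sketch below assumes mpg as the dependent variable and wt and cyl as the predictors; these choices and the variable names are illustrative only.

```r
# Pairwise correlation matrix of the built-in mtcars data
M <- cor(mtcars)

y  <- "mpg"            # dependent variable (assumed for illustration)
xs <- c("wt", "cyl")   # independent variables (assumed for illustration)

c_xy <- M[xs, y]       # correlations between each predictor and y
R_xx <- M[xs, xs]      # correlations among the predictors

# R^2 = c' Rxx^-1 c; solve(R_xx, c_xy) avoids forming the inverse explicitly
R2 <- sum(c_xy * solve(R_xx, c_xy))
R  <- sqrt(R2)
R

# Cross-check against the regression-based value
all.equal(R, sqrt(summary(lm(mpg ~ wt + cyl, data = mtcars))$r.squared))
```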
Upvotes: -3
Reputation: 6213
The built-in function lm gives at least one version, not sure if this is what you are looking for:
fit <- lm(yield ~ N + P + K, data = npk)
summary(fit)
Gives:
Call:
lm(formula = yield ~ N + P + K, data = npk)

Residuals:
    Min      1Q  Median      3Q     Max
-9.2667 -3.6542  0.7083  3.4792  9.3333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   54.650      2.205  24.784   <2e-16 ***
N1             5.617      2.205   2.547   0.0192 *
P1            -1.183      2.205  -0.537   0.5974
K1            -3.983      2.205  -1.806   0.0859 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.401 on 20 degrees of freedom
Multiple R-squared:  0.3342,	Adjusted R-squared:  0.2343
F-statistic: 3.346 on 3 and 20 DF,  p-value: 0.0397
More info on what's going on at ?summary.lm and ?lm.
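If a single number is wanted rather than the printed summary, the "Multiple R-squared" value can be pulled out of the summary object and square-rooted. A minimal sketch, assuming the same npk model as above (the names r2 and r are illustrative):

```r
# Same model as above, fitted on the built-in npk data
fit <- lm(yield ~ N + P + K, data = npk)

# summary.lm objects store R-squared in the r.squared component
r2 <- summary(fit)$r.squared   # the "Multiple R-squared" line of the printout
r  <- sqrt(r2)                 # multiple correlation coefficient
r

# Equivalent: correlation between observed and fitted yields
all.equal(r, cor(npk$yield, fitted(fit)))
```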
Upvotes: 6