Reputation: 11
I would like to use PC1 from a PCA as a predictor in a hierarchical regression analysis in R, to account for additional variation. Is this possible?
I ran my PCA with the code below:
library(factoextra)  # fviz_eig(), fviz_pca_ind()
library(ggbiplot)    # ggbiplot()
pca <- prcomp(my.data[, 57:62], center = TRUE, scale. = TRUE)
summary(pca)
str(pca)
fviz_eig(pca)
fviz_pca_ind(pca,
             col.ind = "cos2",  # Color by the quality of representation
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE       # Avoid text overlapping
)
ggbiplot(pca)
print(pca)
#some results!
Rotation (n x k) = (4 x 4):
           PC1        PC2           PC3        PC4
EC   0.5389823 -0.4785188  0.0003197419  0.6931938
temp 0.4787782  0.3590390  0.7913858440 -0.1247834
pH   0.5495125 -0.3839466 -0.2673991595 -0.6921840
DO.  0.4222624  0.7033461 -0.5497326925  0.1574569
Now I would like to use PC1 as a variable in my models, something like this:
m0 <- lm(Rel.abund.rotifers ~ turb + chl.a + PC1, data = my.data)
Any help is much appreciated!
Upvotes: 1
Views: 90
Reputation: 17069
Extract the component scores using pca$x, add them to your dataframe using cbind(), then run your model. Example using mtcars:
pca <- prcomp(mtcars[, 3:6])
mtcars2 <- cbind(mtcars, pca$x)
m0 <- lm(mpg ~ cyl + PC1, data = mtcars2)
summary(m0)
Call:
lm(formula = mpg ~ cyl + PC1, data = mtcars2)
Residuals:
    Min      1Q  Median      3Q     Max
-4.1424 -2.0289 -0.7483  1.3613  6.9373
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 27.99508    4.80433   5.827 2.56e-06 ***
cyl         -1.27749    0.77169  -1.655   0.1086
PC1         -0.02275    0.01010  -2.251   0.0321 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.008 on 29 degrees of freedom
Multiple R-squared: 0.7669, Adjusted R-squared: 0.7508
F-statistic: 47.71 on 2 and 29 DF, p-value: 6.742e-10
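Since the question mentions hierarchical regression, one way to check whether PC1 accounts for additional variation is to fit the model with and without it and compare the two. The following is only a sketch on the same mtcars example; the names m_base, m_pc1, cmp and delta_r2 are made up for illustration, and the choice of cyl as the baseline predictor is an assumption carried over from the example, not a recommendation.

```r
# Same PCA as in the mtcars example (unscaled, columns 3:6)
pca <- prcomp(mtcars[, 3:6])
mtcars2 <- cbind(mtcars, pca$x)

# Step 1: baseline model without the component (hypothetical name)
m_base <- lm(mpg ~ cyl, data = mtcars2)

# Step 2: the same model with PC1 added (hypothetical name)
m_pc1 <- lm(mpg ~ cyl + PC1, data = mtcars2)

# F-test comparing the two nested models
cmp <- anova(m_base, m_pc1)
print(cmp)

# Change in R-squared attributable to adding PC1
delta_r2 <- summary(m_pc1)$r.squared - summary(m_base)$r.squared
print(delta_r2)
```

With a single added predictor, the F-test from anova() is equivalent to the t-test on PC1 in summary(m_pc1). The comparison becomes more useful when a step adds several terms at once, for example PC1 and PC2 together. The same steps should apply to a PCA fitted with center = TRUE, scale. = TRUE as in the question.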
Upvotes: 1