Reputation: 2493
I am calculating the PCA for the iris dataset as follows:
data(iris)
ir.pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
This is the first row of the iris dataset:
head(iris, 1)
#Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa
For the first row, I can see that the value of the first principal component is -2.257141:
head(ir.pca$x, 1)
# PC1 PC2 PC3 PC4
#[1,] -2.257141 -0.4784238 0.1272796 0.02408751
But when I try to extract the loadings:
ir.pca$rotation[, 1]
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 0.5210659 -0.2693474 0.5804131 0.5648565
and calculate the first principal component myself:
0.5210659 * 5.1 + -0.2693474 * 3.5 + 0.5804131 * 1.4 + 0.5648565 * 0.2
I get a different result of 2.64027.
Why is that?
Upvotes: 4
Views: 111
Reputation: 8846
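Scaling is the issue. prcomp() computed the scores from centered and scaled data, so the loadings have to be applied to data transformed the same way, not to the raw values.
Either drop centering and scaling in the prcomp() call: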
data(iris)
ir.pca <- prcomp(iris[, 1:4], center = FALSE, scale. = FALSE)
head(ir.pca$x, 1)
# PC1 PC2 PC3 PC4
# [1,] -5.912747 2.302033 0.007401536 0.003087706
ir.pca$rotation[, 1] %*% t(iris[1, 1:4])
# 1
# [1,] -5.912747
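The call above turns off both centering and scaling. If only scaling is dropped and centering is kept, the column means still have to be subtracted before the loadings are applied. A minimal sketch of that case, assuming the illustrative names `pca_centered`, `row1_centered` and `pc1_manual` (none of them appear in the original post):

```r
# Sketch: PCA with centering only (scale. = FALSE).
# Object names here are illustrative, not from the original post.
data(iris)
pca_centered <- prcomp(iris[, 1:4], center = TRUE, scale. = FALSE)

# Subtract the column means that prcomp stored from the first row.
row1_centered <- unlist(iris[1, 1:4]) - pca_centered$center

# Apply the PC1 loadings to the centered row.
pc1_manual <- sum(pca_centered$rotation[, 1] * row1_centered)

# Compare with the score prcomp reports for the same row.
all.equal(pc1_manual, unname(pca_centered$x[1, 1]))
```

The comparison uses `all.equal()` rather than `==` so that floating-point rounding does not cause a spurious mismatch.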
Or center and scale iris with scale() before you manually apply the loadings:
ir.pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
head(ir.pca$x, 1)
# PC1 PC2 PC3 PC4
# [1,] -2.257141 -0.4784238 0.1272796 0.02408751
ir.pca$rotation[, 1] %*% scale(iris[, 1:4])[1,]
# [,1]
# [1,] -2.257141
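The same check can be run for every row and every component at once. The sketch below reuses the means and standard deviations that prcomp() stores in `ir.pca$center` and `ir.pca$scale`, and then lets `predict()` do the whole transformation. The names `standardized` and `scores_manual` are illustrative assumptions, not part of the original answer:

```r
# Sketch: reproduce all scores from the stored centering and scaling values.
data(iris)
ir.pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Standardize with the column means and standard deviations kept by prcomp.
standardized <- scale(iris[, 1:4], center = ir.pca$center, scale = ir.pca$scale)

# Multiply the standardized data by the full loadings matrix.
scores_manual <- standardized %*% ir.pca$rotation

# Should agree with the scores returned by prcomp (attributes ignored).
all.equal(scores_manual, ir.pca$x, check.attributes = FALSE)

# predict() applies the same centering, scaling and rotation to new rows.
predict(ir.pca, newdata = iris[1, 1:4])
```

Using the stored `center` and `scale` values matters mainly for new observations: they need to be standardized with the training data's statistics, which is what `predict()` does internally.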
Upvotes: 4