Reputation: 1
Why does eigen() give eigenvectors with the opposite sign, and why is the princomp() loadings matrix just the eigenvectors?
setwd("D:/BlueHDD/MAQAB/RStudio/R/PCA/Intelligence")
mydata <- read.csv("Intelligence2.csv",na.strings = ".")
head(mydata)
M P C E H F
1 1.000 0.620 0.540 0.320 0.284 0.370
2 0.620 1.000 0.510 0.380 0.351 0.430
3 0.540 0.510 1.000 0.360 0.336 0.405
4 0.320 0.380 0.360 1.000 0.686 0.730
5 0.284 0.351 0.336 0.686 1.000 0.735
6 0.370 0.430 0.405 0.730 0.735 1.000
ii <- as.matrix(mydata[, 1:6])
rownames(ii) <- c("M", "P", "C", "E", "H", "F")
colnames(ii) <- c("M", "P", "C", "E", "H", "F")
head(ii)
M P C E H F
M 1.000 0.620 0.540 0.320 0.284 0.370
P 0.620 1.000 0.510 0.380 0.351 0.430
C 0.540 0.510 1.000 0.360 0.336 0.405
E 0.320 0.380 0.360 1.000 0.686 0.730
H 0.284 0.351 0.336 0.686 1.000 0.735
F 0.370 0.430 0.405 0.730 0.735 1.000
myEIG <- eigen(ii)
myEIG$values
[1] 3.3670861 1.1941791 0.5070061 0.3718472 0.3131559 0.2467257
myEIG$vectors
[,1] [,2] [,3] [,4] [,5]
[1,] -0.3677678 -0.5098401 0.266985551 0.72768020 0.047584025
[2,] -0.3913477 -0.4092063 0.485916591 -0.66464527 -0.005392018
[3,] -0.3719504 -0.3825819 -0.831626240 -0.15204371 -0.003331423
[4,] -0.4321872 0.3748248 0.021531885 0.06531777 -0.742970281
[5,] -0.4219572 0.4214599 0.002730054 0.01174474 0.665109730
[6,] -0.4565228 0.3288196 0.023032686 0.03473540 0.057617669
[,6]
[1,] -0.04178482
[2,] -0.03872816
[3,] -0.02352388
[4,] -0.34056682
[5,] -0.44922966
[6,] 0.82365511
myPCA <- princomp(covmat=ii)
head(myPCA)
$sdev
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
1.8349621 1.0927850 0.7120436 0.6097927 0.5596033 0.4967149
$loadings
Loadings:
  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
M  0.368  0.510  0.267  0.728
P  0.391  0.409  0.486 -0.665
C  0.372  0.383 -0.832 -0.152
E  0.432 -0.375               -0.743  0.341
H  0.422 -0.421                0.665  0.449
F  0.457 -0.329                      -0.824
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
SS loadings 1.000 1.000 1.000 1.000 1.000 1.000
Proportion Var 0.167 0.167 0.167 0.167 0.167 0.167
Cumulative Var 0.167 0.333 0.500 0.667 0.833 1.000
$center
[1] NA NA NA NA NA NA
$scale
M P C E H F
1 1 1 1 1 1
$n.obs
[1] NA
$scores
NULL
summary(myPCA) # print variance accounted for
Importance of components:
Comp.1 Comp.2 Comp.3 Comp.4
Standard deviation 1.834962 1.0927850 0.71204360 0.60979272
Proportion of Variance 0.561181 0.1990299 0.08450101 0.06197453
Cumulative Proportion 0.561181 0.7602109 0.84471188 0.90668641
Comp.5 Comp.6
Standard deviation 0.55960331 0.49671489
Proportion of Variance 0.05219264 0.04112095
Cumulative Proportion 0.95887905 1.00000000
loadings(myPCA) # pc loadings
Loadings:
  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
M  0.368  0.510  0.267  0.728
P  0.391  0.409  0.486 -0.665
C  0.372  0.383 -0.832 -0.152
E  0.432 -0.375               -0.743  0.341
H  0.422 -0.421                0.665  0.449
F  0.457 -0.329                      -0.824
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
SS loadings 1.000 1.000 1.000 1.000 1.000 1.000
Proportion Var 0.167 0.167 0.167 0.167 0.167 0.167
Cumulative Var 0.167 0.333 0.500 0.667 0.833 1.000
plot(myPCA,type="lines") # scree plot
Would be very grateful for kind help!
Upvotes: 0
Views: 543
Reputation: 16930
If the operation of an R function or its return value is confusing, always check the help file first.
help("princomp")
There we see (among other things), the following:
Value:
‘princomp’ returns a list with class ‘"princomp"’ containing the following components:
...
loadings: the matrix of variable loadings (i.e., a matrix whose columns
contain the eigenvectors)....
...
Note:
The signs of the columns of the loadings and scores are arbitrary,
and so may differ between different programs for PCA, and even
between different builds of R: ‘fix_sign = TRUE’ alleviates that.
An explanation of why the signs are arbitrary can be found here, or indeed is readily apparent from the definition of eigenvalues and eigenvectors (see, e.g., here).
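To make the sign point concrete, here is a minimal sketch using the correlation matrix from the question (retyped by hand, so treat the literal values as an assumption). It checks that both an eigenvector and its negative satisfy the eigenvalue equation, then flips each column so that its first entry is non-negative, which is the convention the help page describes for fix_sign = TRUE. The names aligned and flip are just illustrative.

```r
# Correlation matrix as printed in the question
vars <- c("M", "P", "C", "E", "H", "F")
ii <- matrix(c(
  1.000, 0.620, 0.540, 0.320, 0.284, 0.370,
  0.620, 1.000, 0.510, 0.380, 0.351, 0.430,
  0.540, 0.510, 1.000, 0.360, 0.336, 0.405,
  0.320, 0.380, 0.360, 1.000, 0.686, 0.730,
  0.284, 0.351, 0.336, 0.686, 1.000, 0.735,
  0.370, 0.430, 0.405, 0.730, 0.735, 1.000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

e <- eigen(ii, symmetric = TRUE)
v1 <- e$vectors[, 1]
lambda1 <- e$values[1]

# If A v = lambda v, then A (-v) = lambda (-v) as well
plus_ok  <- isTRUE(all.equal(as.vector(ii %*% v1),  lambda1 * v1))
minus_ok <- isTRUE(all.equal(as.vector(ii %*% -v1), lambda1 * -v1))
print(c(plus_ok, minus_ok))

# Flip every column whose first entry is negative
flip <- ifelse(e$vectors[1, ] < 0, -1, 1)
aligned <- sweep(e$vectors, 2, flip, `*`)
dimnames(aligned) <- list(vars, paste0("Comp.", 1:6))
print(round(aligned, 3))
```

With this convention applied, the columns should line up with the princomp() loadings shown in the question, apart from the small entries that the loadings print method leaves blank.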
As to why the loadings element just gives the eigenvectors, it's because the documentation says that's what it will return! However, there has historically been some back and forth (and some confusion) over whether you should use the eigenvectors or the eigenvectors multiplied by the square roots of the eigenvalues, and over whether both or only the latter can be called "loadings" (see this Cross Validated question and its answers for some discussion).
Ultimately, you just have to know which of the two your chosen function returns, and then either adjust the result manually, change how you describe what it means, or choose a different function that returns what you're looking for.
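If what you want is the other convention (eigenvectors multiplied by the square roots of the eigenvalues), the manual adjustment is short. The following is only a sketch, again with the question's correlation matrix retyped by hand; scaled is a hypothetical name, not something princomp() returns.

```r
# Correlation matrix as printed in the question
vars <- c("M", "P", "C", "E", "H", "F")
ii <- matrix(c(
  1.000, 0.620, 0.540, 0.320, 0.284, 0.370,
  0.620, 1.000, 0.510, 0.380, 0.351, 0.430,
  0.540, 0.510, 1.000, 0.360, 0.336, 0.405,
  0.320, 0.380, 0.360, 1.000, 0.686, 0.730,
  0.284, 0.351, 0.336, 0.686, 1.000, 0.735,
  0.370, 0.430, 0.405, 0.730, 0.735, 1.000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

e <- eigen(ii, symmetric = TRUE)

# Scale each eigenvector (column) by the square root of its eigenvalue
scaled <- e$vectors %*% diag(sqrt(e$values))
dimnames(scaled) <- list(vars, paste0("Comp.", 1:6))
print(round(scaled, 3))

# Column sums of squares recover the eigenvalues ...
print(colSums(scaled^2))
# ... and scaled %*% t(scaled) reproduces the original matrix
print(all.equal(tcrossprod(scaled), ii, check.attributes = FALSE))

# The same scaling starting from a princomp() fit: multiply each column of
# the loadings by the matching sdev (column signs may differ from eigen())
pc <- princomp(covmat = ii)
scaled_pc <- sweep(unclass(loadings(pc)), 2, pc$sdev, `*`)
```

Unlike the raw eigenvectors, whose "SS loadings" row is 1.000 in every column, the scaled version has column sums of squares equal to the eigenvalues, which is one quick way to tell the two conventions apart in any function's output.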
Upvotes: 1