Wilks

Reputation: 1

Why does eigen() give eigenvectors with the opposite sign to princomp(), and why is the loadings matrix just the eigenvectors?

setwd("D:/BlueHDD/MAQAB/RStudio/R/PCA/Intelligence")

mydata <- read.csv("Intelligence2.csv",na.strings = ".")

head(mydata)

      M     P     C     E     H     F
1 1.000 0.620 0.540 0.320 0.284 0.370
2 0.620 1.000 0.510 0.380 0.351 0.430
3 0.540 0.510 1.000 0.360 0.336 0.405
4 0.320 0.380 0.360 1.000 0.686 0.730
5 0.284 0.351 0.336 0.686 1.000 0.735
6 0.370 0.430 0.405 0.730 0.735 1.000

ii <- as.matrix(mydata[,1:6])

rownames(ii) <- c("M", "P", "C", "E", "H", "F")

colnames(ii) <- c("M", "P", "C", "E", "H", "F")

head(ii)

      M     P     C     E     H     F
M 1.000 0.620 0.540 0.320 0.284 0.370
P 0.620 1.000 0.510 0.380 0.351 0.430
C 0.540 0.510 1.000 0.360 0.336 0.405
E 0.320 0.380 0.360 1.000 0.686 0.730
H 0.284 0.351 0.336 0.686 1.000 0.735
F 0.370 0.430 0.405 0.730 0.735 1.000

myEIG <- eigen(ii)

myEIG$values

[1] 3.3670861 1.1941791 0.5070061 0.3718472 0.3131559 0.2467257

myEIG$vectors

           [,1]       [,2]         [,3]        [,4]         [,5]
[1,] -0.3677678 -0.5098401  0.266985551  0.72768020  0.047584025
[2,] -0.3913477 -0.4092063  0.485916591 -0.66464527 -0.005392018
[3,] -0.3719504 -0.3825819 -0.831626240 -0.15204371 -0.003331423
[4,] -0.4321872  0.3748248  0.021531885  0.06531777 -0.742970281
[5,] -0.4219572  0.4214599  0.002730054  0.01174474  0.665109730
[6,] -0.4565228  0.3288196  0.023032686  0.03473540  0.057617669
            [,6]
[1,] -0.04178482
[2,] -0.03872816
[3,] -0.02352388
[4,] -0.34056682
[5,] -0.44922966
[6,]  0.82365511

myPCA <- princomp(covmat=ii)

head(myPCA)

$sdev

   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6 
1.8349621 1.0927850 0.7120436 0.6097927 0.5596033 0.4967149 

$loadings

Loadings:

  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
M  0.368  0.510  0.267  0.728              
P  0.391  0.409  0.486 -0.665              
C  0.372  0.383 -0.832 -0.152              
E  0.432 -0.375               -0.743  0.341
H  0.422 -0.421                0.665  0.449
F  0.457 -0.329                      -0.824


               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.167  0.167  0.167  0.167  0.167  0.167
Cumulative Var  0.167  0.333  0.500  0.667  0.833  1.000

$center

[1] NA NA NA NA NA NA

$scale

M P C E H F 
1 1 1 1 1 1 

$n.obs

[1] NA

$scores

NULL

summary(myPCA) # print variance accounted for

Importance of components:

                         Comp.1    Comp.2     Comp.3     Comp.4
Standard deviation     1.834962 1.0927850 0.71204360 0.60979272
Proportion of Variance 0.561181 0.1990299 0.08450101 0.06197453
Cumulative Proportion  0.561181 0.7602109 0.84471188 0.90668641

                           Comp.5     Comp.6
Standard deviation     0.55960331 0.49671489
Proportion of Variance 0.05219264 0.04112095
Cumulative Proportion  0.95887905 1.00000000

loadings(myPCA) # pc loadings

Loadings:

  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
M  0.368  0.510  0.267  0.728              
P  0.391  0.409  0.486 -0.665              
C  0.372  0.383 -0.832 -0.152              
E  0.432 -0.375               -0.743  0.341
H  0.422 -0.421                0.665  0.449
F  0.457 -0.329                      -0.824


               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.167  0.167  0.167  0.167  0.167  0.167
Cumulative Var  0.167  0.333  0.500  0.667  0.833  1.000

plot(myPCA,type="lines") # scree plot

Would be very grateful for kind help!

Upvotes: 0

Views: 543

Answers (1)

duckmayr

Reputation: 16930

If the operation of an R function or its return value is confusing, always check the help file first.

help("princomp")

There we see (among other things) the following:

Value:
    ‘princomp’ returns a list with class ‘"princomp"’ containing the following components:
     ...
     loadings: the matrix of variable loadings (i.e., a matrix whose columns
     contain the eigenvectors)....
...
Note:
     The signs of the columns of the loadings and scores are arbitrary,
     and so may differ between different programs for PCA, and even
     between different builds of R: ‘fix_sign = TRUE’ alleviates that.

An explanation of why the signs are arbitrary can be found here, or indeed is readily apparent from the definition of eigenvalues and eigenvectors (see, e.g., here): if v satisfies Av = λv, then so does -v, so both are equally valid unit-length eigenvectors.
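As a quick numerical check of the sign point, here is a minimal sketch. It is not part of the original code: the matrix is retyped from the output in the question, and the names `signs`, `aligned`, and `same` are only illustrative. It shows that both a vector and its negative satisfy the eigenvector equation. It then flips each `eigen()` column so that its first element is non-negative, which is the convention `help("princomp")` describes for the `fix_sign` argument. Assuming an R version recent enough to have `fix_sign`, that should make the `eigen()` vectors line up with the `princomp()` loadings:

```r
# Correlation matrix retyped from the question's output
vars <- c("M", "P", "C", "E", "H", "F")
ii <- matrix(c(
  1.000, 0.620, 0.540, 0.320, 0.284, 0.370,
  0.620, 1.000, 0.510, 0.380, 0.351, 0.430,
  0.540, 0.510, 1.000, 0.360, 0.336, 0.405,
  0.320, 0.380, 0.360, 1.000, 0.686, 0.730,
  0.284, 0.351, 0.336, 0.686, 1.000, 0.735,
  0.370, 0.430, 0.405, 0.730, 0.735, 1.000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

e <- eigen(ii)
v1 <- e$vectors[, 1]      # first eigenvector as returned by eigen()
lambda1 <- e$values[1]    # its eigenvalue

# Both v1 and -v1 satisfy ii %*% v = lambda * v
ok_pos <- isTRUE(all.equal(as.vector(ii %*% v1), lambda1 * v1))
ok_neg <- isTRUE(all.equal(as.vector(ii %*% (-v1)), lambda1 * (-v1)))

# Flip each column so that its first element is non-negative
signs <- ifelse(e$vectors[1, ] < 0, -1, 1)
aligned <- sweep(e$vectors, 2, signs, `*`)

# Compare with the loadings reported by princomp()
pc <- princomp(covmat = ii)
same <- isTRUE(all.equal(unclass(loadings(pc)), aligned,
                         check.attributes = FALSE))
print(c(ok_pos = ok_pos, ok_neg = ok_neg, same = same))
```

Either sign is fine for interpretation, as long as you use the same one consistently for the loadings and any scores derived from them.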

As to why the loadings element just gives the eigenvectors: that's what the documentation says it will return! Whether you should use the eigenvectors themselves or the eigenvectors multiplied by the square roots of the eigenvalues, and whether both or only the latter may be called "loadings", is something there has historically been some back and forth and confusion about (see this Cross Validated question and its answers for some discussion).
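If you want the second convention (eigenvectors scaled by the square roots of the eigenvalues), you can compute it from what `princomp()` returns. The following is a minimal sketch under the same assumptions as the question, with the matrix retyped from its output. The names `L` and `scaled` are only illustrative and are not something `princomp()` provides:

```r
# Correlation matrix retyped from the question's output
vars <- c("M", "P", "C", "E", "H", "F")
ii <- matrix(c(
  1.000, 0.620, 0.540, 0.320, 0.284, 0.370,
  0.620, 1.000, 0.510, 0.380, 0.351, 0.430,
  0.540, 0.510, 1.000, 0.360, 0.336, 0.405,
  0.320, 0.380, 0.360, 1.000, 0.686, 0.730,
  0.284, 0.351, 0.336, 0.686, 1.000, 0.735,
  0.370, 0.430, 0.405, 0.730, 0.735, 1.000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

pc <- princomp(covmat = ii)

# Unit-length eigenvectors, as stored in the loadings element
L <- unclass(loadings(pc))

# Multiply each column by its component's standard deviation,
# i.e. the square root of the corresponding eigenvalue
scaled <- sweep(L, 2, pc$sdev, `*`)

# Column sums of squares: 1 for the eigenvectors,
# the eigenvalues for the scaled version
print(colSums(L^2))
print(colSums(scaled^2))

# The scaled loadings reproduce the input matrix
recon <- scaled %*% t(scaled)
print(all.equal(recon, ii, check.attributes = FALSE))
```

This also seems to account for the "SS loadings 1.000" and "Proportion Var 0.167" rows in the printed loadings: they are computed from the unit-length eigenvectors, so every column sums to 1, or 1/6 of the total. The variance each component explains is the part reported by `summary(myPCA)`.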

Ultimately, you just have to know which your chosen function returns, and either manually adjust it, or change how you discuss what it means, or choose a different function that returns what you're looking for.

Upvotes: 1
