Reputation: 198
I was trying Principal Component Analysis on the following data set, using both the prcomp function and caret's preProcess function.
library(caret)
library(AppliedPredictiveModeling)
set.seed(3433)
data(AlzheimerDisease)
adData = data.frame(diagnosis,predictors)
inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training = adData[ inTrain,]
testing = adData[-inTrain,]
# from prcomp
names1 <-names(training)[substr(names(training),1,2)=="IL"]
prcomp.data <- prcomp(training[,names1],center=TRUE, scale=TRUE)
prcomp.data$sdev
## from caret package
preProcess(training[, names1], method=c("center", "scale", "pca"))$std
I was wondering why the standard deviation values returned by the two approaches differ. Thanks.
Upvotes: 3
Views: 784
Reputation: 23200
The first method gives you the standard deviations of the 12 principal components (whose loadings you can see with prcomp.data$rotation).
Also, this is mentioned in the documentation for the sdev value:
the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).
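The quoted definition can be checked directly. The following is a minimal sketch that uses the built-in USArrests data set as a stand-in for training[, names1] (an assumption made only so the snippet runs without extra packages); the variable names are illustrative.

```r
# Stand-in data (assumption): the question uses training[, names1] instead.
x <- USArrests

pca <- prcomp(x, center = TRUE, scale. = TRUE)

# With scaling, the covariance matrix of the data is its correlation matrix.
eig <- eigen(cor(x))$values

# sdev should equal the square roots of those eigenvalues.
sdev_matches_eigen <- isTRUE(all.equal(pca$sdev, sqrt(eig)))

# Equivalently, sdev is the standard deviation of each column of PC scores.
score_sd <- unname(apply(pca$x, 2, sd))
sdev_matches_scores <- isTRUE(all.equal(pca$sdev, score_sd))

print(sdev_matches_eigen)
print(sdev_matches_scores)
```

Both checks should agree up to floating-point tolerance, which is why all.equal is used rather than ==.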
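The second gives you the standard deviations of the input variables themselves, which caret computes during pre-processing in order to scale them (hence the variable names associated with each standard deviation). They are not the standard deviations of the principal components.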
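To see why the two vectors cannot be expected to match, here is a small sketch in base R that computes both kinds of standard deviation side by side. It again uses the built-in USArrests data as a stand-in (an assumption), and it presumes, as described above, that caret's $std holds per-variable standard deviations.

```r
# Stand-in data (assumption): substitute training[, names1] for the real case.
x <- USArrests

# Per-variable standard deviations: one value per input column, named by
# variable. This is the kind of vector the answer attributes to caret's $std.
col_sd <- sapply(x, sd)

# Standard deviations of the principal components, as prcomp reports them.
pc_sd <- prcomp(x, center = TRUE, scale. = TRUE)$sdev

print(col_sd)
print(pc_sd)

# After scaling, each variable has variance 1, so the PC variances
# sum to the number of variables regardless of the original column scales.
total_var_matches <- isTRUE(all.equal(sum(pc_sd^2), ncol(x)))
print(total_var_matches)
```

The first vector depends on the original units of each column, while the second describes the rotated, standardized data, so the two answer different questions.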
A small side note: caret's PCA pre-processing automatically centers and scales the data unless otherwise specified.
Upvotes: 1