Harry

Reputation: 198

std values of the principal component object differ between prcomp and caret

I was trying Principal Component Analysis on the following data set, using both the prcomp function and caret's preProcess function.

  library(caret)
  library(AppliedPredictiveModeling)

  set.seed(3433)
  data(AlzheimerDisease)
  adData = data.frame(diagnosis,predictors)
  inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
  training = adData[ inTrain,]
  testing = adData[-inTrain,]

  # from prcomp
  names1 <-names(training)[substr(names(training),1,2)=="IL"]
  prcomp.data <- prcomp(training[,names1],center=TRUE, scale=TRUE)
  prcomp.data$sdev

  ## from caret package
  preProcess(training[, names1], method=c("center", "scale", "pca"))$std

I was wondering why the sdev values from prcomp differ from the std values from preProcess. Thanks

Upvotes: 3

Views: 784

Answers (1)

Hack-R

Reputation: 23200

The first method gives you the standard deviations of the 12 principal components (you can see the components themselves with prcomp.data$rotation).

Also, this is mentioned in the documentation for the sdev value:

the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).

The second gives you the standard deviations of the original input variables, which preProcess stores so it can scale the data (hence the variable names attached to each standard deviation).

A small side note: caret's PCA pre-processing centers and scales the data even if you request only "pca".
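
To see the two kinds of standard deviation side by side, here is a minimal base R sketch. It assumes the built-in USArrests data as a stand-in for your training columns, and the names X, pc, col_sd, eig and score_sd are just illustrative. It does not call caret; col_sd is computed by hand to mirror the per-variable values that a scaling step keeps.

```r
# Stand-in data: four numeric columns from a built-in data set (an assumption,
# used here only so the example runs without extra packages).
X <- USArrests

# PCA on centered and scaled data, as in the question.
pc <- prcomp(X, center = TRUE, scale. = TRUE)

# Per-variable standard deviations of the raw inputs: one value per column,
# named after the columns. This is the kind of value a scaling step stores.
col_sd <- apply(X, 2, sd)

# Per-component standard deviations: one value per principal component.
# They equal the square roots of the eigenvalues of the correlation matrix.
eig <- eigen(cor(X))$values

# They also equal the standard deviations of the component scores.
score_sd <- unname(apply(pc$x, 2, sd))

print(col_sd)
print(pc$sdev)
print(isTRUE(all.equal(pc$sdev, sqrt(eig))))
print(isTRUE(all.equal(pc$sdev, score_sd)))
```

The two printed vectors describe different things (input variables versus components), so there is no reason for them to match. If you want component standard deviations from caret, one option should be to take the standard deviation of each column returned by predict() on the preProcess object; note that caret keeps only as many components as its variance threshold requires, so you may get fewer than 12.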

Upvotes: 1
