Reputation: 527
I'm trying to find a way to retrieve the PC scores, i.e. the input data after rotation by the PCA components found by ml_pca().
The PCA components are easily accessible using $components, but the result of matrix multiplication of the input data by the PCA components doesn't seem to be accessible.
I can do it "manually". For instance, in Scala it would be:
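import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// dataRDD is an RDD[Vector] holding one observation per row.
val mat: RowMatrix = new RowMatrix(dataRDD)
// Compute the top 4 principal components.
// Principal components are stored in a local dense matrix.
val pc: Matrix = mat.computePrincipalComponents(4)
// Project the rows to the linear space spanned by the top 4 principal components.
val projected: RowMatrix = mat.multiply(pc)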
The desired output is "projected", but I want to find a way to retrieve it straight from the fitted model object instead of computing it myself.
Upvotes: 1
Views: 319
Reputation: 21
You can try something like this.
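# Fit the PCA, project the data onto the components,
# then keep the response and the PC score columns.
pca_model <- Data %>%
  ml_pca(features = c("feature1", "feature2"), k = 2) %>%
  sdf_project() %>%
  select(response_variable, starts_with("PC"))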
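For comparison with the Scala snippet in the question, here is a minimal sketch of getting the scores from a fitted model on the Scala side, using the DataFrame-based org.apache.spark.ml.feature.PCA. The toy data, the column names "features" and "pcaFeatures", and the local session are assumptions made up for illustration. As far as I know, transform() multiplies each row by the components, so it should correspond to mat.multiply(pc) above.

```scala
import org.apache.spark.ml.feature.PCA
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Local session, for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("PcaScores")
  .getOrCreate()

// Toy data (made up): three observations with five features each.
val data = Seq(
  Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
  Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0),
  Vectors.dense(1.0, 1.0, 0.0, 2.0, 7.0)
)
val df = spark.createDataFrame(data.map(Tuple1.apply)).toDF("features")

// Fit a PCA model that keeps the top 2 principal components.
val model = new PCA()
  .setInputCol("features")
  .setOutputCol("pcaFeatures")
  .setK(2)
  .fit(df)

// model.pc holds the components (5 x 2 here);
// transform() appends the projected rows as a new vector column.
val scores = model.transform(df)
scores.select("pcaFeatures").show(truncate = false)
```

The scores end up in the "pcaFeatures" column of the returned DataFrame, so there is no need to go through RowMatrix by hand.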
Upvotes: 1