Vasilis Vasileiou

Reputation: 527

Provide rotated data (Principal component scores) after PCA in Sparklyr

I'm trying to find a way to retrieve the PC scores, i.e. the input data rotated by the PCA components found by ml_pca().

The PCA components are easily accessible using $components, but the result of matrix multiplication of the input data by the PCA components doesn't seem to be accessible.

I can do it "manually". For instance, in Scala it would be:

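    import org.apache.spark.mllib.linalg.Matrix
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    // dataRDD is an RDD[Vector] holding one row per observation.
    val mat: RowMatrix = new RowMatrix(dataRDD)

    // Compute the top 4 principal components.
    // Principal components are stored in a local dense matrix.
    val pc: Matrix = mat.computePrincipalComponents(4)

    // Project the rows to the linear space spanned by the top 4 principal components.
    val projected: RowMatrix = mat.multiply(pc)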

The desired output is the equivalent of "projected", but I want to retrieve it directly from the object returned by ml_pca() rather than computing it myself.

Upvotes: 1

Views: 319

Answers (1)

Aakash_S

Reputation: 21

You can try something like this.

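    # Fit the PCA, then project the data onto the components.
    # Note that the result holds the projected data, not the model itself.
    pc_scores <- Data %>%
       ml_pca(features = c("feature1", "feature2"), k = 2) %>%
       sdf_project() %>%
       # Keep the response column and the projected columns (prefixed "PC").
       select(response_variable, starts_with("PC"))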
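
For reference, here is a minimal sketch of the projection itself, assuming the scores are the plain matrix product of the data and the components, as in the mat.multiply(pc) line from the question. It uses plain Scala arrays rather than Spark types, and the names ProjectionSketch, Mat and project are hypothetical, chosen only for illustration. No mean-centering is applied, so check whether your library centers the data before comparing numbers.

```scala
// Illustrative sketch only: ProjectionSketch, Mat and project are hypothetical names.
object ProjectionSketch {
  type Mat = Array[Array[Double]]

  // data: n x d (one row per observation); components: d x k (one column per PC).
  // Returns the n x k matrix of scores, i.e. data * components (no centering).
  def project(data: Mat, components: Mat): Mat = {
    require(data.forall(_.length == components.length),
      "each row must have as many features as the components matrix has rows")
    val k = if (components.isEmpty) 0 else components.head.length
    data.map { row =>
      Array.tabulate(k) { j =>
        // Dot product of the row with the j-th component column.
        row.indices.map(i => row(i) * components(i)(j)).sum
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val data: Mat = Array(Array(1.0, 2.0), Array(3.0, 4.0))
    // Toy "components" that simply swap the two features.
    val components: Mat = Array(Array(0.0, 1.0), Array(1.0, 0.0))
    project(data, components).foreach(r => println(r.mkString(", ")))
  }
}
```

With the toy components above, each row comes back with its two values swapped, which makes it easy to check by hand that row i of the output is row i of the data multiplied by the components matrix.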

Upvotes: 1
