Serge Rogatch

Reputation: 15040

How to evaluate the quality of PCA returned by torch.pca_lowrank()?

I use the following piece of code:

U, S, V = torch.pca_lowrank(A, q=self.n_components)
self.V = V
self.projection = torch.matmul(A, V)

How can I compute the cumulative percent variance, or any other accuracy metric (a single value between 0 and 100%), from the values returned above? It's fine to project the matrix back with

approx = torch.matmul(self.projection, self.V.T)

if that helps with computing the metric.

I don't mind using other packages compatible with PyTorch.

Upvotes: 1

Views: 3444

Answers (2)

Moshe

Reputation: 1

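You can use the singular values returned in S. Since torch.pca_lowrank() centers A by default (center=True), the variance explained by each component is

explained_variances = S**2/(m-1)

where m is the number of data samples, i.e. A.shape[0].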

The complete code is:

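num_samples = A.shape[0]

# Variance captured by each of the q retained components
explained_variance_ = ((S ** 2) / (num_samples - 1)).squeeze()
# Total variance of A over all features, not only the retained components;
# summing explained_variance_ instead would make the ratios always add up to 1
total_var = torch.var(A, dim=0).sum()
explained_variance_ratio_ = explained_variance_ / total_var

# Cumulative fraction of variance explained by the first k components
explained_variance_ratio_cumsum = torch.cumsum(explained_variance_ratio_, dim=0)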
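One thing to watch: if total_var only sums over the q retained components, the cumulative ratio always ends at 100% and says nothing about what was discarded. Below is a minimal self-contained sketch that measures against the total variance of A instead. The random data, its shape, and q = 3 are assumptions made up for illustration, and because pca_lowrank is a randomized approximation the numbers are estimates rather than exact values.

```python
import torch

torch.manual_seed(0)

# Hypothetical data: 200 samples, 10 features with increasing scale.
scales = torch.arange(1, 11, dtype=torch.float64)
A = torch.randn(200, 10, dtype=torch.float64) * scales
q = 3  # number of components to keep (n_components in the question)

# pca_lowrank centers A by default (center=True).
U, S, V = torch.pca_lowrank(A, q=q)

num_samples = A.shape[0]
explained_variance = S ** 2 / (num_samples - 1)

# Total variance over all features of A (torch.var is unbiased by default,
# which matches the num_samples - 1 denominator above).
total_var = torch.var(A, dim=0).sum()

explained_variance_ratio = explained_variance / total_var
cumulative_percent_variance = 100 * torch.cumsum(explained_variance_ratio, dim=0)
print(cumulative_percent_variance)
```

The last entry of cumulative_percent_variance is the single 0 to 100% figure asked for in the question.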

Upvotes: 0

Tasos

Reputation: 7577

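You can calculate the cumulative percent variance as the ratio between the total variance of the reconstructed matrix (approx in the question) and the total variance of the original matrix. Total variance here means the sum of the per-column variances, so pass dim=0; torch.var(A) without dim pools all elements into a single variance.

total_var = torch.var(A, dim=0).sum()

total_var_approx = torch.var(approx, dim=0).sum()

cumulative_percent_variance = (total_var_approx / total_var) * 100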
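For a runnable version of this reconstruction-based check, here is a rough sketch. The helper name percent_variance_retained and the random test data are hypothetical, and "total variance" is taken as the sum of per-feature variances. It also prints the singular-value estimate for comparison; the two should roughly agree but need not match exactly, since pca_lowrank is a randomized approximation.

```python
import torch


def percent_variance_retained(A: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Percent of A's total variance kept after projecting onto the columns of V."""
    projection = torch.matmul(A, V)         # as in the question
    approx = torch.matmul(projection, V.T)  # back-projection into feature space
    # Variance is shift-invariant, so A does not need to be centered here.
    total_var = torch.var(A, dim=0).sum()
    total_var_approx = torch.var(approx, dim=0).sum()
    return 100 * total_var_approx / total_var


torch.manual_seed(0)

# Hypothetical data: 200 samples, 10 features with increasing scale.
scales = torch.arange(1, 11, dtype=torch.float64)
A = torch.randn(200, 10, dtype=torch.float64) * scales

U, S, V = torch.pca_lowrank(A, q=3)

from_reconstruction = percent_variance_retained(A, V)

# Same quantity estimated from the singular values instead.
total_var = torch.var(A, dim=0).sum()
from_singular_values = 100 * (S ** 2).sum() / ((A.shape[0] - 1) * total_var)

print(from_reconstruction.item(), from_singular_values.item())
```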

Upvotes: 1
