Reputation: 11

How to invert pca.components_ back to original values

I am running PCA on temperature data (below), with samples in rows and features (1000 hPa, 925 hPa, etc.) in columns.

array([[ 25. ,  22.2,  19. , ..., -51.9, -50.3, -41.1],
       [ 26.8,  22.8,  18.4, ..., -53.1, -49.5, -41.1],
       [ 26.4,  23.4,  19.4, ..., -56.7, -49.7, -41.3],
       ...,
       [  9.4,   6.8,   3.2, ..., -57.7, -55.9, -57.9],
       [ 12.4,   7.4,   3.8, ..., -53.5, -53.9, -56.1],
       [  9.6,   5.8,   4.2, ..., -54.9, -53.1, -50.9]])

I ran PCA.

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
proj = pca.fit_transform(data)
inversed_data = pca.inverse_transform(proj)

(Here, inversed_data holds the estimated values reconstructed from PC1 + PC2, right?)

I separated the estimated values into PC1 and PC2 using pca.components_.

pca.components_

array([[-0.33776309, -0.34230437, -0.33367396, -0.32389647, -0.36274215,
        -0.37980682, -0.33324365, -0.21884887, -0.02131457,  0.16129112,
         0.24344067,  0.15305721,  0.08841673,  0.0262782 ,  0.00574684,
         0.00390428],
       [-0.18303616, -0.29623333, -0.32912031, -0.17544341, -0.08903607,
         0.04295601,  0.37370419,  0.55664452,  0.40733697,  0.0431838 ,
        -0.21696205, -0.20124614, -0.14519851, -0.05066843, -0.01942078,
         0.031218  ]])

But now I am stuck. I want to compare pca.components_ with the original data. To do this, I think I have to invert pca.components_, but I can't. Do you have any idea?
I tried:

pca.inverse_transform(pca.components_[0])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-a389a4196f5f> in <module>()
----> 1 pca.inverse_transform(pca.components_[0])

/usr/local/lib/python3.7/dist-packages/sklearn/decomposition/_base.py in inverse_transform(self, X)
    157                             self.components_) + self.mean_
    158         else:
--> 159             return np.dot(X, self.components_) + self.mean_

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (16,) and (2,16) not aligned: 16 (dim 0) != 2 (dim 0)

But it didn't work. Alternatively, can I use sklearn.preprocessing.StandardScaler.inverse_transform() to see the inverted pca.components_? It did run, but I don't know whether the result is right or wrong.

Thank you

Upvotes: 1

Answers (2)

Ikaro Silva

Reputation: 1

Set the other principal components to 0. This way you still have the projection matrix that translates from the PCA space to your feature space:

from sklearn.decomposition import PCA

pca = PCA(n_components=data.shape[1])
proj = pca.fit_transform(data)

# transform/inverse_transform do not read pca.singular_values_,
# so zero the scores of every component after the first 2 instead
proj[:, 2:] = 0
filtered_data = pca.inverse_transform(proj)
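
To look at each component separately in the original units, here is a minimal sketch. It assumes random synthetic data in place of the temperature profiles, and the names pc1_part, pc2_part and pc1_profile are made up for illustration. Each row of pca.components_ is a unit-length direction in feature space, not a sample, which is why it cannot be passed to inverse_transform directly. Multiplying a component by its scores gives that component's contribution as a deviation from pca.mean_:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the temperature data: 200 samples, 16 levels (assumption).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16)) @ rng.normal(size=(16, 16))

pca = PCA(n_components=2)
proj = pca.fit_transform(data)  # scores, shape (200, 2)

# Contribution of each component, as deviations from the mean profile.
pc1_part = np.outer(proj[:, 0], pca.components_[0])  # shape (200, 16)
pc2_part = np.outer(proj[:, 1], pca.components_[1])  # shape (200, 16)

# Mean + PC1 part + PC2 part matches inverse_transform (default whiten=False).
recon = pca.mean_ + pc1_part + pc2_part
same = np.allclose(recon, pca.inverse_transform(proj))

# inverse_transform expects shape (n_samples, n_components),
# e.g. a single "sample" sitting one unit along PC1.
pc1_profile = pca.inverse_transform(np.array([[1.0, 0.0]]))
```

Here pca.mean_ + pc1_part is the PC1-only estimate in the original units, and pc1_profile is simply pca.mean_ + pca.components_[0], which can be plotted against a real profile.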

Upvotes: 0

CutePoison

Reputation: 5365

When you do PCA with n_components < n_features, you lose information, so you cannot recover exactly the same data when you transform back (see this SO answer).

Think of it as a 1024x1024 picture that you scale down to 784x784 and then want to scale back up to 1024x1024: that cannot be done 1:1. You can still see the image, but it might be a bit blurry.
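
The loss can be measured directly. The sketch below assumes random synthetic data in place of the real profiles, and the helper reconstruction_rmse is a made-up name. It compares the reconstruction error with 2 components against the error with all 16:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the real data: 200 samples, 16 features (assumption).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16))

def reconstruction_rmse(data, n_components):
    """Root-mean-square error between data and its PCA reconstruction."""
    pca = PCA(n_components=n_components)
    recon = pca.inverse_transform(pca.fit_transform(data))
    return float(np.sqrt(np.mean((data - recon) ** 2)))

rmse_2 = reconstruction_rmse(data, 2)      # lossy: 14 components discarded
rmse_full = reconstruction_rmse(data, 16)  # all components kept: error is round-off only
```

With every component kept, the round trip is exact up to floating-point error. With fewer, the error is the part of the data lying along the discarded components.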

Upvotes: 2
