Reputation: 11
I am doing PCA on temperature data (below), with samples in rows and features (1000 hPa, 925 hPa, etc.) in columns:
array([[ 25. , 22.2, 19. , ..., -51.9, -50.3, -41.1],
[ 26.8, 22.8, 18.4, ..., -53.1, -49.5, -41.1],
[ 26.4, 23.4, 19.4, ..., -56.7, -49.7, -41.3],
...,
[ 9.4, 6.8, 3.2, ..., -57.7, -55.9, -57.9],
[ 12.4, 7.4, 3.8, ..., -53.5, -53.9, -56.1],
[ 9.6, 5.8, 4.2, ..., -54.9, -53.1, -50.9]])
I ran PCA.
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
proj = pca.fit_transform(data)
inversed_data = pca.inverse_transform(proj)
(Here, inversed_data holds the estimated values reconstructed from PC1 + PC2, right?)
I separated the estimated values into PC1 and PC2 using pca.components_.
pca.components_
array([[-0.33776309, -0.34230437, -0.33367396, -0.32389647, -0.36274215,
-0.37980682, -0.33324365, -0.21884887, -0.02131457, 0.16129112,
0.24344067, 0.15305721, 0.08841673, 0.0262782 , 0.00574684,
0.00390428],
[-0.18303616, -0.29623333, -0.32912031, -0.17544341, -0.08903607,
0.04295601, 0.37370419, 0.55664452, 0.40733697, 0.0431838 ,
-0.21696205, -0.20124614, -0.14519851, -0.05066843, -0.01942078,
0.031218 ]])
But now I am stuck. I want to compare pca.components_ with the original data. To do this, I have to inverse-transform pca.components_, but I can't. Do you have any idea?
I tried:
pca.inverse_transform(pca.components_[0])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-34-a389a4196f5f> in <module>()
----> 1 pca.inverse_transform(pca.components_[0])
/usr/local/lib/python3.7/dist-packages/sklearn/decomposition/_base.py in inverse_transform(self, X)
157 self.components_) + self.mean_
158 else:
--> 159 return np.dot(X, self.components_) + self.mean_
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (16,) and (2,16) not aligned: 16 (dim 0) != 2 (dim 0)
But it didn't work. Alternatively, can I use sklearn.preprocessing.StandardScaler.inverse_transform() to see the inverse-transformed pca.components_? It did run, but I don't know whether the result is right or wrong.
Thank you
Upvotes: 1
Views: 2063
Reputation: 1
Set the scores of the other principal components to 0. That way you still have the projection matrix that translates from the PCA space back to your feature space:
from sklearn.decomposition import PCA
pca = PCA(n_components=data.shape[1])
pca.fit(data)
proj = pca.transform(data)
proj[:, 2:] = 0 # zero the scores of all but the first 2 components
print(proj[0]) # only the first 2 entries are non-zero
filtered_data = pca.inverse_transform(proj)
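To compare each component with the original data directly, one option is to rebuild its contribution in feature space by hand. Note that inverse_transform expects scores of shape (n_samples, n_components), which is why passing pca.components_[0] (shape (16,)) raises the shape error in the question. The following is only a sketch: the `data` array is a synthetic stand-in for the temperature matrix, and the names `pc1_part`, `pc2_part` and `recon` are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the temperature matrix (samples x 16 levels); an assumption.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16)) @ rng.normal(size=(16, 16))

pca = PCA(n_components=2)
proj = pca.fit_transform(data)  # scores, shape (n_samples, 2)

# Contribution of each component in the original units, as an anomaly from the mean:
# (score of sample i on PC k) * (loading vector of PC k).
pc1_part = np.outer(proj[:, 0], pca.components_[0])
pc2_part = np.outer(proj[:, 1], pca.components_[1])

# Adding the per-feature mean back gives the same result as inverse_transform.
recon = pca.mean_ + pc1_part + pc2_part
same = np.allclose(recon, pca.inverse_transform(proj))
```

Each of `pc1_part` and `pc2_part` has the same shape as `data`, so it can be plotted or differenced against `data - pca.mean_` level by level.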
Upvotes: 0
Reputation: 5365
When you do PCA and set n_components < n_features, you lose information, so you cannot get exactly the same data back when you inverse-transform (see this SO answer).
You can think of it as having a 1024x1024 picture, scaling it down to 784x784, and then wanting to scale it back up to 1024x1024: that cannot be done 1:1. You can still see the image, but it might be a bit blurry.
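The information loss can be measured as a reconstruction error. The sketch below uses synthetic random data with 16 features (an assumption, not the temperature data from the question) and a hypothetical helper `reconstruction_rmse`; it compares keeping 2 components with keeping all of them.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with 16 features, standing in for the real matrix.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16))

def reconstruction_rmse(X, k):
    """Fit PCA with k components, project, map back, and return the RMSE."""
    pca = PCA(n_components=k)
    X_hat = pca.inverse_transform(pca.fit_transform(X))
    return float(np.sqrt(np.mean((X - X_hat) ** 2)))

rmse_2 = reconstruction_rmse(data, 2)                  # lossy: 2 of 16 components
rmse_full = reconstruction_rmse(data, data.shape[1])   # all components kept
```

With all components kept, the round trip is exact up to floating-point error; with fewer, the error is whatever variance the dropped components carried.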
Upvotes: 2