user836026

Reputation: 11350

Principal component analysis and feature reductions

I have a matrix composed of 35 features, and I need to reduce the number of features because I think many of the variables are dependent. I understood that PCA could help me do that, so using MATLAB I calculated:

 [coeff,score,latent] = pca(list_of_features)

I notice "coeff" contains matrix which I understood (correct me if I'm wrong) have column with high importance on the left, and second column with less importance and so on. However, it's not clear for me which column on "coeff" relate to which column on my original "list_of_features" so that I could know which variable is more important.

Upvotes: 1

Views: 132

Answers (1)

Itamar Katz

Reputation: 9645

PCA doesn't give you an ordering of your original features (which feature is more 'important' than others); rather, it gives you directions in feature space, ordered by variance, from high variance (the 1st direction, or principal component) to low variance. Each direction is generally a linear combination of your original features, so you can't expect to get information about a single feature.
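For example, to see how each principal direction relates to your original features, you can inspect the columns of coeff directly. A minimal sketch, assuming list_of_features is your n-by-35 data matrix (the choice of 5 features to display is arbitrary):

[coeff, score, latent] = pca(list_of_features);

% coeff is 35-by-35: row i corresponds to original feature i,
% column j holds the weights (loadings) of the j-th principal component.
% latent(j) is the variance of the data along the j-th component.
disp(coeff(:,1));                            % weights of the 1st component, one per original feature
[~, idx] = sort(abs(coeff(:,1)), 'descend');
disp(idx(1:5));                              % original features contributing most to the 1st component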

What you can do is throw away one or more directions, or in other words project your data onto the subspace spanned by a subset of the principal components. Usually you want to throw away the directions with low variance, but that is really a choice that depends on your application.
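A common (but by no means mandatory) way to decide how many directions to keep is to look at the cumulative fraction of variance. A short sketch, reusing latent from the call above and an arbitrary 95% threshold:

explained = latent / sum(latent);           % fraction of total variance per component
k = find(cumsum(explained) >= 0.95, 1);     % smallest k that reaches the threshold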

Let's say you want to keep only the first k principal components:

x = score(:,1:k) * coeff(:,1:k)';

Note however that pca centers the data, so you actually get the projection of the centered version of your data.
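If you want the approximation back on the original (uncentered) scale, pca can also return the estimated mean as its mu output, which you can add back after projecting. A short sketch, again assuming list_of_features is your data matrix:

[coeff, score, ~, ~, ~, mu] = pca(list_of_features);
x_approx = score(:,1:k) * coeff(:,1:k)' + mu;   % implicit expansion (R2016b+); use bsxfun(@plus, ...) on older versions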

Upvotes: 1
