Reputation: 1853
I am trying to do PCA for dimension reduction in WEKA (Classification Problem).
I have 200 attributes in my data and close to 2100 rows.
Here are the steps I follow:
Import the CSV file in the WEKA Explorer.
In the Preprocess tab, apply Normalize (to bring all the data into the range [0,1]).
Then apply PCA.
My question is:
Which value should I select for the centerData option of WEKA's PCA in either case?
Upvotes: 1
Views: 5232
Reputation: 2811
This question has been answered in part here: PCA first or normalization first?
To answer your questions directly:
Whether to normalize is your choice. If you set centerData=TRUE and do not normalize or standardize your data, attributes with large values will have a greater influence on the PCA. If you set centerData=FALSE, Weka standardizes the data for you.
And just to confirm your suspicions, in Weka, centerData does the following:
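centerData=TRUE: Weka centers the data (subtracts each attribute's mean) instead of standardizing it, so PCA is computed from the covariance matrix.
centerData=FALSE: Weka standardizes the data (zero mean, unit variance), so PCA is computed from the correlation matrix.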
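To illustrate why the two settings can give different results, here is a minimal sketch in plain Python (not Weka code). The two attributes and their values are made up for illustration, and the helper names are hypothetical. It compares the matrix PCA would decompose when the data is only centered against the one it would decompose when the data is standardized.

```python
from statistics import mean, pstdev


def center(column):
    """Subtract the column mean (what centering alone does)."""
    m = mean(column)
    return [x - m for x in column]


def standardize(column):
    """Subtract the mean and divide by the standard deviation."""
    m = mean(column)
    s = pstdev(column)  # population std dev; assumed here for simplicity
    return [(x - m) / s for x in column]


def second_moment_matrix(columns):
    """Average of pairwise products of already-transformed columns.

    On centered columns this is the covariance matrix;
    on standardized columns it is the correlation matrix.
    """
    n = len(columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
        for ci in columns
    ]


# Hypothetical toy data: two attributes on very different scales.
height_m = [1.5, 1.6, 1.7, 1.8, 1.9]
income = [20000.0, 50000.0, 30000.0, 60000.0, 40000.0]

cov = second_moment_matrix([center(height_m), center(income)])
corr = second_moment_matrix([standardize(height_m), standardize(income)])

print("covariance matrix (centered only):")
for row in cov:
    print(["%.4g" % v for v in row])

print("correlation matrix (standardized):")
for row in corr:
    print(["%.4g" % v for v in row])
```

In the centered-only case the income variance is many orders of magnitude larger than the height variance, so the first principal component would point almost entirely along income. After standardizing, both attributes have unit variance and contribute on equal footing. Note that normalizing to [0,1] beforehand puts attributes in the same range but does not necessarily give them equal variance.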
Upvotes: 7