Reputation:
My actual vector has 110 elements that I'll use to extract features from images in MATLAB; I took this one (tb) to simplify:
tb=[22.9 30.0 30.3 27.8 24.1 28.2 26.4 12.6 39.7 38.0];
normalized_V = tb/norm(tb);
I = mat2gray(tb);
For normalized_V I got:
0.2503 0.3280 0.3312 0.3039 0.2635 0.3083 0.2886 0.1377 0.4340 0.4154
For I
I got 0.3801 0.6421 0.6531 0.5609 0.4244 0.5756 0.5092 0 1.0000 0.9373
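For reference, the two MATLAB expressions above can be sketched in NumPy as follows (tb/norm(tb) divides by the Euclidean norm, while mat2gray rescales to [0, 1]; the variable names here are mine):

```python
import numpy as np

tb = np.array([22.9, 30.0, 30.3, 27.8, 24.1, 28.2, 26.4, 12.6, 39.7, 38.0])

# tb / norm(tb): divide by the Euclidean (L2) norm -> unit-length vector
unit_norm = tb / np.linalg.norm(tb)

# mat2gray(tb): rescale to [0, 1] via (x - min) / (max - min)
min_max = (tb - tb.min()) / (tb.max() - tb.min())

print(np.round(unit_norm, 4))  # matches 0.2503 ... 0.4154 above
print(np.round(min_max, 4))    # matches 0.3801 ... 0.9373 above
```

Note that the minimum maps to 0 and the maximum to 1 only in the min-max (mat2gray) version.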
Which of these two methods should I use, if either, and why? And should I collapse the feature vector to a single element after extraction for better training, or leave it as a 110-element vector?
Upvotes: 0
Views: 654
Reputation: 874
Normalization can be performed in several ways, for example:
(tb-min(tb))/max(tb)
tb/max(tb)
(similar to the tb/norm(tb) scaling you used, but dividing by the maximum instead of the Euclidean norm), or
zscore(tb)
(i.e., (tb-mean(tb))/std(tb)).
So, your final values would be:
zscore(tb)
ans =
   -0.6664    0.2613    0.3005   -0.0261   -0.5096    0.0261   -0.2091   -2.0121    1.5287    1.3066
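The same z-score can be reproduced in NumPy; note that MATLAB's zscore uses the sample standard deviation by default, which corresponds to ddof=1 in NumPy (the variable names are mine):

```python
import numpy as np

tb = np.array([22.9, 30.0, 30.3, 27.8, 24.1, 28.2, 26.4, 12.6, 39.7, 38.0])

# zscore(tb): subtract the mean, divide by the sample std (ddof=1, MATLAB default)
z = (tb - tb.mean()) / tb.std(ddof=1)

print(np.round(z, 4))  # matches the values shown above
```

After z-scoring, the result has mean 0 and sample standard deviation 1, which is often what distance-based classifiers expect.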
Edit:
In regard to your second question, it depends on the number of observations. Every classifier takes an MxN matrix of data and an Mx1 vector of labels as inputs, where M is the number of observations and N the number of features. To reduce the risk of over-fitting, it is usually recommended to keep the number of features below one tenth of the number of observations (i.e., M > 10N).
So, in your case, if you use the entire set of 110 features, you should have more than 1100 observations; otherwise you may run into over-fitting.
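As a quick sanity check of that rule of thumb (the helper name is mine, not from any library):

```python
def enough_observations(n_observations, n_features):
    """Rule of thumb from above: require M > 10*N to limit over-fitting risk."""
    return n_observations > 10 * n_features

# 110 features -> need more than 1100 observations
print(enough_observations(1100, 110))  # False: exactly 10*N is not enough
print(enough_observations(1500, 110))  # True
```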
Upvotes: 1