Reputation: 2439
Imagine I have the following feature vectors:
Training vectors:
Class 1: [ 3, 5, 4, 2, 0, 3, 2], [ 33, 50, 44, 22, 0, 33, 20]
Class 2: [ 1, 2, 3, 1, 0, 0, 4], [ 11, 22, 33, 11, 0, 0, 44]
Testing vectors:
Class 1: [ 330, 550, 440, 220, 0, 330, 200]
Class 2: [ 110, 220, 333, 111, 0, 0, 444]
I am using SVM, which learns from the training vectors and then classifies the test samples.
As you can see, the feature vectors differ greatly in scale: the training vectors contain small values, while the test vectors contain much larger ones.
My question is whether such differently scaled feature vectors are confusing for the SVM to learn from.
Of course, even when I scale the vectors the difference is still there;
for example, after applying StandardScaler() to the Class 1 feature vectors:
Training:
[ 0.19 1.53 0.86 -0.48 -1.82 0.19 -0.48]
[ 20.39 31.85 27.80 12.99 -1.82 20.39 11.64]
Test: [ 220.45 368.63 294.54 146.35 -1.82 220.45 132.88]
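The standardized numbers above can be reproduced without scikit-learn. The sketch below is an assumption about how the scaling was applied (it matches the printed values up to rounding): the mean and population standard deviation are taken from the elements of the first training vector and then used to transform the test vector, i.e. z = (x - mean) / std.

```python
# Standardization as shown in the question: z = (x - mean) / std.
# Assumption: mean and std (population, ddof=0) come from the elements
# of the *first* Class 1 training vector and are applied element-wise.
train_1 = [3, 5, 4, 2, 0, 3, 2]             # first Class 1 training vector
test_1 = [330, 550, 440, 220, 0, 330, 200]  # Class 1 test vector

mean = sum(train_1) / len(train_1)
std = (sum((x - mean) ** 2 for x in train_1) / len(train_1)) ** 0.5

scaled_test = [(x - mean) / std for x in test_1]
print([round(z, 2) for z in scaled_test])
```

The key point is visible either way: because the scaler's statistics come from the small-valued training data, the large-valued test vector stays far outside the range the SVM was trained on.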
This is a real-world problem, and I am asking because I have developed a way to pre-scale those feature vectors for my particular case.
So after I would use my pre-scaling method, the feature vectors for Class 1 would become:
Training:
[ 3. 5. 4. 2. 0. 3. 2.]
[ 2.75 4.16666667 3.66666667 1.83333333 0. 2.75 1.66666667]
Test: [ 2.84482759 4.74137931 3.79310345 1.89655172 0. 2.84482759 1.72413793]
which makes them very similar in nature.
This looks even better when StandardScaler() is applied to the pre-scaled vectors:
Training:
[ 0.6 1. 0.8 0.4 0. 0.6 0.4]
[ 0.55 0.83333333 0.73333333 0.36666667 0. 0.55 0.33333333]
Test: [ 0.56896552 0.94827586 0.75862069 0.37931034 0. 0.56896552 0.34482759]
The ultimate question is whether my pre-scaling method is going to help the SVM in any way. This is more of a theoretical question; any insight into it is appreciated.
Upvotes: 0
Views: 71
Reputation: 7543
Yes, it will affect the performance of the SVM. It seems your test vectors are just scaled versions of your training vectors. The SVM has no way of knowing that the scaling is irrelevant in your case (unless you present it with a lot of differently scaled training vectors).
A common practice for feature vectors where the scaling is irrelevant is to scale all the test and train vectors to a common length.
Upvotes: 3