Reputation: 419
I'm continuing my OCR project using MS Visual Studio 2008, OpenCV, C++ and SVM. I've generated a dataset of more than 2000 samples of machine-printed characters. When I test with a linear kernel, I always get a 96.36% accuracy rate.
How I use SVM in OpenCV is described in this thread.
Now I'm trying to use an RBF kernel and have run into these 2 problems:
(1) No matter what parameters (C and gamma) I use, all the characters are always classified as 0 (zero). If I test with MNIST, all of the digits are classified as 9.
I hope someone with experience in OpenCV & SVM can explain this to me. I know there are other good frameworks for machine learning & image processing, like ACCORD.NET, but I've already written everything in C++ and it would be troublesome to port the whole program to C# (OCR is only a part of it).
The version of OpenCV is 2.3.1.
(2) I moved this problem to another question, as suggested by etarion. If you have time, please check it out: Visual Studio reports error C2664 with train method of SVM in openCV.
Upvotes: 3
Views: 4858
Reputation: 12152
In theory, with well-chosen parameters an RBF kernel performs at least as well as a linear kernel. So here are some common sources of problems:
You may be having numerical difficulties. Have you normalized your data? Is every feature between 0 and 1, or between -1 and 1? What is the range of the feature values, and what is the range of the actual decision values?
Are you possibly overestimating the performance of the linear classifier (e.g. by testing and training on the same data)?
Could your multi-class representation be flawed somehow? Does the same performance difference hold for a two-class problem instead of a ten-class problem?
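To make the normalization point concrete, here is a minimal sketch in plain C++ (no OpenCV calls). It assumes features are raw pixel intensities in 0..255, which the question does not state. The names `Sample`, `minMaxNormalize` and `rbfKernel` are hypothetical helpers for illustration, not part of any library. With unscaled features the squared distance between two samples is so large that the RBF kernel value collapses to essentially 0 for every pair, which is one plausible way to end up with a classifier that always predicts the same class.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one feature vector per sample.
typedef std::vector<double> Sample;

// Rescale every feature (column) to [0, 1] in place, using the min and max
// observed in `samples`. A constant feature is set to 0.
// In practice the ranges should come from the training set and be reused
// unchanged for the test set.
void minMaxNormalize(std::vector<Sample>& samples) {
    if (samples.empty()) return;
    const std::size_t dims = samples[0].size();
    for (std::size_t d = 0; d < dims; ++d) {
        double lo = samples[0][d];
        double hi = samples[0][d];
        for (std::size_t i = 1; i < samples.size(); ++i) {
            lo = std::min(lo, samples[i][d]);
            hi = std::max(hi, samples[i][d]);
        }
        const double range = hi - lo;
        for (std::size_t i = 0; i < samples.size(); ++i) {
            samples[i][d] = (range > 0.0) ? (samples[i][d] - lo) / range : 0.0;
        }
    }
}

// RBF kernel: K(a, b) = exp(-gamma * ||a - b||^2).
// Assumes a and b have the same length.
double rbfKernel(const Sample& a, const Sample& b, double gamma) {
    double squaredDistance = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        squaredDistance += diff * diff;
    }
    return std::exp(-gamma * squaredDistance);
}
```

Comparing `rbfKernel` on a pair of samples before and after calling `minMaxNormalize` shows the effect: for gamma = 0.5, two unscaled samples that differ by 255 in two features give a kernel value indistinguishable from 0, while the same pair after scaling gives exp(-1).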
Upvotes: 1
Reputation: 17151
As for the first part, it's very likely that your parameters are off. There's a train_auto method for automatic parameter estimation. If the default parameter ranges don't yield good results, you can extend them by passing custom parameter grids to the method (but try the default parameters first).
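As a rough illustration of what a parameter grid means here, the sketch below enumerates candidate values on a logarithmic scale, which is the usual way to search C and gamma. It is plain C++ and does not call OpenCV. `logGrid` is a hypothetical helper, and the bounds in the usage note are illustrative values, not OpenCV's defaults.

```cpp
#include <vector>

// Hypothetical helper: returns minVal, minVal*step, minVal*step^2, ...
// for as long as the value stays strictly below maxVal.
// Returns an empty vector for arguments that would never terminate
// (non-positive minVal, or step <= 1).
std::vector<double> logGrid(double minVal, double maxVal, double step) {
    std::vector<double> values;
    if (minVal <= 0.0 || step <= 1.0) return values;
    for (double v = minVal; v < maxVal; v *= step) {
        values.push_back(v);
    }
    return values;
}
```

For example, `logGrid(1.0, 1000.0, 10.0)` yields 1, 10 and 100. A grid search then trains and cross-validates one model per (C, gamma) pair drawn from two such lists and keeps the best pair. Widening the grid simply means lowering the minimum, raising the maximum, or both.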
Upvotes: 0