How to train an SVM to classify images of the English alphabet?

Question

My objective is to detect text in an image and recognize it. I have managed to detect characters using the stroke width transform. What should I do to recognize them? My idea is to train an SVM on my dataset of letter images in different fonts by detecting feature points and extracting a feature vector from every image. (I have used SIFT feature vectors and built the dictionary using k-means clustering.)

Once I have detected a character, I will extract the SIFT feature vector for it and feed that into the SVM prediction function.

I don't know how to do the recognition with an SVM. I am confused! Please help me and correct me wherever I have gone wrong with the concept.

I followed this tutorial for the recognition part. Can this tutorial be applied to recognizing characters? http://www.codeproject.com/Articles/619039/Bag-of-Features-Descriptor-on-SIFT-Features-with-O

lightalchemist · Accepted Answer

An SVM is a supervised classifier. To use it, you need labeled training data showing the same type of objects you are trying to recognize.

Step 1 - Prepare training data

The training data consists of pairs of feature vectors and their corresponding class labels. In your case, it appears that you have extracted a SIFT-based bag-of-words (BOW) feature vector for the characters you detected. So, for your training data, you will need to find many examples of the different characters, extract this feature vector for each of them, and associate each one with a label (sometimes called a class label, and typically an integer), which you will perhaps map to a textual description (e.g., the number 0 could be mapped to the character 'a', and so on).
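As a rough sketch of what Step 1 could look like, the snippet below builds (feature vector, label) pairs in plain Python. It assumes you already have the local descriptors for each character image and a vocabulary of cluster centers from k-means. The names `bow_histogram`, `build_training_set`, `CHAR_TO_LABEL` and `LABEL_TO_CHAR` are made up for illustration and are not part of OpenCV or of the tutorial linked in the question.

```python
import math
import string

# Label convention (an assumption for this sketch): 'a' -> 0, 'b' -> 1, ...
CHAR_TO_LABEL = {ch: i for i, ch in enumerate(string.ascii_lowercase)}
LABEL_TO_CHAR = {i: ch for ch, i in CHAR_TO_LABEL.items()}


def bow_histogram(descriptors, vocabulary):
    """Quantize local descriptors against a visual vocabulary.

    Each descriptor votes for its nearest vocabulary word (cluster center);
    the vote counts are then normalized so the histogram sums to 1.
    """
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: math.dist(d, vocabulary[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def build_training_set(examples, vocabulary):
    """examples: iterable of (character, descriptors) pairs.

    Returns one BOW feature vector per example (one per row) and the
    matching list of integer class labels.
    """
    features, labels = [], []
    for ch, descriptors in examples:
        features.append(bow_histogram(descriptors, vocabulary))
        labels.append(CHAR_TO_LABEL[ch])
    return features, labels
```

With real SIFT data each descriptor would have 128 dimensions and the vocabulary would have as many entries as the number of k-means clusters you chose; the toy code does not depend on either number.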

Step 2 - Training the classifier

The SVM classifier takes as input an array/Mat of feature vectors (one per row) and their associated labels. Tune the parameters of the SVM (i.e., the regularization parameter C and, if applicable, any kernel parameters) on a separate validation set.
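To make the idea of "train, then tune C on a validation set" concrete, here is a small self-contained sketch. It is not OpenCV's SVM: it is a hand-written linear SVM trained by full-batch subgradient descent on the hinge loss, extended to several classes with a one-vs-rest scheme, so that it runs without any library. The function names, the fixed learning rate, the epoch count and the candidate values of C are all assumptions chosen for illustration. In practice you would call your library's SVM trainer inside the same loop.

```python
def train_binary_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Linear SVM for labels y_i in {-1, +1}.

    Minimizes 0.5*||w||^2 + C * sum(max(0, 1 - y_i * (w.x_i + b)))
    by full-batch subgradient descent with a fixed step size.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the regularizer 0.5*||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # margin violated, so the hinge term is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= C * yi * xj
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels, C=1.0):
    """Train one binary SVM per class; returns {label: (w, b)}."""
    return {c: train_binary_svm(X, [1 if l == c else -1 for l in labels], C=C)
            for c in sorted(set(labels))}


def score(model, c, x):
    """Decision value of class c's binary SVM for sample x."""
    w, b = model[c]
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def predict(model, x):
    """Return the label whose binary SVM gives the highest score."""
    return max(model, key=lambda c: score(model, c, x))


def accuracy(model, X, labels):
    """Fraction of samples whose predicted label matches the true one."""
    return sum(predict(model, x) == l for x, l in zip(X, labels)) / len(labels)


def tune_C(X_train, y_train, X_val, y_val, candidates=(0.01, 0.1, 1.0, 10.0)):
    """Pick the C with the best accuracy on the held-out validation set."""
    best_C, best_acc = None, -1.0
    for C in candidates:
        model = train_one_vs_rest(X_train, y_train, C=C)
        acc = accuracy(model, X_val, y_val)
        if acc > best_acc:  # ties keep the smaller (earlier) C
            best_C, best_acc = C, acc
    return best_C, best_acc
```

The important design point is that the validation samples are never used for training; they only decide which C to keep.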

Step 3 - Predict for unseen data

At test time, given a sample that was not seen by the SVM during training, you compute a feature vector (your SIFT-based BOW vector) for the sample. Pass this feature vector to the SVM's predict function, and it will return an integer. Recall that when preparing your training data you associated an integer with each character class; the returned integer is the label predicted by the SVM for this sample. You can then map this label to a character. For example, if you have associated 0 with 'a', 1 with 'b', etc., you can use a vector/hashmap to map the integer to its textual counterpart.
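The mapping from predicted integers back to characters can be sketched as follows. Here `predict_fn` is a placeholder for whatever predict call your trained SVM exposes, and `toy_predict` is a fake classifier included only so the example runs on its own. Both names, and the 'a' -> 0 labelling convention, are assumptions for illustration.

```python
import string

# Same integer <-> character convention assumed when labelling the training data.
LABEL_TO_CHAR = dict(enumerate(string.ascii_lowercase))


def recognize(predict_fn, feature_vectors):
    """Classify each detected character and join the results into a string.

    predict_fn stands in for the trained SVM's predict call: it takes one
    feature vector and returns a class label.
    """
    chars = []
    for fv in feature_vectors:
        label = int(predict_fn(fv))                  # some libraries return the label as a float
        chars.append(LABEL_TO_CHAR.get(label, "?"))  # '?' for labels outside the map
    return "".join(chars)


def toy_predict(fv):
    """Fake classifier: the index of the largest histogram bin is the label."""
    return max(range(len(fv)), key=fv.__getitem__)


print(recognize(toy_predict, [[0.1, 0.2, 0.7], [0.8, 0.1, 0.1], [0.2, 0.6, 0.2]]))  # prints "cab"
```

The feature vectors passed in at test time must be computed exactly as the training ones were (same descriptor, same vocabulary, same normalization), otherwise the predicted labels are meaningless.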

Additional Notes

You can check out OpenCV's SVM tutorial for details.

NOTE: Often, for beginners, the hardest part (after getting the data) is tuning the classifier. My advice is to first try a simple classifier that has few parameters to tune. A decent choice is the linear SVM, which only requires you to adjust one parameter, C. Once you manage to get somewhat decent results (which gives some assurance that the rest of your code is working), you can move on to more "sophisticated" classifiers.

Lastly, the training data and the feature vectors you extract are very important. The training data must be "similar" to the test data you are trying to predict. For example, if you are predicting characters found on road signs, which come with different fonts, lighting conditions, and pose differences, then training data consisting of characters taken from, say, a newspaper/book archive may not give you good results. This is an issue of domain adaptation in machine learning.
