Reputation: 35
I have found an excellent, comprehensive post/answer related to this topic here, using OpenCV and SVM with images. However, there are some points in that answer I would like to clarify (since I do not have enough reputation to write a comment).
What I've been doing: I am using OpenCV's SVM for training. The features in the training matrix are obtained by computing the normalized mean R, G and B values for each image. Thus, each row of the training matrix (one row per image) has 4 columns: the label (1 or 0), the normalized mean of the R channel, the G channel, and the B channel.
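For reference, here is a rough, simplified sketch of how I compute those three features for one image (the image path is just an example):

// Rough sketch (simplified) of how I get the three features for one image:
// the mean of each channel, normalized so that the three values sum to 1.
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

using namespace cv;

int main()
{
    Mat img = imread("example.jpg", 1);   // example path, loaded as BGR
    Scalar m = mean(img);                 // m[0]=mean B, m[1]=mean G, m[2]=mean R
    double total = m[0] + m[1] + m[2];
    float r = (float)(m[2] / total);      // normalized mean R
    float g = (float)(m[1] / total);      // normalized mean G
    float b = (float)(m[0] / total);      // normalized mean B
    // r, g, b become one row (plus the label) of my training file.
    return 0;
}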
By the way, my original training data is a text file, which I convert to float[][] and eventually into a Mat object to feed into OpenCV's SVM. Here's what the file looks like:
1 0.267053 0.321014 0.411933
1 0.262904 0.314294 0.422802
.
.
0 0.29101 0.337208 0.371782
0 0.261792 0.314494 0.423714
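And here is roughly how I turn that file into Mat objects for the SVM (a simplified sketch; my actual code goes through a float[][] first):

// Simplified sketch: parse each line into a label and three features,
// then build one Mat of features (one row per image) and one Mat of labels.
#include <opencv2/core/core.hpp>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

using namespace cv;
using namespace std;

void loadTrainingFile(const string& filename, Mat& features, Mat& labels)
{
    ifstream file(filename.c_str());
    vector<float> featBuf, labelBuf;
    string line;
    while (getline(file, line))
    {
        stringstream ss(line);
        float label, r, g, b;
        if (ss >> label >> r >> g >> b)
        {
            labelBuf.push_back(label);
            featBuf.push_back(r);
            featBuf.push_back(g);
            featBuf.push_back(b);
        }
    }
    int n = (int)labelBuf.size();
    features = Mat(featBuf, true).reshape(1, n);   // n x 3, CV_32FC1
    labels   = Mat(labelBuf, true).reshape(1, n);  // n x 1, CV_32FC1
}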
Apparently, this contradicts the statement in the linked answer that the size of each row must be equal to the size of the image. Is that a strict requirement or some kind of rule? I just cannot make sense of why it should be (if it is).
My question is: in constructing the training matrix, does the length of each row have to correspond to the area (number of pixels) of the image? In the training matrix I've made, the length of each row is only 4. Is this wrong?
In addition, are only 3 features (3 columns) enough for training/classification with an SVM? Please guide me to the right path; I'm doubting whether I should continue with this or whether there's a better approach to the problem.
I hope I'll get to understand more of the concepts behind the steps of SVM. Articles or related samples would be appreciated!
Upvotes: 0
Views: 869
Reputation: 2496
The size of each row does not have to equal the image size. It depends on what you use as features. Using mean values alone is not enough for image classification. Just think about how you classify objects when you look at a picture: you don't calculate mean values, you look at contours, connected areas, and sometimes individual pixel values, all processed in the background by the brain.
So, to get more features, here is a suggestion: calculate each column's mean value as part of the feature extraction. This will probably be more useful; a small sketch follows below.
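For example, the per-column means of a grayscale image can be computed with cv::reduce (a minimal sketch; the image path and scaling are just an example):

// Minimal sketch: mean of every column of a grayscale image, giving one
// feature per column (e.g. 200 features for a 200x200 image).
Mat img = imread("example.png", 0);        // example path, grayscale
Mat imgF;
img.convertTo(imgF, CV_32FC1, 1/255.);     // scale pixel values to [0,1]
Mat columnMeans;                           // becomes a 1 x img.cols row vector
reduce(imgF, columnMeans, 0, CV_REDUCE_AVG, CV_32FC1);
// columnMeans can be used directly as one row of the SVM training matrix.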
For another feature extraction method you can use PCA. Normally you could feed all pixel values of an image as one row for training the SVM, but even for a 200x200 image that is 40,000 features, which is a lot. You need to reduce this feature dimension without losing much information, that is, while retaining an acceptable percentage of the variance. PCA is used exactly for this: reducing the dimensionality of the feature space while keeping the variance at an acceptable level.
I will try to show you how you can reduce the feature space with PCA. First you need to acquire the images, then roll each image into one row of a Mat variable:
Reading csv:
void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';')
{
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file)
    {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(1, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line))
    {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if (!path.empty() && !classlabel.empty())
        {
            Mat im = imread(path, 0);   // load as grayscale
            images.push_back(im);
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}
Rolling images row by row :
Mat rollVectortoMat(const vector<Mat> &data) // data is a vector of Mat images
{
    // One row per image, one column per pixel, 32-bit float.
    Mat dst(static_cast<int>(data.size()), data[0].rows * data[0].cols, CV_32FC1);
    for (unsigned int i = 0; i < data.size(); i++)
    {
        Mat image_row = data[i].clone().reshape(1, 1);   // flatten image to a single row
        Mat row_i = dst.row(i);
        image_row.convertTo(row_i, CV_32FC1, 1/255.);    // scale pixel values to [0,1]
    }
    return dst;
}
Main:
int main()
{
    PCA pca;

    vector<Mat> images_train;
    vector<Mat> images_test;
    vector<int> labels_train;
    vector<int> labels_test;

    read_csv("train1k.txt", images_train, labels_train);
    read_csv("test1k.txt", images_test, labels_test);

    Mat rawTrainData = rollVectortoMat(images_train);
    Mat rawTestData  = rollVectortoMat(images_test);

    Mat trainLabels = getLabels(labels_train);
    Mat testLabels  = getLabels(labels_test);

    int pca_size = 500;

    Mat trainData(rawTrainData.rows, pca_size, rawTrainData.type());
    Mat testData(rawTestData.rows, pca_size, rawTestData.type());

    // Compute the PCA basis from the training data (one sample per row).
    pca(rawTrainData, Mat(), CV_PCA_DATA_AS_ROW, pca_size);

    // Project every sample onto the first pca_size principal components.
    for (int i = 0; i < rawTrainData.rows; i++)
        pca.project(rawTrainData.row(i), trainData.row(i));

    for (int i = 0; i < rawTestData.rows; i++)
        pca.project(rawTestData.row(i), testData.row(i));
}
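getLabels is not shown above; it simply converts the label vector into a one-column float Mat, one row per sample. A minimal version could look like this:

Mat getLabels(const vector<int> &labels)
{
    Mat dst((int)labels.size(), 1, CV_32FC1);
    for (size_t i = 0; i < labels.size(); i++)
        dst.at<float>((int)i, 0) = (float)labels[i];
    return dst;
}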
To summarize: you read a csv file with lines like image_path;label, then roll the images into a Mat variable row by row, and apply PCA to reduce each sample to 500 features. I used this PCA reduction to bring 200*200 images (40,000 features) down to 500 features, then applied an MLP to classify them. The trainData and testData variables can be used with an SVM too; a small training sketch follows after the link. You can also check how to train an MLP with them in my SO post:
OpenCV Neural Network Sigmoid Output
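If you prefer SVM over MLP, training on the projected data looks roughly like this (a sketch using the OpenCV 2.x CvSVM API; the parameters are only a starting point and should be tuned for your data):

// Rough sketch: train an SVM on the PCA-projected training data.
CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;
params.C           = 1;      // tune for your data
params.gamma       = 0.1;    // tune for your data
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);

CvSVM svm;
svm.train(trainData, trainLabels, Mat(), Mat(), params);

// Predict the class of the first test sample:
float predicted = svm.predict(testData.row(0));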
Upvotes: 3
Reputation: 180145
If each pixel of the image is a feature you want to train the SVM with, then each row should list all features and therefore all pixels. In your case, you only have 3 features per image (mean R, G, B), so there shouldn't be any problem.
Of course, you can perfectly well train an SVM with 3 dimensions. But ignoring the SVM itself: is average color even a sensible metric for your images?
Upvotes: 2