colla

Reputation: 43

opencv neural network, incorrect predict

I'm trying to create a neural network in C++ with OpenCV. The aim is recognition of road signs. I created the network as shown below, but it predicts badly and returns strange results:

[screenshot of the network's output values]

Sample images from the training set look like this: [examples]

Can someone help?

  void trainNN() {
    char* templates_directory[] = {
        "speed50ver1\\", 
        "speed60ver1\\", 
        "speed70ver1\\",
        "speed80ver1\\"
    };

    int const numFilesChars[]={ 213, 100, 385, 163};

    char const strCharacters[] = { '5', '6', '7', '8' };

    Mat trainingData; 
    Mat trainingLabels(0, 0, CV_32S);

    int const numCharacters = 4;  

    // load images from directory
    for (int i = 0; i != numCharacters; ++i) {
        int numFiles = numFilesChars[i];

        DIR *dir;
        struct dirent *ent;

        char* s1 = templates_directory[i];

        if ((dir = opendir (s1)) != NULL) {
            Size size(80, 80);

            while ((ent = readdir (dir)) != NULL) {
                string s = s1;
                s.append(ent->d_name);

                if(s.substr(s.find_last_of(".") + 1) == "jpg") {
                    Mat img = imread(s,0);
                    Mat img_mat;
                    resize(img, img_mat, size);
                    Mat new_img = img_mat.reshape(1, 1);
                    trainingData.push_back(new_img);
                    trainingLabels.push_back(i);

                }

            }
            closedir (dir);
        } else {
            /* could not open directory */
            perror ("");
        }
    }

    trainingData.convertTo(trainingData, CV_32FC1);

    Mat trainClasses(trainingData.rows, numCharacters, CV_32FC1);
    for( int i = 0; i !=  trainClasses.rows; ++i){
        int const labels = *trainingLabels.ptr<int>(i);
        auto train_ptr = trainClasses.ptr<float>(i);
        for(int k = 0; k != trainClasses.cols; ++k){
            *train_ptr = k != labels ? 0 : 1;
            ++train_ptr;
        }
    }

    int layers_d[] = { trainingData.cols, 10,  numCharacters};
    Mat layers(1, 3, CV_32SC1, layers_d);
    ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);

    CvANN_MLP_TrainParams params = CvANN_MLP_TrainParams(
        // terminate the training after either 1000
        // iterations or a very small change in the
        // network weights below the specified value
        cvTermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 1000, 0.000001),

        // use backpropagation for training
        CvANN_MLP_TrainParams::BACKPROP,

        // coefficients for backpropagation training
        // (refer to manual)
        0.1,
        0.1);

    int iterations = ann.train(trainingData, trainClasses, cv::Mat(), cv::Mat(), params);

    CvFileStorage* storage = cvOpenFileStorage( "neural_network_2.xml", 0, CV_STORAGE_WRITE );
    ann.write(storage,"digit_recognition");
    cvReleaseFileStorage(&storage);

}  


void analysis(char* file, bool a) {
    //trainNN(a);
    read_nn();


    // load image
    Mat img = imread(file,  0);

    Size my_size(80,80);
    resize(img, img, my_size);

    Mat r_img = img.reshape(1,1);

    r_img.convertTo(r_img, CV_32FC1);   

    Mat classOut(1,4,CV_32FC1); 

    ann.predict(r_img, classOut);

    double min1, max1;
    cv::Point min_loc, max_loc;
    minMaxLoc(classOut, &min1, &max1, &min_loc, &max_loc);
    int x = max_loc.x;


    //create windows
    namedWindow("Original Image", CV_WINDOW_AUTOSIZE);
    imshow("Original Image", img);

    waitKey(0); //wait for key press

    img.release();
    r_img.release();

    destroyAllWindows(); //destroy all open windows
}

Strange results: for this speed80 input the answer is 3 (I have only 4 classes: speed limits 50, 60, 70, and 80), which is correct for the speed limit 80 sign. [results]

But for the rest of the inputs the results are incorrect. They are the same for the 50, 60, and 70 signs: max1 = min1 = 1.02631... (as in the first picture). It's strange.

Upvotes: 4

Views: 1133

Answers (2)

Aenimated1

Reputation: 1624

I have adapted your code to train a classifier on 4 hand positions (since that's the image data I have). I kept your logic as similar as possible, only changing what was absolutely necessary to make it run on my Windows machine on my images. Long story short, there is nothing fundamentally wrong with your code - I don't see the failure mode you described.

One thing you left out was the code for read_nn(). I assume that just does something like the following: ann.load("neural_network_2.xml");

Anyway, my suspicion is that either your neural network is not converging at all or it's badly overfitting. Perhaps there's not enough variation in the training data. Are you running analysis() on separate test data that the ANN wasn't trained on? If so, is the ANN able to predict training data properly at least?

EDIT: OK, I just downloaded your image data and tried it out and saw the same behavior. After some analysis, it looks like your ANN is not converging. The training operation exits after only about 250 iterations, even if you specify only CV_TERMCRIT_ITER for the cvTermCriteria. After increasing your hidden layer size from 10 to 20, I saw a marked improvement, with successful classification on the training data for 212, 72, 94, and 143 of the images in the 50, 60, 70, and 80 classes respectively. That's not very good, but it demonstrates that you're on the right track.

Basically, the network architecture is not expressive enough to adequately model the problem you're trying to solve, so the network weights never converge and it abandons the backprop early. For one class, you may see some success, but I believe that's largely a function of the lack of shuffling of training data. If it stops after having just trained on a couple hundred very similar images, it may manage to classify those correctly.

In short, I would recommend doing the following:

  1. Build a way to test the results - e.g.: create a function to run prediction on all training data, and ideally set aside some images as a validation set in order to also confirm that the model is not overfitting the training data.
  2. Shuffle the training data prior to training. Otherwise, backprop will not converge as easily.
  3. Experiment with different architectures such as more than one hidden layer with varying sizes.

Really, this is a problem that would benefit dramatically from using a Convolutional Neural Net, but OpenCV's machine learning facilities are pretty limited. Ultimately, if you're serious about creating ANNs, you might want to investigate some more robust tools. I personally use Tensorflow, but I've heard good things about Theano as well.

Upvotes: 3

Carafini

Reputation: 111

I've only implemented NNs with OpenCV for binary classification, but I think this might also apply to a task where you need to classify more than two distinct classes:

"If you are using the default cvANN_MLP::SIGMOID_SYM activation function then the output should be in the range [-1,1], instead of [0,1], for optimal results."

So, where you do:

*train_ptr = k != labels ? 0 : 1;

You might want to try:

*train_ptr = k != labels ? -1 : 1;

Disregard if I'm way off track here.

Upvotes: 0
