colla

Reputation: 43

opencv neural network, incorrect predict

I'm trying to create a neural network in C++ with OpenCV. The aim is recognition of road signs. I created the network as shown below, but it predicts badly and returns strange results:

[screenshot of the network's output values]

Sample images from the training set look like this: [examples]

Can someone help?

  void trainNN() {
    char* templates_directory[] = {
        "speed50ver1\\", 
        "speed60ver1\\", 
        "speed70ver1\\",
        "speed80ver1\\"
    };

    int const numFilesChars[]={ 213, 100, 385, 163};

    char const strCharacters[] = { '5', '6', '7', '8' };

    Mat trainingData; 
    Mat trainingLabels(0, 0, CV_32S);

    int const numCharacters = 4;  

    // load images from directory
    for (int i = 0; i != numCharacters; ++i) {
        int numFiles = numFilesChars[i];

        DIR *dir;
        struct dirent *ent;

        char* s1 = templates_directory[i];

        if ((dir = opendir (s1)) != NULL) {
            Size size(80, 80);

            while ((ent = readdir (dir)) != NULL) {
                string s = s1;
                s.append(ent->d_name);

                if(s.substr(s.find_last_of(".") + 1) == "jpg") {
                    Mat img = imread(s,0);
                    Mat img_mat;
                    resize(img, img_mat, size);
                    Mat new_img = img_mat.reshape(1, 1);
                    trainingData.push_back(new_img);
                    trainingLabels.push_back(i);

                }

            }
            closedir (dir);
        } else {
            /* could not open directory */
            perror ("");
        }
    }

    trainingData.convertTo(trainingData, CV_32FC1);

    Mat trainClasses(trainingData.rows, numCharacters, CV_32FC1);
    for( int i = 0; i !=  trainClasses.rows; ++i){
        int const labels = *trainingLabels.ptr<int>(i);
        auto train_ptr = trainClasses.ptr<float>(i);
        for(int k = 0; k != trainClasses.cols; ++k){
            *train_ptr = k != labels ? 0 : 1;
            ++train_ptr;
        }
    }

    int layers_d[] = { trainingData.cols, 10,  numCharacters};
    Mat layers(1, 3, CV_32SC1, layers_d);
    ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);

    CvANN_MLP_TrainParams params = CvANN_MLP_TrainParams(
        // terminate the training after either 1000
        // iterations or a very small change in the
        // network weights below the specified value
        cvTermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 1000, 0.000001),

        // use backpropagation for training
        CvANN_MLP_TrainParams::BACKPROP,

        // coefficients for backpropagation training
        // (refer to manual)
        0.1,
        0.1);

    int iterations = ann.train(trainingData, trainClasses, cv::Mat(), cv::Mat(), params);

    CvFileStorage* storage = cvOpenFileStorage( "neural_network_2.xml", 0, CV_STORAGE_WRITE );
    ann.write(storage,"digit_recognition");
    cvReleaseFileStorage(&storage);

}  


void analysis(char* file, bool a) {
    //trainNN(a);
    read_nn();


    // load image
    Mat img = imread(file,  0);

    Size my_size(80,80);
    resize(img, img, my_size);

    Mat r_img = img.reshape(1,1);

    r_img.convertTo(r_img, CV_32FC1);   

    Mat classOut(1,4,CV_32FC1); 

    ann.predict(r_img, classOut);

    double min1, max1;
    cv::Point min_loc, max_loc;
    minMaxLoc(classOut, &min1, &max1, &min_loc, &max_loc);
    int x = max_loc.x;


    //create windows
    namedWindow("Original Image", CV_WINDOW_AUTOSIZE);
    imshow("Original Image", img);

    waitKey(0); //wait for key press

    img.release();
    r_img.release();

    destroyAllWindows(); //destroy all open windows
}

Strange results: for this speed80 input the answer is 3 (I have only 4 classes: speed limits 50, 60, 70, and 80), which is correct for the speed limit 80 sign. [results]

But for the rest of the inputs the results are incorrect. They are the same for the 50, 60, and 70 signs: max1 = min1 = 1.02631... (as in the first picture). It's strange.

Upvotes: 4

Views: 1133

Answers (2)

Aenimated1

Reputation: 1624

I have adapted your code to train a classifier on 4 hand positions (since that's the image data I have). I kept your logic as similar as possible, only changing what was absolutely necessary to make it run on my Windows machine on my images. Long story short, there is nothing fundamentally wrong with your code - I don't see the failure mode you described.

One thing you left out was the code for read_nn(). I assume that just does something like the following: ann.load("neural_network_2.xml");

Anyway, my suspicion is that either your neural network is not converging at all or it's badly overfitting. Perhaps there's not enough variation in the training data. Are you running analysis() on separate test data that the ANN wasn't trained on? If so, is the ANN able to predict training data properly at least?

EDIT: OK, I just downloaded your image data and tried it out and saw the same behavior. After some analysis, it looks like your ANN is not converging. The training operation exits after only about 250 iterations, even if you specify only CV_TERMCRIT_ITER for the cvTermCriteria. After increasing your hidden layer size from 10 to 20, I saw a marked improvement, with successful classification on the training data for 212, 72, 94, and 143 of the images in the 50, 60, 70, and 80 classes respectively. That's not very good, but it demonstrates that you're on the right track.

Basically, the network architecture is not expressive enough to adequately model the problem you're trying to solve, so the network weights never converge and it abandons the backprop early. For one class, you may see some success, but I believe that's largely a function of the lack of shuffling of training data. If it stops after having just trained on a couple hundred very similar images, it may manage to classify those correctly.

In short, I would recommend doing the following:

  1. Build a way to test the results - e.g.: create a function to run prediction on all training data, and ideally set aside some images as a validation set in order to also confirm that the model is not overfitting the training data.
  2. Shuffle the training data prior to training. Otherwise, backprop will not converge as easily.
  3. Experiment with different architectures such as more than one hidden layer with varying sizes.

Really, this is a problem that would benefit dramatically from using a Convolutional Neural Net, but OpenCV's machine learning facilities are pretty limited. Ultimately, if you're serious about creating ANNs, you might want to investigate some more robust tools. I personally use Tensorflow, but I've heard good things about Theano as well.

Upvotes: 3

Carafini

Reputation: 111

I've only implemented NNs with OpenCV for binary classification, but I think this might also apply to a task where you need to classify more than two distinct classes:

"If you are using the default cvANN_MLP::SIGMOID_SYM activation function then the output should be in the range [-1,1], instead of [0,1], for optimal results."

So, where you do:

*train_ptr = k != labels ? 0 : 1;

You might want to try:

*train_ptr = k != labels ? -1 : 1;

Disregard if I'm way off track here.

Upvotes: 0
