Reputation: 1711
I modified the MNIST example, and when I train it on my 3 image classes it reports an accuracy of 91%. However, when I adapt the C++ example with a deploy prototxt file and a labels file and test it on some images, it always predicts the second class (1 circle) with a probability of 1.0, no matter what image I give it - even for images that were used in the training set. I've tried a dozen images and it consistently predicts that one class.
To clarify: in the C++ example I modified, I did scale the image to be predicted the same way the images were scaled during training:
img.convertTo(img, CV_32FC1);  // convert to single-channel float, as during training
img = img * 0.00390625;        // scale by 1/256, the same factor as in the training data layer
If that was the right thing to do, it makes me wonder whether I've done something wrong with the output layers that compute the probabilities in my deploy_arch.prototxt file.
Upvotes: 4
Views: 2330
Reputation: 1677
I think you have forgotten to scale the input image at classification time, as can be seen in line 11 of the train_test.prototxt file. You should probably multiply by that same factor somewhere in your C++ code, or alternatively use a Caffe layer to do the scaling inside the network (look into the ELTWISE or POWER layers for this).
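For illustration, a Power layer that applies that same 1/256 factor inside the network might look roughly like the following. This is only a sketch, not your actual prototxt, and it assumes the input blob is named data as in the MNIST example:

layer {
  name: "scale_input"
  type: "Power"
  bottom: "data"
  top: "data_scaled"
  power_param {
    power: 1.0
    scale: 0.00390625   # the same 1/256 factor used by the training data layer
    shift: 0.0
  }
}

The Power layer computes (shift + scale * x)^power, so with power 1 and shift 0 it is just a multiplication by the scale factor.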
EDIT:
After a conversation in the comments, it turned out that the image mean was being subtracted in the classification.cpp file even though no mean was subtracted in the original training/testing pipeline, so the preprocessing at test time did not match the preprocessing used during training.
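To make that concrete, here is a rough sketch of the preprocessing with the scaling applied and the mean subtraction left out. prepareInput is a hypothetical helper (not part of the Caffe example) and it assumes a single-channel, MNIST-style input:

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Hypothetical helper mirroring what classification.cpp does before filling
// the input blob, minus the mean subtraction.
cv::Mat prepareInput(const cv::Mat& img, int width, int height) {
  cv::Mat sample;
  if (img.channels() == 3)
    cv::cvtColor(img, sample, cv::COLOR_BGR2GRAY);  // match the 1-channel training data
  else
    sample = img;

  cv::resize(sample, sample, cv::Size(width, height));
  sample.convertTo(sample, CV_32FC1);

  // Apply the same 1/256 scaling used by the training data layer ...
  sample *= 0.00390625f;

  // ... and deliberately skip the mean subtraction done in classification.cpp
  // (cv::subtract(sample, mean_, sample)), because no mean was subtracted
  // during training. Subtracting a mean only at test time shifts every input
  // into a range the net never saw and can produce the same saturated
  // prediction for every image.
  return sample;
}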
Upvotes: 3
Reputation: 11
Are your training classes balanced? With an unbalanced training set, the network can get stuck predicting the single majority class. To track down the issue, I suggest logging the predictions on training images during training and comparing them with the predictions from the forward pass in your C++ example on the same training images from a different class.
Upvotes: 1