Reputation: 12142
I am trying to detect specific objects in images using a Haar cascade in OpenCV.
Let's say I am interested in detecting stop signs in landscape images. When defining positive image samples for my training set, which would be the best kind of image: (a) full images with my object, (b) a medium crop or (c) a tight crop?
Similarly, what's best for negative images? Does this influence overfitting? I would also appreciate any other general tips from those with experience. Thanks.
Image ref: http://kaitou-ace.deviantart.com/art/Stop-sign-on-a-country-road-Michigan-271990933
Upvotes: 2
Views: 808
Reputation: 762
The best choice is (c), because (a) and (b) contain too many features around the border of the sign that are of no interest to you.
Not only are they not useful, they can seriously compromise the performance of the algorithm.
In case (c), the classifier only has to decide whether the current window contains the features you are looking for.
But what about (a) and (b)?
In those cases the algorithm has to detect the interesting features in just one corner of the window (and unfortunately that corner could be anywhere), while at the same time staying consistent with all the endless variations that could occur around that corner.
You would need a huge number of samples, and even if you finally managed to get an acceptable hit rate, separating positives from negatives would be so difficult that the running time would be very high.
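To make the tight-crop idea concrete, here is a minimal sketch of turning a hand-labelled bounding box into a tight crop with a small margin, and of formatting it as a line in the annotation ("info") file that `opencv_createsamples` reads. The helper names, the 5% default margin, and the example paths are assumptions chosen for illustration, not part of OpenCV.

```python
def tight_crop_box(bbox, image_size, margin=0.05):
    """Expand a tight (x, y, w, h) box by a small relative margin.

    The result is clamped to the image so the crop never leaves the frame.
    The margin value is an assumption; 0.0 gives the exact labelled box.
    """
    x, y, w, h = bbox
    img_w, img_h = image_size
    pad_x = int(round(w * margin))
    pad_y = int(round(h * margin))
    x0 = max(0, x - pad_x)
    y0 = max(0, y - pad_y)
    x1 = min(img_w, x + w + pad_x)
    y1 = min(img_h, y + h + pad_y)
    return x0, y0, x1 - x0, y1 - y0


def info_line(image_path, boxes):
    """Format one annotation line: '<path> <count> x y w h [x y w h ...]'."""
    parts = [image_path, str(len(boxes))]
    for box in boxes:
        parts.extend(str(v) for v in box)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical 640x480 image with one stop sign labelled at (100, 50, 40, 40).
    box = tight_crop_box((100, 50, 40, 40), (640, 480), margin=0.1)
    print(info_line("img/stop1.jpg", [box]))
```

Keeping the margin small keeps the sign filling almost the whole training window, which is the point of option (c).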
As for collecting negatives, ideally you should pick images that resemble the ones your final detector will actually run against.
For example, if you think indoor images are not relevant, just discard them. If you think a certain kind of landscape is where your detector will run, keep plenty of those.
But this is only theoretical; I suspect the improvement would be negligible. Just collect as many images as you can: it is the number of different images that really matters.
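As a rough sketch of assembling the negative set, the snippet below builds the plain-text background list (one image path per line) that `opencv_traincascade` takes via its `-bg` option. It drops non-image files, duplicates, and any image known to contain the object. The function names, the extension set, and the example paths are assumptions for illustration.

```python
from pathlib import Path

# Assumed set of image extensions to keep; adjust to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def background_lines(paths, positives=(), extensions=IMAGE_EXTENSIONS):
    """Return sorted, de-duplicated negative image paths.

    Non-image files are skipped, and so is any path listed in `positives`
    (images known to contain the object must not be used as negatives).
    """
    positives = set(positives)
    keep = {
        p for p in paths
        if Path(p).suffix.lower() in extensions and p not in positives
    }
    return sorted(keep)


def write_background_file(lines, out_path):
    """Write one image path per line, the layout expected for a background file."""
    Path(out_path).write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Hypothetical file names.
    candidates = ["neg/road1.jpg", "neg/field.png", "neg/notes.txt", "neg/road1.jpg"]
    for line in background_lines(candidates, positives=["neg/field.png"]):
        print(line)
```

Because only the count of distinct images matters here, de-duplicating the list gives a more honest picture of how many negatives you really have.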
Upvotes: 2
Reputation: 6666
Your positive samples should contain only the features you want to detect, so image (c) would be the right choice for positive samples.
As for negative samples, you want EVERYTHING else. That is obviously unrealistic, so if you are using your detector in a specific environment, training on that environment as negatives is the right way to go, i.e. lots of pictures of landscapes etc. (ones that don't contain stop signs).
Upvotes: 3