Reputation: 11439
I'm trying to implement a traffic sign recognizer with OpenCV and the SURF method. My problem is that I get inconsistent results (sometimes really accurate, sometimes obviously wrong) and I can't understand why. Here is how I implemented the comparison:
The contour detection works perfectly well: using a Gaussian blur and Canny edge detection, I manage to find a contour similar to this one:
Then I extract the image corresponding to this contour and compare it to traffic sign template images such as these:
cvExtractSURF returns 189 descriptors for the contour image. Then I use the naiveNearestNeighbor method to find similarities between my contour image and each template image.
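For reference, a minimal sketch of this extract-and-count step in C, modeled on OpenCV's find_obj sample (the count_good_points wrapper, the Hessian threshold of 500 and the assumption of 8-bit grayscale inputs are illustrative, not necessarily the exact code used; naiveNearestNeighbor follows the sample's logic):

    #include <opencv/cv.h>
    #include <opencv/highgui.h>

    /* Squared Euclidean distance between two SURF descriptors, with early
     * exit once the running total exceeds 'best' (as in find_obj.c). */
    static double compareSURFDescriptors( const float* d1, const float* d2,
                                          double best, int length )
    {
        double total_cost = 0;
        int i;
        for( i = 0; i < length; i++ )
        {
            double t = d1[i] - d2[i];
            total_cost += t * t;
            if( total_cost > best )
                break;
        }
        return total_cost;
    }

    /* Nearest-neighbor search from find_obj.c: accept the match only if the
     * nearest squared distance is below 0.6 times the second nearest. */
    static int naiveNearestNeighbor( const float* vec, int laplacian,
                                     const CvSeq* model_keypoints,
                                     const CvSeq* model_descriptors )
    {
        int length = (int)(model_descriptors->elem_size / sizeof(float));
        int i, neighbor = -1;
        double d, dist1 = 1e6, dist2 = 1e6;
        CvSeqReader reader, kreader;

        cvStartReadSeq( model_keypoints, &kreader, 0 );
        cvStartReadSeq( model_descriptors, &reader, 0 );
        for( i = 0; i < model_descriptors->total; i++ )
        {
            const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
            const float* mvec = (const float*)reader.ptr;
            CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
            CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
            if( laplacian != kp->laplacian )   /* sign of Laplacian must agree */
                continue;
            d = compareSURFDescriptors( vec, mvec, dist2, length );
            if( d < dist1 ) { dist2 = dist1; dist1 = d; neighbor = i; }
            else if( d < dist2 ) dist2 = d;
        }
        return ( dist1 < 0.6 * dist2 ) ? neighbor : -1;
    }

    /* Count how many descriptors of the contour image find a match in the
     * template image. Both images are assumed 8-bit grayscale. */
    int count_good_points( IplImage* contour_img, IplImage* template_img )
    {
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSURFParams params = cvSURFParams( 500, 1 );  /* hessian thresh, extended */
        CvSeq *kp1 = 0, *desc1 = 0, *kp2 = 0, *desc2 = 0;
        CvSeqReader reader, kreader;
        int i, good = 0;

        cvExtractSURF( contour_img, 0, &kp1, &desc1, storage, params, 0 );
        cvExtractSURF( template_img, 0, &kp2, &desc2, storage, params, 0 );

        cvStartReadSeq( kp1, &kreader, 0 );
        cvStartReadSeq( desc1, &reader, 0 );
        for( i = 0; i < desc1->total; i++ )
        {
            const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
            const float* descriptor = (const float*)reader.ptr;
            CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
            CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
            if( naiveNearestNeighbor( descriptor, kp->laplacian, kp2, desc2 ) >= 0 )
                good++;
        }
        cvReleaseMemStorage( &storage );
        return good;   /* the similarity ratio is good / desc1->total */
    }

The ratio mentioned below is then good divided by desc1->total.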
Here are my results:
6/189 for the first template (which is the one I'm expecting to find)
92/189 for the second template (which is obviously very different from the contour image in every way)
I really don't understand these results…
Here is the list of the steps I perform:
To evaluate the similarity between the two images I use the ratio:
number of good points / total number of descriptors
P.S.: For reference, I followed this tutorial: http://www.emgu.com/wiki/index.php/Traffic_Sign_Detection_in_CSharp
I also used the find_obj sample of OpenCV and adapted it to C.
Upvotes: 6
Views: 2624
Reputation: 21602
To evaluate the similarity between the two images I use the ratio: number of good points / total number of descriptors
I think that's a bad metric. You need a metric based on the descriptor vectors, and you must also use spatial information between the points.
This is because SIFT-like features match only the "same points", not merely similar points. Maybe you can tweak this by changing the matching criterion: in OpenCV, the criterion is to take the nearest point (by descriptor distance) and accept it only if no other descriptor is within a 0.6 distance ratio of it.
The descriptor matching consists of two steps. The first step follows the simple but powerful matching algorithm of David Lowe. More precisely, to see whether a descriptor A in the left image matches some descriptor in the right image, we first compute the Euclidean distance d(A, A') between the descriptor A in the left image and all the descriptors A' in the right image. If the nearest distance, say d(A, A1'), is smaller than k times the second-nearest distance, say d(A, A2'), then A and A1' are considered matched. We set k = 0.6.
Maybe you can change k, but I think a larger value gives more false positives.
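To make the criterion concrete, here is a minimal sketch of that ratio test in C on raw descriptor arrays (the dist2 helper and the row-major model layout are assumptions for illustration; this is not OpenCV API):

    #include <float.h>

    /* Squared Euclidean distance between two descriptor vectors. */
    static double dist2( const float* a, const float* b, int len )
    {
        double s = 0.0;
        int i;
        for( i = 0; i < len; i++ )
        {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    /* Lowe's ratio test with an adjustable k: match 'query' against the 'n'
     * descriptors stored row by row in 'model', and accept the nearest one
     * only if d(nearest) < k * d(second nearest). Returns the matched row
     * index, or -1 if the match is ambiguous. */
    int lowe_match( const float* query, const float* model,
                    int n, int len, double k )
    {
        double best = DBL_MAX, second = DBL_MAX;
        int i, best_i = -1;

        for( i = 0; i < n; i++ )
        {
            double d = dist2( query, model + (size_t)i * len, len );
            if( d < best )        { second = best; best = d; best_i = i; }
            else if( d < second )   second = d;
        }
        /* the distances here are squared, so the ratio test uses k*k */
        return ( best < k * k * second ) ? best_i : -1;
    }

Calling it with k = 0.6 reproduces the criterion quoted above; raising k accepts more matches, but as said, more of them will be false positives.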
Upvotes: 0
Reputation: 2653
SURF descriptors are fine for comparing richly textured images... I think there isn't enough texture in traffic signs for them.
When extracting descriptors, "salient points" are located first, in your case for example at the corners of the rectangular marks on both signs (the rectangle and the letter P). Then local properties are collected for them: for instance, what a corner of a rectangle looks like from close up, blurred and in grayscale.
Then, these descriptors are matched to the corners-of-rectangles from the letter P. They aren't all that different (as we're not taking any shape information into account). Maybe the corners of the letter P are a little closer to those of the "no entry" sign. Randomly.
Of course, all of this is just speculation… the only way to find out is to debug it thoroughly. Try displaying the images with little circles where the descriptors were found (the circle size could depend on the scale the point was found at). Or put both images into one IplImage and draw lines between the matching descriptors. Something like this:
http://www.flickr.com/photos/22191989@N00/268039276
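With the old C API, such a debug view can be sketched like this (the pairs array, the 3-channel input assumption and the circle-radius formula are illustrative choices, not the sample's exact code):

    #include <opencv/cv.h>
    #include <opencv/highgui.h>

    /* Put both images side by side on one canvas, draw circles at the
     * keypoints of the first image (radius growing with detection scale)
     * and lines between matched pairs. Inputs are assumed 8-bit, 3-channel;
     * 'pairs' holds npairs (index_in_a, index_in_b) couples. */
    IplImage* draw_matches( const IplImage* a, const IplImage* b,
                            const CvSeq* kp_a, const CvSeq* kp_b,
                            const int* pairs, int npairs )
    {
        int h = a->height > b->height ? a->height : b->height;
        IplImage* canvas = cvCreateImage( cvSize( a->width + b->width, h ),
                                          IPL_DEPTH_8U, 3 );
        int i;

        cvZero( canvas );
        cvSetImageROI( canvas, cvRect( 0, 0, a->width, a->height ) );
        cvCopy( a, canvas, 0 );
        cvSetImageROI( canvas, cvRect( a->width, 0, b->width, b->height ) );
        cvCopy( b, canvas, 0 );
        cvResetImageROI( canvas );

        for( i = 0; i < kp_a->total; i++ )
        {
            const CvSURFPoint* p = (const CvSURFPoint*)cvGetSeqElem( kp_a, i );
            cvCircle( canvas, cvPointFrom32f( p->pt ),
                      cvRound( p->size * 0.25 ), CV_RGB(0,255,0), 1, 8, 0 );
        }
        for( i = 0; i < npairs; i++ )
        {
            const CvSURFPoint* p = (const CvSURFPoint*)cvGetSeqElem( kp_a, pairs[2*i] );
            const CvSURFPoint* q = (const CvSURFPoint*)cvGetSeqElem( kp_b, pairs[2*i+1] );
            cvLine( canvas,
                    cvPoint( cvRound( p->pt.x ), cvRound( p->pt.y ) ),
                    cvPoint( cvRound( q->pt.x ) + a->width, cvRound( q->pt.y ) ),
                    CV_RGB(255,0,0), 1, 8, 0 );
        }
        return canvas;   /* show with cvShowImage, free with cvReleaseImage */
    }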
As for how to fix this… what about using the same shape matching method for the inside of the sign that you use for detecting its outside contour? (For example, you could look for P-shaped objects once a sign is found.)
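One concrete option in the C API is Hu-moment shape matching via cvMatchShapes; here is a minimal sketch (the binarization threshold and the acceptance cutoff are assumptions to tune):

    #include <opencv/cv.h>
    #include <opencv/highgui.h>
    #include <float.h>

    /* Compare the inner contour of the detected sign ROI against a template
     * using Hu moments (rotation/scale invariant); lower score = more similar.
     * Note: both inputs must be 8-bit single-channel and are modified in
     * place, so pass clones in real code. */
    double inner_shape_score( IplImage* sign_roi, IplImage* template_img )
    {
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSeq *c1 = 0, *c2 = 0;
        double score = DBL_MAX;

        /* binarize: cvFindContours expects a binary 8-bit image */
        cvThreshold( sign_roi, sign_roi, 128, 255, CV_THRESH_BINARY );
        cvThreshold( template_img, template_img, 128, 255, CV_THRESH_BINARY );

        if( cvFindContours( sign_roi, storage, &c1, sizeof(CvContour),
                            CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) ) > 0 &&
            cvFindContours( template_img, storage, &c2, sizeof(CvContour),
                            CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) ) > 0 )
        {
            score = cvMatchShapes( c1, c2, CV_CONTOURS_MATCH_I1, 0 );
        }
        cvReleaseMemStorage( &storage );
        return score;   /* e.g. accept if score < 0.1; the cutoff needs tuning */
    }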
Upvotes: 6