Vether

Reputation: 65

Searching an image for objects from a database

Suppose I have a database with thousands of images of different shapes and sizes (smaller than 100 x 100 px), and it's guaranteed that each image shows only one object: a symbol, logo, road sign, etc. I would like to take any image ("my_image.jpg") from the Internet and answer the question "Does my_image contain any object from my database (the object can be resized, but not deformed)?" with, let's say, 95% reliability. To simplify, my_image will have a white background.

I tried using imagehash (https://github.com/JohannesBuchner/imagehash), which would be very helpful, but to get useful results I think I have to calculate (almost) every possible hash of my_image, because I don't know the object's size and location in my_image:

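import imagehash
from PIL import Image

hash_list = []
MyImage = Image.open('my_image.jpg')
image_width, image_height = MyImage.size

# Hash every possible non-empty crop of the image
for x_start in range(image_width):
    for y_start in range(image_height):
        for x_end in range(x_start + 1, image_width + 1):
            for y_end in range(y_start + 1, image_height + 1):
                # crop() takes a single (left, upper, right, lower) tuple
                box = (x_start, y_start, x_end, y_end)
                hash_list.append(imagehash.phash(MyImage.crop(box)))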

...and then try to find a similar hash in the database. But when, for example, image_width = image_height = 500, these loops and the search will take ages. Of course I can optimize it a little, but it still looks like seppuku for bigger images:

MIN_WIDTH = 30
MIN_HEIGHT = 30
STEP = 2

hash_list = []
MyImage = Image.open('my_image.jpg')
image_width, image_height = MyImage.size

for x_start in range(0, image_width - MIN_WIDTH, STEP):
    for y_start in range(0, image_height - MIN_HEIGHT, STEP):
        for x_end in range(x_start + MIN_WIDTH, image_width, STEP):
            for y_end in range(y_start + MIN_HEIGHT, image_height, STEP):
                hash_list.append(...)
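
To put rough numbers on the two loop nests above, here is a small counting sketch. The helper names `count_spans` and `count_crops` are made up for illustration; they only count how many crops the loops would visit and hash nothing.

```python
def count_spans(length, min_size, step):
    """Count the (start, end) pairs that the two nested ranges visit on one axis."""
    return sum(
        len(range(start + min_size, length, step))
        for start in range(0, length - min_size, step)
    )


def count_crops(width, height, min_w, min_h, step):
    """Total crops: the x and y axes are independent, so the counts multiply."""
    return count_spans(width, min_w, step) * count_spans(height, min_h, step)


# First loop nest: no minimum size, step 1, on a 500 x 500 image
brute_force = count_crops(500, 500, 0, 0, 1)
# Second loop nest: MIN_WIDTH = MIN_HEIGHT = 30, STEP = 2
reduced = count_crops(500, 500, 30, 30, 2)
print(brute_force, reduced)
```

For a 500 x 500 image this gives about 1.6e10 crops for the first version and about 7.7e8 for the second, so the minimum size and step cut the work by a factor of roughly 20, which is still far too many hashes to compute.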

I wonder if there is a good way to decide which parts of my_image are worth hashing; for example, crops that cut through edges look like a bad idea. Or maybe there is an easier solution? It would be great if the program could give an answer within 20 minutes. I would be grateful for any advice.

PS: sorry for my English :)

Upvotes: 1

Views: 173

Answers (2)

Vether

Reputation: 65

In the end I found a solution that works really well for me, and maybe it will be useful for someone else:

I'm using SIFT to detect the "best candidates" in my_image:

from operator import itemgetter

import imutils
import numpy as np


def multiscale_template_matching(template, image):
    results = []
    # Try scales from 1.4 down to 0.2 in steps of 0.01
    for scale in np.linspace(0.2, 1.4, 121)[::-1]:
        res = imutils.resize(image, width=int(image.shape[1] * scale))
        r = image.shape[1] / float(res.shape[1])  # maps coordinates back to the original image
        # Stop once the resized image is smaller than the template
        if res.shape[0] < template.shape[0] or res.shape[1] < template.shape[1]:
            break

        ## bigger correlation <==> better matching
        ## template_matching (helper not shown) uses SIFT to return best correlation and coordinates
        correlation, (x, y) = template_matching(template, res)
        coordinates = (x * r, y * r)
        results.append((correlation, coordinates, r))

    # Keep the 10 candidates with the highest correlation
    results.sort(key=itemgetter(0), reverse=True)
    return results[:10]
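
As a side note on the factor `r` in the function above: it converts a match position found in the resized image back to coordinates in the original image. A minimal sketch of just that step, with a made-up helper name `to_original_coords` and example numbers chosen for illustration:

```python
def to_original_coords(x, y, original_width, scale):
    """Map a point found in the resized image back to the original image."""
    resized_width = int(original_width * scale)
    r = original_width / float(resized_width)  # same ratio as `r` in the loop
    return x * r, y * r


# A 500 px wide image resized by 0.5 is 250 px wide, so r = 2.0
mapped = to_original_coords(100, 40, original_width=500, scale=0.5)
print(mapped)
```

The same factor would scale the template's width and height when cutting the candidate region out of the original image.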

Then I calculate hashes for the candidates:

import imagehash

ACCEPTABLE = 10  # largest hash difference still treated as a match


def find_best(image, template, candidates):
    template_hash = imagehash.phash(template)
    best_result = 50  ## initial value must be greater than ACCEPTABLE
    best_cand = None

    for cand in candidates:
        cand_hash = get_hash(...)
        # Subtracting two hashes gives the number of differing bits
        hash_diff = template_hash - cand_hash
        if hash_diff < best_result:
            best_result = hash_diff
            best_cand = cand

    if best_result <= ACCEPTABLE:
        return best_cand, best_result
    return None, None

If the result is at most ACCEPTABLE, I'm almost sure the answer is "GOT YOU!" :) This solution allows me to compare my_image with 1000 objects in 7 minutes.
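
For readers wondering what the hash difference measures: subtracting two imagehash hashes yields the number of bits in which they differ (the Hamming distance), and the default phash has 64 bits, so a threshold of 10 tolerates roughly 16% differing bits. The sketch below illustrates the idea on plain integers; the hash values are made up and `hamming_distance` is an illustrative helper, not part of imagehash.

```python
ACCEPTABLE = 10


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two integer-encoded hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


template_hash = 0xF0F0F0F0F0F0F0F0          # made-up 64-bit hash
candidate_hash = template_hash ^ 0b10110    # same hash with 3 bits flipped

distance = hamming_distance(template_hash, candidate_hash)
is_match = distance <= ACCEPTABLE
print(distance, is_match)
```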

Upvotes: 0

kunal18

Reputation: 1926

This looks like an image retrieval problem to me. However, in your case you are more interested in a binary YES / NO answer that tells whether the input image (my_image.jpg) shows an object that is present in your database.

The first thing I can suggest is to resize all the images (including the input) to a fixed size, say 100 x 100. But if the object in some image is very small, or sits in a specific region of the image (e.g., the top left), then resizing can make things worse. It was not clear from your question how likely this is in your case.

As for your second question, about finding the location of the object: I think you were considering this because your input images are large, such as 500 x 500? If so, then resizing is the better idea. However, if you asked because objects are localized to particular regions of the images, then you can compute a gradient image, which will help you identify background regions: since the background has no variation (it is completely white), the gradient values will be zero for pixels belonging to the background.
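
A minimal sketch of the gradient idea, assuming a grayscale image given as a plain list of rows so that it runs without dependencies (in practice one would probably use numpy or OpenCV). The function name `content_bounding_box` and the `threshold` parameter are made up for illustration; a non-zero threshold would be needed for JPEG input, where the white background is rarely perfectly uniform.

```python
def content_bounding_box(pixels, threshold=0):
    """Return (left, top, right, bottom) around all pixels whose forward-difference
    gradient exceeds `threshold`, or None if the image is uniform.

    Because forward differences are used, the box may include one background
    pixel on the left and top side of the object.
    """
    height = len(pixels)
    width = len(pixels[0])
    xs, ys = [], []
    for y in range(height):
        for x in range(width):
            # Differences to the right and lower neighbour (0 at the border)
            dx = pixels[y][x + 1] - pixels[y][x] if x + 1 < width else 0
            dy = pixels[y + 1][x] - pixels[y][x] if y + 1 < height else 0
            if abs(dx) > threshold or abs(dy) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


# 6 x 6 white image with a 2 x 2 black object at columns 2-3, rows 2-3
image = [[255] * 6 for _ in range(6)]
for row in (2, 3):
    for col in (2, 3):
        image[row][col] = 0

box = content_bounding_box(image)
print(box)
```

Only the region inside the returned box would then need to be hashed or matched; everything outside it is background.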

Rather than calculating and using image hashes, I suggest you read about bag-of-visual-words based approaches to object categorization (for example, here). Although your aim is not to categorize objects, it will help you come up with a different approach to your problem.
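
To give a feel for the bag-of-visual-words idea, here is a toy sketch under strong simplifying assumptions: the descriptors are made-up 2-D points instead of real local features, and the vocabulary is fixed by hand instead of being learned (typically by k-means over descriptors from the database). All names are illustrative.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[i])),
    )


def bovw_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


vocabulary = [(0, 0), (10, 10)]                   # two hand-picked "visual words"
hist_a = bovw_histogram([(1, 0), (0, 1), (9, 10)], vocabulary)
hist_b = bovw_histogram([(1, 1), (10, 9), (11, 10)], vocabulary)
similarity = histogram_intersection(hist_a, hist_b)
print(hist_a, hist_b, similarity)
```

Each database image would be reduced to one such histogram in advance, so answering a query means comparing one histogram against thousands rather than hashing many crops.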

Upvotes: 1
