Glyph matching on Python

Question

I have the following specifications for a project:

Given a dataset (let's call it test dataset) of about 2000 binary images, where each image corresponds to a glyph, I want to find the best match of each test image, on a different dataset (let's call it reference dataset) which has about 6000 unique glyphs.

Here are some examples of glyphs:

Thus, for each image in test dataset I want to find the best match in reference dataset.

The images on both sets have slightly different dimensions, although it's just a matter of padding. And all the image glyphs contained in test dataset are also in reference dataset.

My first thought was to use a CNN in TensorFlow although, given that I have a huge number of classes (about 6000) there are big memory issues. Moreover, given that the glyphs are extremely similar on both datasets, it's kind of an overkill to use a CNN.

So what would be the most straightforward method to tackle this problem in Python without using neural networks?

Kris · Accepted Answer

Just on the top of mind. 1. Generate Features for the test and reference images. Try to use lets say SIFT features. 2. You can cluster now cluster the images since you would have vector representation of the images. Try to use k means with cosine distance 3. Now given a test image cluster the test image and find the cluster it belongs to. Compare the test image with the images in the clusters.

Please let me know if something is unclear.

Glyph matching on Python

Answers (1)

Related Questions