Reputation: 7457
I'm in the feasibility stage of a project and wanted to know whether the following is doable using Machine Vision:
If I wanted to see if two files were identical, I would use a hashing function of some sort (e.g. sha1 or md5) on the files and store the results in a database.
However, if I have two images where, say, image 1 is 90% quality and image 2 is 100% quality, then this will not work, as they will have different hashes.
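To make the problem concrete, here is a minimal sketch of the file-hash approach using Python's standard `hashlib`. The two byte strings are hypothetical stand-ins for the same picture saved at two quality settings, and `file_signature` is a name made up for this illustration.

```python
import hashlib


def file_signature(data: bytes) -> str:
    """Return the SHA-1 hex digest of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical stand-ins for two encodings of the same picture.
original = b"\xff\xd8 example image payload, quality 100"
recompressed = b"\xff\xd8 example image payload, quality 90"

# The "database" here is just a set of digests already seen.
seen = {file_signature(original)}

print(file_signature(original) in seen)      # a byte-for-byte copy is found
print(file_signature(recompressed) in seen)  # a re-encoded copy is missed
```

Any change to the bytes, however small, produces an unrelated digest, which is why this only detects exact copies.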
Using machine vision, is it possible to "look" at an image and create a signature from it, so that when another image is encountered, we can say "have we already got this image in the system", and if so, disregard the new image, and if not, save the image?
I know that you are able to perform Machine Vision comparison between two known images, e.g.:
https://www.pyimagesearch.com/2014/09/15/python-compare-two-images/
(there's a lot of code in there so I cannot simply paste in here for reference, unfortunately)
but an image by image comparison would be extremely expensive.
Thanks
Upvotes: 0
Views: 211
Reputation: 2956
I do not know what you mean by 90% and 100%. Are they JPEG compression quality settings? Regardless, you can match images using many methods: image-processing-only approaches such as SIFT, SURF, BRISK, ORB and FREAK, or machine learning approaches such as Siamese networks. However, they are heavy for an ordinary computer to run (on my machine, with a Core i7-2670QM, a match on a 2-megapixel image takes from 100 to 2000 ms), especially if you run them without parallelism (no GPU, AVX, ...), and especially the last one.
For hashing you can also use perceptual hash functions. They are widely used in finding cases of online copyright infringement as well as in digital forensics because of the ability to have a correlation between hashes so similar data can be found (for instance with a differing watermark) [1]. You can also search for copy-move forgery, read the papers around it, and see how similar images can be found.
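As a rough illustration of what a perceptual hash does, here is a minimal pure-Python sketch of an average hash. It is not the algorithm of any particular library: the image is assumed to be a grid of 0-255 grayscale values at least 8 pixels on each side, and `average_hash`, `hamming` and the synthetic test images are names and data invented for this example.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Block-average the image down to size x size cells, then emit one
    bit per cell: 1 if the cell is brighter than the mean of all cells.
    Assumes the image is at least `size` pixels wide and tall."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 images: a horizontal gradient, the same gradient with
# small pixel noise (a stand-in for recompression), and a vertical gradient.
base = [[x * 8 for x in range(32)] for _ in range(32)]
noisy = [[max(0, min(255, x * 8 + (x + y) % 3 - 1)) for x in range(32)]
         for y in range(32)]
other = [[y * 8 for _ in range(32)] for y in range(32)]

print(hamming(average_hash(base), average_hash(noisy)))  # small distance
print(hamming(average_hash(base), average_hash(other)))  # large distance
```

Unlike sha1 or md5, small pixel changes flip few or no bits, so similar images can be found by looking for hashes with a small Hamming distance.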
Upvotes: 0
Reputation: 754
Python has a module called imagehash, which encodes an image as a perceptual hash, as shown below.
from PIL import Image
import imagehash

# Compute a perceptual (average) hash for each image.
first_hash = imagehash.average_hash(Image.open('./image_1.png'))
print(first_hash)
# d879f8f89b1bbf
other_hash = imagehash.average_hash(Image.open('./image_2.png'))
print(other_hash)
# ffff3720200ffff
# Equal hashes mean the two images look the same to average_hash.
print(first_hash == other_hash)
# False
The code above prints True if the two images have the same hash (that is, they look the same to average_hash) and False otherwise. Thanks.
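Strict equality can still miss a re-encoded copy whose hash differs by a bit or two, so a tolerance is often useful. With imagehash, subtracting two hashes gives the number of differing bits (the Hamming distance). The sketch below shows the "have we already got this image" check on plain integer hashes so that it runs without any image files. `SignatureStore`, its `max_distance` threshold of 5 and the sample hash values are all assumptions made for illustration, not part of imagehash.

```python
from typing import List, Optional


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


class SignatureStore:
    """Hypothetical in-memory stand-in for the database of signatures."""

    def __init__(self, max_distance: int = 5) -> None:
        self.max_distance = max_distance
        self.signatures: List[int] = []

    def find_match(self, signature: int) -> Optional[int]:
        """Return a stored hash within max_distance bits, or None."""
        for stored in self.signatures:
            if hamming(stored, signature) <= self.max_distance:
                return stored
        return None

    def add_if_new(self, signature: int) -> bool:
        """Store the hash unless a close match exists; report if stored."""
        if self.find_match(signature) is not None:
            return False
        self.signatures.append(signature)
        return True


store = SignatureStore(max_distance=5)
first = 0x0F0F0F0F0F0F0F0F
near_copy = first ^ 0b101        # differs from `first` in 2 bits
different = 0x00000000FFFFFFFF   # differs from `first` in 32 bits

results = [store.add_if_new(s) for s in (first, near_copy, different)]
print(results)
# [True, False, True]
```

Each check here is a linear scan over stored hashes, but comparing two 64-bit integers is far cheaper than a pixel-by-pixel image comparison. The right threshold depends on the hash function and the images, so it would need tuning on real data.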
Upvotes: 1