Reputation: 3615
The website is a kind of gallery, and I want to match uploads against existing images to prevent duplicate entries. It won't be a 100% bulletproof image match, but for my needs it's a perfectly good solution.
The only problem is that I don't know the correct way to get a sha1 from an Imagick $image object.
This is roughly what I have right now. It does produce a hash, but the hash doesn't match the ones I have stored on the server. On the server, the image goes through the same steps of optimizing down to the smallest thumbnail, except that there is a file_put_contents($root, $image); at the end of each image manipulation block. I don't think the problem is there, though. I think I'm missing something from the $image object inside the sha1() call, something like sha1($image->rendercurrentimage()).
<?php
$img_url = 'someimgfile.jpg';
# Step 1 = Original file hash - This is all ok
$source_hash = sha1_file($img_url);
$image = new Imagick($img_url);
# file_put_contents($source_root, $image);
$image->gaussianBlurImage(0, 0.05);
$image->setCompression(Imagick::COMPRESSION_JPEG);
$image->setCompressionQuality(90);
$image->setImageFormat('jpeg');
$image->scaleImage(215, 0);
# file_put_contents($thumbnail_root, $image);
# Step 2 = Get the thumbnail hash - results in a non matching hash vs. DB hash
$thumbnail_hash = sha1($image);
$image->setCompressionQuality(75);
$image->cropThumbnailImage(102, 102);
# file_put_contents($smallthumbnail_root, $image);
# Step 3 = Get the even smaller thumbnail hash - results in a non matching hash vs. DB hash
$smallthumbnail_hash = sha1($image);
# Now query the DB to check against all 3 hashes: $source_hash | $thumbnail_hash | $smallthumbnail_hash
# The DB has, let's say, 1000 images, each with a source hash, thumbnail hash and small thumbnail hash saved
# NOTE: The process of scaling images as they enter the DB is exactly the same, except there is a file_put_contents($root, $image); after each step. I left those calls in, commented out, to show where they go
As I said above, the hashes to match against are stored on the server in three forms: original, thumbnail and even smaller thumbnail. Those were created with the sha1_file() function. I would like to mimic the whole process, but without saving the file to $root, because a duplicate will be denied and the user redirected to the matched entry.
In case you are wondering why I want to match the thumbnails: my tests show that even when the original files differ in size and so on, the thumbnails created from them matched fairly well. Or am I wrong? If I have the same image in 3 different sizes and I scale them all down to, let's say, 100px width, will their hashes be the same?
Conclusion
I had to rewrite the original image handler a bit, but I think there was still a piece missing in my code, something like $image->stripImage();. Results started getting better after that. It seems the best way to keep hashes on the server is:
$hash = sha1(base64_encode($image->getImageBlob()));
My tests also confirmed that calling file_put_contents($thumbnail_root, $image); and then getting the hash via sha1_file($image_root); will not change the hash values.
I also got more matching results when bigger images were scaled down to thumbnail sizes.
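One detail worth checking when mixing the two approaches: sha1() over the raw bytes gives the same digest as sha1_file() over a file containing those bytes, while wrapping the bytes in base64_encode() first gives a different digest. The sketch below uses a placeholder byte string in place of a real $image->getImageBlob() result (an assumption, so that it runs without the imagick extension); the variable names are only illustrative.

```php
<?php
// Stand-in for $image->getImageBlob(); any byte string behaves the same way.
$blob = "\xFF\xD8\xFF\xE0 placeholder bytes, not a real JPEG";

// Write the bytes to a temporary file, as file_put_contents($thumbnail_root, $image) would.
$path = tempnam(sys_get_temp_dir(), 'thumb');
file_put_contents($path, $blob);

$hashFromFile   = sha1_file($path);             // how the stored hashes were originally made
$hashFromMemory = sha1($blob);                  // same bytes hashed in memory, no file needed
$hashFromBase64 = sha1(base64_encode($blob));   // hashes the base64 text, not the raw bytes

unlink($path);

var_dump($hashFromFile === $hashFromMemory);    // bool(true)
var_dump($hashFromFile === $hashFromBase64);    // bool(false)
```

So the base64 variant is fine as long as every stored hash was generated the same way, but it will not match hashes that were produced earlier with sha1_file().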
Upvotes: 2
Views: 1182
Reputation: 191
Since your problem is that you don't want to create a file on the file system for each step, I would suggest grabbing the blob content at each step and hashing that. For example:
<?php
//quick and dirty image creation to demonstrate my point
$image = new Imagick();
$image->newImage(100, 100, new ImagickPixel('red'));
$image->setImageFormat('png');
//base64 encode our blob and then generate a sha1 hash
$thumbnail = base64_encode( $image->getImageBlob() );
echo sha1($thumbnail);
If you are trying to match two images whose originals differ in size, you may run into resampling problems. For example, say I have a picture of a monkey that is 200px square and another, seemingly identical one that is 400px square. If I resample the larger one down to 200px, the two images will not always match.
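A rough way to try this out yourself is sketched below. It assumes the imagick extension is installed; thumbnailHash() is a hypothetical helper written for this illustration, and the built-in gradient pseudo-image stands in for the monkey picture. Whether the two hashes agree depends on the resampling, so treat the result as an experiment rather than a guarantee.

```php
<?php
// Hypothetical helper (not from the original posts): scale a copy of the image
// to a fixed width and hash the encoded bytes in memory, with no file written.
function thumbnailHash(Imagick $source, int $width): string
{
    $thumb = clone $source;           // work on a copy so the caller's image is untouched
    $thumb->stripImage();             // drop profiles and comments, which can differ between files
    $thumb->setImageFormat('png');
    $thumb->scaleImage($width, 0);    // 0 lets Imagick derive the height from the aspect ratio
    return sha1($thumb->getImageBlob());
}

// Only runs where the imagick extension is available.
if (extension_loaded('imagick')) {
    // Two renderings of the "same" picture at different sizes.
    $small = new Imagick();
    $small->newPseudoImage(200, 200, 'gradient:red-blue');
    $large = new Imagick();
    $large->newPseudoImage(400, 400, 'gradient:red-blue');

    $hashSmall = thumbnailHash($small, 100);
    $hashLarge = thumbnailHash($large, 100);

    echo $hashSmall, PHP_EOL, $hashLarge, PHP_EOL;
    var_dump($hashSmall === $hashLarge);   // may be true or false; do not rely on it
}
```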
Upvotes: 1
Reputation: 691
Just use this:
$sha1 = sha1_file($img_url);
But be careful to get the sha1 before processing the image! All your hashes should be generated from the images as the users uploaded them, so you can compare them with the hashes of future uploads without having to process those first.
Note! The hash will change even if you only rescale the image while keeping its proportions. Even opening the file in a text editor and adding a blank space changes the hash.
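A quick illustration of how sensitive the hash is, using a placeholder string instead of real file contents (the variable names are only for this example):

```php
<?php
$original = "example image bytes";   // placeholder for an uploaded file's contents
$edited   = $original . " ";         // same content plus one trailing space

// A single extra byte yields a completely different digest.
var_dump(sha1($original) === sha1($edited));   // bool(false)
```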
Your idea of scaling the images to the same width might work, but only if they were all scaled with the same function and the same parameters. It's not 100% reliable.
Upvotes: 0