Reputation: 41
I'm having trouble saving hashes to a txt file.
I compute a hash for each image I download from reddit, and I want to store that hash in a txt file if it's not already there.
If the hash of the image is already in the txt file, it should not be written again (no duplicate hashes) and the image should be deleted.
However, my code adds the same hash multiple times.
def checkImage():
    database_file = "hash_database.txt"
    for filename in os.listdir('NewImages//'):
        upload = file_path + '\\' + filename
    # Get hash of each image
    for filename in os.listdir('NewImages//'):
        new_images = imagehash.phash(Image.open('NewImages//' + filename))
        with open(database_file, "r") as f:
            read_database = f.readlines()
            for line in read_database:
                if line == str(new_images):
                    print("Delete this image, hash is already in database")
                    os.remove(upload)
                else:
                    print("Save this image. Adding hash to database")
                    with open(database_file, "a") as f:
                        f.write(str(new_images) + "\n")

checkImage()
Picture of my database:
As you can see, the same hashes appear multiple times.
Any help at all would be greatly appreciated!
Upvotes: 0
Views: 827
Reputation: 33345
The variable line has a newline \n at the end, so you're comparing "abcde" to "abcde\n", which aren't equal.
You can use .strip() to remove the newline:
for line in read_database:
    line = line.strip()
    if line == str(new_images):
        ...
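To see the mismatch in isolation, here is a small sketch that writes two made-up hash strings to a temporary file and reads them back with readlines(). The sample values "abcde" and "fghij" and the temporary file are only for illustration, they are not from the question's database.

```python
import os
import tempfile

# Write two sample hash strings, one per line, to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("abcde\nfghij\n")
    path = tmp.name

# readlines() keeps the trailing newline on each line.
with open(path, "r") as f:
    raw_lines = f.readlines()

os.remove(path)

# strip() removes the newline so the values compare equal to the bare hash.
stripped_lines = [line.strip() for line in raw_lines]

print(raw_lines)
print("abcde" in raw_lines)       # compared against "abcde\n", so no match
print("abcde" in stripped_lines)  # compared against "abcde", so it matches
```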
UPDATE:
Also, the logic for the else block is wrong. You're treating the hash as new if it doesn't match the current line in the file, but you should do that only if it doesn't match any line in the file.
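One way to apply both points is to read every stored hash into a set first and then do a single membership test per image. This is only a sketch: the helper names load_hashes and add_if_new are hypothetical, and it takes the hash as a plain string so it runs without imagehash or PIL.

```python
import os


def load_hashes(database_file):
    """Return the set of hashes stored one per line in database_file."""
    if not os.path.exists(database_file):
        return set()
    with open(database_file, "r") as f:
        # strip() drops the trailing newline; blank lines are skipped.
        return {line.strip() for line in f if line.strip()}


def add_if_new(hash_str, known_hashes, database_file):
    """Append hash_str to the file only if it matches no existing line.

    Returns True if the hash was new, False if it was a duplicate.
    """
    if hash_str in known_hashes:
        return False
    with open(database_file, "a") as f:
        f.write(hash_str + "\n")
    # Keep the in-memory set in sync so later images in the same run are checked too.
    known_hashes.add(hash_str)
    return True
```

In the question's loop, you would call load_hashes(database_file) once before iterating, pass str(imagehash.phash(Image.open(path))) as hash_str for each image, and call os.remove(path) on that same image when add_if_new returns False.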
Upvotes: 2