Jedi Knight

Reputation: 649

Python/Bash: comparing local and remote docker images sha256 hashes

The main goal: I have a very size-limited server and can't afford to docker pull a new image version while the old layers are still on the server, so when a new image is available, I remove the local one first. But I don't want to do that when the image hasn't been updated, because pulling takes several minutes. I am trying to write a script that compares the remote and local sha256 hashes and acts on the difference.

1st part of question:

What I have in Python:

import docker


def get_local_image_sha256(image_name):
    client = docker.from_env()
    image = client.images.get(image_name)
    return image.id


def get_remote_image_sha256(image_name):
    client = docker.from_env()

    try:
        manifest = client.images.get(image_name).attrs['RepoDigests'][0]
        # Extract SHA256 hash from the manifest
        sha256_hash = manifest.split('@')[1]
        return sha256_hash
    except docker.errors.ImageNotFound:
        print(f"Image '{image_name}' not found.")
        return None


if __name__ == "__main__":
    image_name = "cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest"

    local_sha256 = get_local_image_sha256(image_name)
    remote_sha256 = get_remote_image_sha256(image_name)

    if local_sha256 and remote_sha256:
        print(f"Local Image SHA256: {local_sha256}")
        print(f"Remote Image SHA256: {remote_sha256}")
    else:
        print("Failed to obtain SHA256 hashes.")

It outputs:

Local Image SHA256: sha256:3995accefa763b49e743afb5a72a43be7cb0eb1acd14475f40496c002c6063d7
Remote Image SHA256: sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e

The latest image in the cloud really is ...9e, but I don't know where ...d7 came from. When I run docker pull cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest right after the script, it outputs:

Pulling from crp110tk8f32a48oaeqo/ecr-server
Digest: sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e
Status: Image is up to date for cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest
cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest

Why does Python say the hashes are different while docker pull says the image is up to date?

2nd part:

Originally I tried it in Bash with this script:

LATEST_IMAGE_HASH_RAW=$(docker manifest inspect cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest -v | jq -r .Descriptor.digest)
IMAGE_HASH_RAW=$(docker inspect --format='{{index .RepoDigests 0}}' cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest)
IMAGE_HASH_RAW="${IMAGE_HASH_RAW#*@}"

if [[ "$IMAGE_HASH_RAW" == "$LATEST_IMAGE_HASH_RAW" ]]; then
  echo "$IMAGE_HASH_RAW is latest hash, skipping updating"
else
  echo "New hash is available, $LATEST_IMAGE_HASH_RAW, reinstalling (old one is $IMAGE_HASH_RAW)"
fi

And it managed to output:

New hash is available, sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e, old is sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e, reinstalling

What's wrong with string comparison in Bash? The script seemed to handle the case when a new image is available, but it doesn't do what I want when there isn't one...

3rd part:

In the Bash script I used docker inspect --format='{{index .RepoDigests 0}}' cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest, which outputs ...9e, but there is another suggestion, docker inspect --format='{{.Id}}' cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest, which outputs ...d7. Why are they different, and which one is the correct hash of the local latest image?

Upvotes: 0

Views: 221

Answers (1)

Jedi Knight

Thanks to the comment by @Obaskly, I seem to have figured out the situation.

  1. As he mentioned, my Python script was comparing two different types of hashes: the local image ID (image.id) and the image's repository digest (RepoDigests), so it shouldn't be used to check whether the image is up to date.

  2. Instead of comparing hashes in Bash, I wrote a Python script with the most detailed debug logging possible, and in Bash I branch on its exit code, since string comparison doesn't work there as intended:

string_comparer.py:

import sys
import datetime
from itertools import zip_longest


def postprocess_string(s):
    return s.lower().strip()


def log_equal_strings(string1, string2):
    current_datetime = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    log_message = f"{current_datetime}: {string1} is equal to {string2}\n"
    with open("comparison_log.txt", "a") as log_file:
        log_file.write(log_message)


def log_different_strings(string1, string2, position, char1, char2):
    current_datetime = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    log_message = f"{current_datetime}: {string1} is not equal to {string2}, at position {position} first string has char {char1} ({ord(char1)}), second has char {char2} ({ord(char2)})\n"
    with open("comparison_log.txt", "a") as log_file:
        log_file.write(log_message)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python string_comparer.py <string1> <string2>")
        sys.exit(1)

    string1 = postprocess_string(sys.argv[1])
    string2 = postprocess_string(sys.argv[2])

    if string1 == string2:
        print("Same!")
        log_equal_strings(string1, string2)
        sys.exit(0)
    else:
        print("Different!")
        for position, (char1, char2) in enumerate(zip_longest(string1, string2, fillvalue=' '), start=1):
            if char1 != char2:
                log_different_strings(string1, string2, position, char1, char2)
                break
        sys.exit(1)

And the Bash script turned into:

LATEST_IMAGE_HASH_RAW=$(docker manifest inspect cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest -v | jq -r .Descriptor.digest)
IMAGE_HASH_RAW=$(docker inspect --format='{{index .RepoDigests 0}}' cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest)
IMAGE_HASH_RAW="${IMAGE_HASH_RAW#*@}"

HASHES_EQUAL_RESULT=$(/usr/bin/python3 /home/pchela/string_comparer.py "$IMAGE_HASH_RAW" "$LATEST_IMAGE_HASH_RAW")

if [ $? -eq 0 ]; then
  echo "$IMAGE_HASH_RAW is latest hash, skipping updating"
else
  echo "Output of comparer is $HASHES_EQUAL_RESULT: new hash is available, $LATEST_IMAGE_HASH_RAW, reinstalling (old one is $IMAGE_HASH_RAW)"
fi

It now seems to work as intended.

  3. docker inspect --format='{{index .RepoDigests 0}}' cr.yandex/crp110tk8f32a48oaeqo/ecr-server:latest is what I wanted (the local image digest).
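
The distinction between the image ID and the repository digest can be sketched in Python as follows. This is only an illustration, not the script I actually run: normalize_digest and is_up_to_date are hypothetical helper names, the two hashes are the ones from the question, and the trailing "\r\n" on the remote value is a made-up example of the kind of invisible character that could make a plain string comparison fail (I have not confirmed that this was the actual cause in Bash). As far as I understand, the image ID is typically a hash of the local image configuration, while a RepoDigests entry records the digest of the manifest in the registry, so only the latter is comparable with what docker manifest inspect reports.

```python
from typing import Iterable, Optional


def normalize_digest(value: str) -> str:
    """Strip whitespace, drop an optional 'repo@' prefix, and lowercase."""
    return value.strip().rsplit("@", 1)[-1].lower()


def is_up_to_date(repo_digests: Iterable[str], remote_digest: Optional[str]) -> bool:
    """Return True if any local RepoDigests entry matches the registry digest."""
    if not remote_digest:
        return False
    wanted = normalize_digest(remote_digest)
    return any(normalize_digest(entry) == wanted for entry in repo_digests)


# Hashes copied from the question.
image_id = "sha256:3995accefa763b49e743afb5a72a43be7cb0eb1acd14475f40496c002c6063d7"
repo_digests = [
    "cr.yandex/crp110tk8f32a48oaeqo/ecr-server@"
    "sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e"
]
# Assumption for illustration: the remote value arrives with a trailing CR/LF.
remote = "sha256:d80f38450a7ca2785afa9f18d790d7ff878dc8897dfab3af14e97983ab1e329e\r\n"

print(is_up_to_date(repo_digests, remote))  # RepoDigests vs registry digest
print(is_up_to_date([image_id], remote))    # image ID vs registry digest
```

The first call compares like with like and reports a match even with the stray line ending; the second compares the image ID against the registry digest and can never match, which is the mistake my original Python script made.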

Upvotes: 0
