Reputation: 61
I created this program to calculate the sha256 or sha512 hash of a given file and output the digest as a hex string.
It consists of 5 files, 4 are custom modules and 1 is the main.
I have two functions in different modules but the only difference in these functions is one variable. See below:
From sha256.py
def get_hash_sha256():
    global sha256_hash
    filename = input("Enter the file name: ")
    sha256_hash = hashlib.sha256()
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
    # print("sha256 value: \n" + Color.GREEN + sha256_hash.hexdigest())
    print(Color.DARKCYAN + "sha256 value has been calculated")
    color_reset()
From sha512.py
def get_hash_sha512():
    global sha512_hash
    filename = input("Enter the file name: ")
    sha512_hash = hashlib.sha512()
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            sha512_hash.update(byte_block)
    # print("sha512 value: \n" + Color.GREEN + sha512_hash.hexdigest())
    print(Color.DARKCYAN + "sha512 value has been calculated")
    color_reset()
These functions are called in my simple_sha_find.py file:
def which_hash():
    sha256_or_sha512 = input("Which hash do you want to calculate: sha256 or sha512? \n")
    if sha256_or_sha512 == "sha256":
        get_hash_sha256()
        verify_checksum_sha256()
    elif sha256_or_sha512 == "sha512":
        get_hash_sha512()
        verify_checksum_sha512()
    else:
        print("Type either sha256 or sha512. If you type anything else the program will close...like this.")
        sys.exit()

if __name__ == "__main__":
    which_hash()
As you can see, the functions that are called depend on the user's input. If the user types sha256, the functions from sha256.py are called; if they type sha512, the functions from sha512.py are called.
The application works, and I know it can be made less redundant, but I do not know how.
How can I define the get_hash_sha---() and verify_checksum_sha---() functions once so that they perform the appropriate calculations based on whether the user chooses sha256 or sha512?
I have tried a few variations of this program.
I have written it as a single file and also as separate modules whose functions are called from the main file.
In either case I ended up with the repetition, which I know tends to defeat the purpose of automation.
Upvotes: 5
Views: 197
Reputation: 61
This is what I came up with after studying your responses.
Because I am learning, I wanted to integrate the aspects of each answer that were foreign to me.
As you will see, I condensed the files from 5 to 3, removed the global variables, used the Enum module and, most pertinently, removed the repetition of similar blocks of code.
I just found out how to post the whole block of code, so here is the final product. Let me know what you think and where I can improve.
colors.py
class Color():
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'  # bright blue foreground
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    RED = '\033[91m'
    BOLD = '\033[1m'  # bold text attribute
    UNDERLINE = '\033[4m'
    END = '\033[0m'

def color_reset():
    print(Color.END)
simple_sha_find.py
"""Module providing definition for calculating, digesting, and verifying hash with checksum"""
from slim_sha import which_hash
if __name__ == "__main__":
    which_hash()
slim_sha.py
import sys
import hashlib
from enum import Enum
from colors import Color, color_reset
class HashType(Enum):
    SHA256 = 'sha256'
    SHA512 = 'sha512'
def get_hash(hash_type):
    if hash_type == HashType.SHA256:
        hash_obj = hashlib.sha256()
    elif hash_type == HashType.SHA512:
        hash_obj = hashlib.sha512()
    else:
        raise ValueError("Invalid hash type. Please choose 'sha256' or 'sha512'")
    file_name = input("Enter the filename: ")
    try:
        with open(file_name, "rb") as f:
            # Read in 4 KiB blocks so large files are not loaded into memory at once
            for byte_block in iter(lambda: f.read(4096), b""):
                hash_obj.update(byte_block)
    except FileNotFoundError:
        # Stop here: verify_checksum() has nothing to compare without a digest
        print(f"File '{file_name}' not found.")
        sys.exit(1)
    print(Color.DARKCYAN + f"{hash_type.value} value has been calculated")
    color_reset()
    # Store the digest on the function so verify_checksum() can read it
    get_hash.hash_digested = hash_obj.hexdigest()
    return get_hash.hash_digested
def which_hash():
    sha_type_input = input("Which hash do you want to calculate? sha256 OR sha512? \n")
    try:
        sha_type = HashType(sha_type_input)
        get_hash(sha_type)
        verify_checksum()
    except ValueError:
        # Close the second underline so the terminal does not stay underlined
        print("Type " + Color.UNDERLINE + "sha256" + Color.END + " or " + Color.UNDERLINE + "sha512" + Color.END)
def verify_checksum():
    """Function for comparing calculated hash with hash provided by developer"""
    given_checksum = input("Enter Checksum Provided by Authorized Distributor or Developer... \n")
    print(Color.PURPLE + "You entered: " + given_checksum + Color.END)
    print("Calculated : " + Color.GREEN + get_hash.hash_digested)
    if given_checksum == get_hash.hash_digested:
        safe_results()
    else:
        bad_results()
def safe_results():
    safe_result = (Color.BOLD + Color.GREEN + "Checksum Verified! File is OK.")
    print(safe_result)
    color_reset()
    sys.exit()

def bad_results():
    bad_result = (Color.BOLD + Color.RED + "WARNING!!! Checksum is NOT verified. Verify checksum entry with the checksum source. Verify correct file or package. This is a potentially harmful file or package! Do not proceed! Notify developer or distributor if correct software is being checked and the calculated checksum continues to not match checksum from developer or distributor.")
    print(bad_result)
    color_reset()
    sys.exit()
Upvotes: 0
Reputation: 189297
You can refactor the functions to make the type of hash a parameter. Probably also avoid the use of global variables, and leave any interactive I/O to the calling code.
I have also changed the code to raise an error when there is a problem. Merely printing an error message is fine for very simple programs, but reusable code needs to properly distinguish between success and failure.
import hashlib

def get_hash(hash_type, filename):
    if hash_type == 'sha256':
        hash_obj = hashlib.sha256()
    elif hash_type == 'sha512':
        hash_obj = hashlib.sha512()
    else:
        raise ValueError("Invalid hash type. Please choose 'sha256' or 'sha512'")
    # Don't trap the error
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            hash_obj.update(byte_block)
    # Return the result
    return hash_obj.hexdigest()

def which_hash():
    sha_type = input("Which hash do you want to calculate: sha256 or sha512? \n").lower()
    if sha_type in ['sha256', 'sha512']:
        filename = input("File name: ")
        digest = get_hash(sha_type, filename)
        print(f"{Color.DARKCYAN}{sha_type} value has been calculated")
        color_reset()
        verify_checksum(digest, sha_type, filename)
    else:
        raise ValueError("Type sha256 or sha512")
This sort of "case and call" is precisely what object-oriented programming was designed to avoid, but perhaps it's too early in your programming journey to tackle that topic.
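The verify_checksum call in which_hash above is not defined in this answer. Since the comparison does not depend on the algorithm, one function is enough for both. A minimal sketch, assuming the check only needs the computed digest and the published checksum (the name checksum_matches and its parameters are hypothetical, not taken from the original program):

```python
def checksum_matches(digest: str, expected: str) -> bool:
    """Return True when the computed hex digest equals the published checksum.

    Hypothetical helper: surrounding whitespace and letter case are ignored,
    since hex digests are sometimes published in upper case or pasted with
    stray spaces.
    """
    return digest.strip().lower() == expected.strip().lower()
```

The caller would read the expected checksum with input() and decide what to print, which keeps the interactive I/O out of the reusable function.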
Upvotes: 2
Reputation: 10133
You can give hashlib.file_digest the algorithm name as a string.
import hashlib
import sys

options = 'sha256', 'sha512'

# Choose algorithm
opts = ' or '.join(options)
alg = input(f"Which hash do you want to calculate: {opts}? \n")
if alg not in options:
    print(f"Type either {opts}. If you type anything else the program will close...like this.")
    sys.exit()

# Choose file and hash it
filename = input("Enter the file name: ")
with open(filename, "rb") as f:
    digest = hashlib.file_digest(f, alg)
print(f"{alg} value has been calculated")
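Note that hashlib.file_digest was added in Python 3.11 and returns a hash object, so hexdigest() still has to be called on the result to get the hex string. A minimal sketch that falls back to a manual read loop on older versions (hash_file and its chunk_size parameter are hypothetical names, not part of hashlib):

```python
import hashlib


def hash_file(filename: str, alg: str, chunk_size: int = 4096) -> str:
    """Return the hex digest of a file for the named algorithm."""
    with open(filename, "rb") as f:
        if hasattr(hashlib, "file_digest"):
            # Python 3.11+: hashlib reads the file for us
            return hashlib.file_digest(f, alg).hexdigest()
        # Older Pythons: feed the file to the hash object block by block
        hash_obj = hashlib.new(alg)
        for block in iter(lambda: f.read(chunk_size), b""):
            hash_obj.update(block)
        return hash_obj.hexdigest()
```

Both branches take the algorithm name as a plain string, so the same call works for sha256 and sha512.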
Upvotes: 2
Reputation: 26825
You can generalise the function that generates the hash by passing the relevant hashing function as an argument.
Something like this:
from hashlib import sha256, sha512
from typing import Callable

HASH_MAP: dict[str, Callable] = {"sha256": sha256, "sha512": sha512}
CHUNK = 4096

def make_hash(filename: str, hash_function: Callable) -> str:
    hf = hash_function()
    with open(filename, "rb") as data:
        while buffer := data.read(CHUNK):
            hf.update(buffer)
    return hf.hexdigest()

def main():
    filename = input("Enter filename: ")
    func = input(f"Enter hash type {tuple(HASH_MAP)}: ")
    if hfunc := HASH_MAP.get(func):
        print(make_hash(filename, hfunc))
    else:
        print("Invalid hash type selection")

if __name__ == "__main__":
    main()
If you subsequently want to add more hashing algorithms, you just need to edit the HASH_MAP dictionary appropriately. No other code would need to change.
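To illustrate that last point, here is a sketch that adds two more algorithms from hashlib to the table. The choice of sha3_256 and blake2b is only an example, and make_hash is repeated unchanged from above so the snippet runs on its own:

```python
from hashlib import blake2b, sha256, sha3_256, sha512
from typing import Callable

# Same table as above with two extra entries; nothing else changes
HASH_MAP: dict[str, Callable] = {
    "sha256": sha256,
    "sha512": sha512,
    "sha3_256": sha3_256,
    "blake2b": blake2b,
}
CHUNK = 4096


def make_hash(filename: str, hash_function: Callable) -> str:
    hf = hash_function()
    with open(filename, "rb") as data:
        # Read until read() returns an empty bytes object at end of file
        while buffer := data.read(CHUNK):
            hf.update(buffer)
    return hf.hexdigest()
```

The prompt in main() is built from tuple(HASH_MAP), so the new names would show up there automatically.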
Upvotes: 2
Reputation: 1261
You could merge these two functions into a single one:
import hashlib
import sys

def get_hash(hash_type):
    if hash_type == 'sha256':
        hash_obj = hashlib.sha256()
    elif hash_type == 'sha512':
        hash_obj = hashlib.sha512()
    else:
        print("Invalid hash type. Please choose 'sha256' or 'sha512'")
        return
    filename = input("Enter the filename: ")
    try:
        with open(filename, "rb") as f:
            for byte_block in iter(lambda: f.read(4096), b""):
                hash_obj.update(byte_block)
        print(Color.DARKCYAN + f"{hash_type} value has been calculated")
        color_reset()
    except FileNotFoundError:
        print(f"File '{filename}' not found.")

def which_hash():
    sha_type = input("Which hash do you want to calculate: sha256 or sha512? \n").lower()
    if sha_type in ['sha256', 'sha512']:
        get_hash(sha_type)
        verify_checksum(sha_type)
    else:
        print("Type sha256 or sha512. If you type anything else the program will close.")
        sys.exit()

if __name__ == "__main__":
    which_hash()
It's also good practice to use an Enum instead of plain strings:
from enum import Enum
class HashType(Enum):
    SHA256 = 'sha256'
    SHA512 = 'sha512'
So you could change the comparisons in get_hash and the input handling in which_hash to:
if hash_type == HashType.SHA256:
    hash_obj = hashlib.sha256()
elif hash_type == HashType.SHA512:
    hash_obj = hashlib.sha512()

def which_hash():
    sha_type_input = input("Which hash do you want to calculate: sha256 or sha512? \n").lower()
    try:
        sha_type = HashType(sha_type_input)
        get_hash(sha_type)
        verify_checksum(sha_type)
    except ValueError:
        print("Type either sha256 or sha512. If you type anything else the program will close.")
        sys.exit()
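With the Enum in place, the if/elif chain can arguably go too, because hashlib.new accepts the algorithm name as a string and each member's value is already that name. A minimal sketch, assuming the file name is passed in as an argument rather than read with input() (the filename parameter is a change from the code above):

```python
import hashlib
from enum import Enum


class HashType(Enum):
    SHA256 = 'sha256'
    SHA512 = 'sha512'


def get_hash(hash_type: HashType, filename: str) -> str:
    """Hash a file with the algorithm named by the Enum member."""
    # hashlib.new looks the constructor up by name, so no branching is needed
    hash_obj = hashlib.new(hash_type.value)
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            hash_obj.update(byte_block)
    return hash_obj.hexdigest()
```

HashType(user_input) still raises ValueError for anything other than sha256 or sha512, so the try/except in which_hash keeps working as the input check.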
Upvotes: 5