Reputation: 955
I'm trying to compute a rough hash of a file in both Python 2.x and 3.x. I must use this hash function, not a built-in one.
Calling
    get_file_hash("my-file.txt")
works on 3.x. On 2.x it raises an error because the type of the incoming value is 'str'.
The error says
    value = content[0] << 7
    TypeError: unsupported operand type(s) for <<: 'str' and 'int'
Here's the code:
    def c_mul(a,b):
        return eval(hex((int(a) * b) & 0xFFFFFFFF)[:-1])

    def get_hash(content):
        value = 0
        if len(content) > 0:
            print (type(content))
            print (type(content[0]))
            value = content[0] << 7
            for char in content:
                value = c_mul(1000003, value) ^ char
            value = value ^ len(content)
            if value == -1:
                value = -2
        return value

    def get_file_hash(filename):
        with open(filename, "rb") as pyfile:
            return get_hash(pyfile.read())
How can I fix get_hash or get_file_hash so this works on 2.x and 3.x?
Upvotes: 2
Views: 194
Reputation: 369484
file.read() on a file opened in binary mode returns bytes in Python 3, and str (== bytes) in Python 2.
But iterating over a bytes object yields different results in the two versions:
    >>> list(b'123')  # In Python 3.x, yields `int`s
    [49, 50, 51]
    >>> list(b'123')  # In Python 2.x, yields `str`s
    ['1', '2', '3']
Use bytearray. Iterating over it yields ints in both versions.
    >>> list(bytearray(b'123'))  # Python 3.x
    [49, 50, 51]
    >>> list(bytearray(b'123'))  # Python 2.x
    [49, 50, 51]
    def c_mul(a,b):
        return (a * b) & 0xFFFFFFFF

    def get_hash(content):
        content = bytearray(content)  # <-----
        value = 0
        if len(content) > 0:
            value = content[0] << 7
            for char in content:
                value = c_mul(1000003, value) ^ char
            value = value ^ len(content)
            if value == -1:
                value = -2
        return value

    def get_file_hash(filename):
        with open(filename, "rb") as pyfile:
            return get_hash(pyfile.read())
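As a quick sanity check, the sketch below repeats the functions above so that it runs on its own, writes some sample bytes to a temporary file, and confirms that hashing the file, the raw bytes, and a bytearray all agree. The sample data and the temporary file are made up for illustration; they are not part of the original question.

```python
import os
import tempfile


def c_mul(a, b):
    # Same masking multiply as above: keep only the low 32 bits.
    return (a * b) & 0xFFFFFFFF


def get_hash(content):
    content = bytearray(content)  # yields ints on both 2.x and 3.x
    value = 0
    if len(content) > 0:
        value = content[0] << 7
        for char in content:
            value = c_mul(1000003, value) ^ char
        value = value ^ len(content)
        if value == -1:
            value = -2
    return value


def get_file_hash(filename):
    with open(filename, "rb") as pyfile:
        return get_hash(pyfile.read())


# Hypothetical sample data, written to a temporary file for the check.
sample = b"hello world\n"
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as handle:
        handle.write(sample)
    file_hash = get_file_hash(path)
finally:
    os.remove(path)

# The file, the bytes, and a bytearray of the same data hash identically.
assert file_hash == get_hash(sample) == get_hash(bytearray(sample))
# Empty input falls through to the initial value.
assert get_hash(b"") == 0
```

Running the same script under both interpreters and comparing `file_hash` is one way to confirm the two versions agree.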
BTW, I modified c_mul so that it no longer uses hex and eval. (I assumed you used them to remove the trailing L in Python 2.x.)
    >>> hex(289374982374)
    '0x436017d0e6L'
    #            ^
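One more reason to drop the slice: Python 3's hex() never appends an L, and Python 2 only appends it for long values, so a blind [:-1] can cut off a real hex digit instead of the suffix. The sketch below is only an illustration; `c_mul_masked` and `strip_long_suffix` are hypothetical helper names, not part of the original code. It shows that the plain mask gives the same number as the hex round trip once the suffix is stripped only when it is actually present.

```python
def c_mul_masked(a, b):
    # Plain 32-bit masking multiply, no string round trip.
    return (a * b) & 0xFFFFFFFF


def strip_long_suffix(text):
    # Hypothetical helper: remove a trailing 'L' only if it is there.
    return text[:-1] if text.endswith("L") else text


masked = c_mul_masked(1000003, 12345)

# Round trip through hex(), stripping the suffix safely.
via_hex = int(strip_long_suffix(hex(masked)), 16)
assert via_hex == masked

# Blind slicing: wherever hex() adds no 'L' (always on Python 3),
# this drops the last hex digit and changes the value.
blind = int(hex(masked)[:-1], 16)
print(masked, blind)
```

Since the mask alone already yields an ordinary integer, the string conversion adds nothing except this version-dependent pitfall.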
Upvotes: 4