Reputation: 18310
Background: I have an iterative hash algorithm I need to compute from a Python script and a Java web application.
Pseudocode:
hash = sha256(raw)
for x=1 to 64000 hash = sha256(hash)
where hash is a byte array of length 32, and not a hex string of length 64.
I want to keep it in bytes because, although Python can convert to a hex string between iterations and still keep the processing time under a second, Java takes 3 seconds because of the String overhead.
So, the Java code looks like this:
// requires: import java.util.Formatter;
// hash one time...
byte[] result = sha256(raw.getBytes("UTF-8"));
// then hash 64k-1 more times
for (int x = 0; x < 64000 - 1; x++) {
    result = sha256(result);
}
// hex encode and print result
StringBuilder sb = new StringBuilder();
Formatter formatter = new Formatter(sb);
for (int i = 0; i < result.length; i++) {
    formatter.format("%02x", result[i]);
}
System.out.println(sb.toString());
And the Python code looks like this:
import hashlib
# hash 1 time...
hasher = hashlib.sha256()
hasher.update(raw)
digest = hasher.digest()
# then hash 64k-1 times
for x in range(0, 64000-1):
    # expect digest is bytes and not hex string
    hasher.update(digest)
    digest = hasher.digest()
print digest.encode("hex")
The Python version appears to calculate the hash on the hex representation of the first digest (a string) rather than on the raw digest bytes, so the two implementations give different outputs.
Upvotes: 4
Views: 4487
Reputation: 1345
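The hasher's .update() method appends its argument to the data it has already been fed (see the Python docs), so each later digest() call hashes everything supplied so far rather than just the previous digest. Instead, you should create a new hasher each time you want to compute a digest.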
import hashlib
# hash 1 time...
digest = hashlib.sha256(raw).digest()
# then hash 64k-1 times
for x in range(0, 64000-1):
    digest = hashlib.sha256(digest).digest()
print digest.encode("hex")
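To see why reusing one hasher gives a different result, here is a small sketch that checks the accumulating behaviour of update() directly. It is written for Python 3 (an assumption; the snippets above are Python 2, so it uses .hex() instead of .encode("hex")), and iterated_sha256 is just a hypothetical helper name for the loop above, not anything from hashlib.

```python
import hashlib


def iterated_sha256(raw: bytes, rounds: int = 64000) -> bytes:
    """Apply SHA-256 `rounds` times, feeding each raw 32-byte digest back in.

    Hypothetical helper: a fresh hasher is created for every round.
    """
    digest = hashlib.sha256(raw).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


# update() accumulates input: two updates hash the concatenation.
h = hashlib.sha256()
h.update(b"ab")
h.update(b"cd")
same_as_concat = h.digest() == hashlib.sha256(b"abcd").digest()

# Reusing one hasher therefore hashes (input + first digest),
# which is not the same as hashing the first digest on its own.
first = hashlib.sha256(b"example").digest()
reused = hashlib.sha256(b"example")
reused.update(first)
diverges = reused.digest() != hashlib.sha256(first).digest()

print(same_as_concat, diverges)
print(iterated_sha256(b"example", rounds=3).hex())
```

With a fresh hasher per round, the Python output should match a Java loop that calls sha256(result) on the raw bytes each time, provided both sides start from the same UTF-8 encoded input.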
Upvotes: 6