Reputation: 73
As part of a work project I am porting a Perl library to Python. I'm comfortable with Python, much (much) less so with Perl.
The perl code uses Digest::MD5. This module has three functions:
md5($data)
takes in data and spits out md5 digest in binary
md5_hex($data)
takes in data and spits out md5 digest in hex
md5_base64($data)
takes in data and spits out md5 digest in base64 encoding
I can replicate md5_hex with something like this:
import hashlib
string = 'abcdefg'
print(hashlib.md5(string.encode()).hexdigest())
Which works fine (same inputs give same outputs at least). I can't seem to get anything to match for the other two functions.
It doesn't help that string encodings are really not something I've done much with. I've been interpreting the perl functions as saying they take an md5 digest and then re-encode in binary or base64, something like this:
import hashlib
import base64
string = 'abcdefg'
md5_string = hashlib.md5(string.encode()).hexdigest()
print(base64.b64encode(md5_string))
but maybe that's wrong? I'm sure there's something fundamental I'm just missing.
The Perl doc is here: https://metacpan.org/pod/Digest::MD5
Upvotes: 0
Views: 902
Reputation: 118128
First, note Digest::MD5 documentation:
Note that the base64 encoded string returned is not padded to be a multiple of 4 bytes long. If you want interoperability with other base64 encoded md5 digests you might want to append the redundant string "==" to the result.
Second, note that you want to Base64 encode the hash, not the hex representation of it:
print(base64.b64encode(hashlib.md5(string.encode()).digest()).decode())
esZsDxSN6VGbi9JkMSxNZA==
perl -MDigest::MD5=md5_base64 -E 'say md5_base64($ARGV[0])' abcdefg
esZsDxSN6VGbi9JkMSxNZA
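To illustrate the padding note from the documentation, here is a minimal sketch of going the other way: taking the unpadded string Perl printed and decoding it in Python. The helper name decode_perl_md5_base64 is made up for this example; it assumes the input is always an MD5 digest (16 bytes), which is why appending exactly "==" is enough.

```python
import base64
import hashlib


def decode_perl_md5_base64(unpadded: str) -> bytes:
    """Decode an unpadded md5_base64 string back into the raw digest bytes.

    Hypothetical helper: assumes the input encodes a 16-byte MD5 digest,
    which is always 22 base64 characters followed by "==" when padded.
    """
    return base64.b64decode(unpadded + "==")


perl_output = "esZsDxSN6VGbi9JkMSxNZA"  # output of the perl one-liner above
raw = decode_perl_md5_base64(perl_output)

# The decoded bytes should equal the binary digest of the same input.
print(raw == hashlib.md5("abcdefg".encode()).digest())
```

Without the appended "==", base64.b64decode would reject the 22-character string as incorrectly padded.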
Upvotes: 0
Reputation: 133919
The first one would simply be calling the .digest method on the md5 object:
>>> from hashlib import md5
>>> s = 'abcdefg'
>>> md5(s.encode()).digest()
b'z\xc6l\x0f\x14\x8d\xe9Q\x9b\x8b\xd2d1,Md'
And md5_base64 is the digest but base64-encoded:
>>> import base64
>>> base64.b64encode(md5(s.encode()).digest())
b'esZsDxSN6VGbi9JkMSxNZA=='
However, Perl doesn't return the hash padded, thus to be compatible, you'd strip the = padding characters:
>>> base64.b64encode(md5(s.encode()).digest()).strip(b'=')
b'esZsDxSN6VGbi9JkMSxNZA'
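Putting the pieces together, a port of the three Perl functions could look roughly like the sketch below. The function names simply mirror the Perl ones. It assumes callers pass bytes (encode a str first, e.g. with .encode(), which defaults to UTF-8) and that returning str for the hex and base64 variants is what the rest of the ported code wants.

```python
import base64
import hashlib


def md5(data: bytes) -> bytes:
    """Binary digest: 16 raw bytes, like Perl's md5($data)."""
    return hashlib.md5(data).digest()


def md5_hex(data: bytes) -> str:
    """Hex digest: 32 lowercase hex characters, like md5_hex($data)."""
    return hashlib.md5(data).hexdigest()


def md5_base64(data: bytes) -> str:
    """Base64 of the binary digest with the trailing '=' padding removed,
    matching the unpadded output of md5_base64($data)."""
    encoded = base64.b64encode(hashlib.md5(data).digest())
    return encoded.decode("ascii").rstrip("=")


s = "abcdefg"
print(md5(s.encode()))
print(md5_hex(s.encode()))
print(md5_base64(s.encode()))
```

Note that md5_base64 encodes the binary digest, not the hex string; base64-encoding the hexdigest gives a different, longer result, which is why the attempt in the question did not match.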
Upvotes: 3