user1781837

Reputation: 73

Python equivalent of Perl Digest::MD5 functions

As part of a work project I am porting a Perl library to Python. I'm comfortable with Python, much (much) less so with Perl.

The Perl code uses Digest::MD5. This module exports three functions: md5, md5_hex, and md5_base64.

I can replicate md5_hex with something like this:

import hashlib
string = 'abcdefg'
print(hashlib.md5(string.encode()).hexdigest())

Which works fine (same inputs give same outputs at least). I can't seem to get anything to match for the other two functions.

It doesn't help that string encodings are really not something I've done much with. I've been interpreting the Perl functions as taking an MD5 digest and then re-encoding it in binary or base64, something like this:

import hashlib
import base64
string = 'abcdefg'
md5_string = hashlib.md5(string.encode()).hexdigest()
print(base64.b64encode(md5_string))

but maybe that's wrong? I'm sure there's something fundamental I'm just missing.

The Perl doc is here: https://metacpan.org/pod/Digest::MD5

Upvotes: 0

Views: 902

Answers (2)

Sinan Ünür

Reputation: 118128

First, note what the Digest::MD5 documentation says about md5_base64:

Note that the base64 encoded string returned is not padded to be a multiple of 4 bytes long. If you want interoperability with other base64 encoded md5 digests you might want to append the redundant string "==" to the result.
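
As a rough sketch of what that note means on the Python side (the variable names are illustrative, and the unpadded value is the md5_base64 result for 'abcdefg'), appending "==" is enough for Python's decoder to recover the raw digest:

```python
import base64
import hashlib

# Unpadded value as produced by Perl's md5_base64 for the input 'abcdefg'
perl_value = "esZsDxSN6VGbi9JkMSxNZA"

# An MD5 digest is 16 bytes, so its base64 form always needs exactly two '='
padded = perl_value + "=="

raw = base64.b64decode(padded)
print(raw == hashlib.md5(b"abcdefg").digest())
```

Going the other way, from Python to Perl, means removing the padding instead.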

Second, note that you want to Base64 encode the hash, not the hex representation of it:

print(base64.b64encode(hashlib.md5(string.encode()).digest()).decode())

esZsDxSN6VGbi9JkMSxNZA==

perl -MDigest::MD5=md5_base64 -E 'say md5_base64($ARGV[0])' abcdefg

esZsDxSN6VGbi9JkMSxNZA
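
To make the difference concrete, here is a small sketch (variable names are illustrative) comparing the two approaches. Base64-encoding the hex string encodes 32 ASCII characters, while base64-encoding the digest encodes the 16 raw bytes, so the results cannot match:

```python
import base64
import hashlib

data = "abcdefg".encode()

hex_text = hashlib.md5(data).hexdigest()  # 32 hex characters (str)
raw = hashlib.md5(data).digest()          # 16 raw bytes

# Encoding the hex text, which is roughly what the question's code attempts
from_hex = base64.b64encode(hex_text.encode())
# Encoding the raw digest, which is what md5_base64 corresponds to
from_raw = base64.b64encode(raw)

print(len(from_hex), len(from_raw))
print(from_hex == from_raw)
```

The first value is 44 characters long and the second is 24, including the two padding characters.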

Upvotes: 0

The first one would simply be calling .digest method on the md5:

>>> from hashlib import md5
>>> s = 'abcdefg'
>>> md5(s.encode()).digest()
b'z\xc6l\x0f\x14\x8d\xe9Q\x9b\x8b\xd2d1,Md'
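
These 16 bytes are the same value that hexdigest() prints, just without the hex encoding. A quick check, as a sketch:

```python
from hashlib import md5

s = 'abcdefg'
raw = md5(s.encode()).digest()         # bytes, like Perl's md5()
hex_text = md5(s.encode()).hexdigest() # str, like Perl's md5_hex()

# bytes.hex() and bytes.fromhex() convert between the two representations
print(raw.hex() == hex_text)
print(bytes.fromhex(hex_text) == raw)
```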

And md5_base64 is the digest but base64-encoded:

>>> import base64
>>> base64.b64encode(md5(s.encode()).digest())
b'esZsDxSN6VGbi9JkMSxNZA=='

However, Perl's md5_base64 returns the result without padding, so to match it you'd strip the trailing = characters:

>>> base64.b64encode(md5(s.encode()).digest()).rstrip(b'=')
b'esZsDxSN6VGbi9JkMSxNZA'
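
Putting the three together, one possible set of wrappers might look like the sketch below. The function names simply mirror the Perl ones and are not part of hashlib. The UTF-8 default is an assumption: the Perl functions operate on bytes, so the right encoding depends on what the Perl code actually passes in.

```python
import base64
import hashlib


def _to_bytes(data, encoding="utf-8"):
    # Assumption: text is encoded as UTF-8; bytes pass through unchanged.
    return data.encode(encoding) if isinstance(data, str) else data


def md5(data):
    """Raw 16-byte digest, like Perl's md5()."""
    return hashlib.md5(_to_bytes(data)).digest()


def md5_hex(data):
    """32-character lowercase hex string, like Perl's md5_hex()."""
    return hashlib.md5(_to_bytes(data)).hexdigest()


def md5_base64(data):
    """Base64 of the raw digest with the '=' padding removed, like Perl's md5_base64()."""
    return base64.b64encode(md5(data)).rstrip(b"=").decode("ascii")


print(md5_base64("abcdefg"))  # esZsDxSN6VGbi9JkMSxNZA
```

For plain ASCII input the encoding choice makes no difference, but it is worth comparing against the Perl output for non-ASCII data before relying on these.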

Upvotes: 3
