Reputation: 87
I am looking for a way to verify the integrity of a virtualenv at runtime.
To put it more explicitly: we are deploying a Python project to a production server. During the deploy a virtualenv is created and packages are installed into it with pip and setuptools (our own package is not distributed). So far everything is in order. This is a medical-grade application, however, so at every run we need to verify that the virtualenv has not been altered. Checking versions against pip list (or Pipfile.lock if we switch to pipenv) is not enough, as I understand things. We also need to verify that nothing within the virtualenv itself has been altered (e.g. changes to the code under virtualenv/lib/python/site-packages). Is there a Pythonic way to do this?
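For context, the closest built-in mechanism I am aware of is pip's own RECORD files (PEP 376): each installed package's *.dist-info/RECORD lists every installed file with a sha256 hash. A rough sketch of checking them (verify_record is just an illustrative name, not an existing API):

```python
import base64
import csv
import hashlib
import os

def verify_record(dist_info_dir, site_packages):
    """Check every file listed in a package's RECORD (PEP 376)
    against its recorded hash. Returns a list of mismatched paths."""
    bad = []
    with open(os.path.join(dist_info_dir, "RECORD"), newline="") as f:
        for path, hash_spec, _size in csv.reader(f):
            if not hash_spec:  # the RECORD file itself has no hash entry
                continue
            algo, _, expected = hash_spec.partition("=")
            with open(os.path.join(site_packages, path), "rb") as fh:
                digest = hashlib.new(algo, fh.read()).digest()
            # RECORD stores urlsafe base64 without '=' padding
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
            if actual != expected:
                bad.append(path)
    return bad
```

This only covers files that pip installed, though, not the virtualenv as a whole.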
Upvotes: 4
Views: 396
Reputation: 681
I think this might do it:
import hashlib
import os
import sys

basedir = os.path.abspath(os.path.dirname(__file__))
directory = os.path.join(basedir, "venv")
hasher = hashlib.md5()
bs = 4096

def flatten(d):
    # Walk the tree in sorted order so the resulting digest is
    # reproducible (os.walk yields entries in arbitrary filesystem order).
    for path, dirs, filenames in os.walk(d):
        dirs.sort()
        for filename in sorted(filenames):
            yield os.path.join(path, filename)

if os.path.exists(directory):
    for item in flatten(directory):
        with open(item, "rb") as _f:
            # Hash in small chunks so large files don't exhaust memory.
            buf = _f.read(bs)
            while len(buf) > 0:
                hasher.update(buf)
                buf = _f.read(bs)
else:
    sys.exit(1)

print(hasher.hexdigest())
The flatten function is fairly straightforward: it walks the venv and yields every filepath from the top to the bottom of the file tree, beginning at whatever path you provide as the d parameter. I took that from here.
I then open each file in "rb" mode and read small chunks into a buffer (so as not to clog the system's memory in case of unexpectedly large files), updating the md5 hash object with each buffer's content. This is done for every file in the venv.
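The script above only prints a digest; to make it a runtime check, you would compare that digest against a value recorded at deploy time. A minimal sketch of that comparison (digest_tree, verify_venv and the venv.md5 baseline file are names I made up for illustration):

```python
import hashlib
import os

def digest_tree(directory, bs=4096):
    """Return one hex digest over every file under `directory`,
    visiting files in sorted order so the result is reproducible."""
    hasher = hashlib.md5()
    for path, dirs, filenames in os.walk(directory):
        dirs.sort()  # make traversal order deterministic
        for filename in sorted(filenames):
            with open(os.path.join(path, filename), "rb") as f:
                # Hash in chunks so large files don't exhaust memory.
                for chunk in iter(lambda: f.read(bs), b""):
                    hasher.update(chunk)
    return hasher.hexdigest()

def verify_venv(directory, baseline_file):
    """Compare the tree's digest against the one recorded at deploy time."""
    with open(baseline_file) as f:
        expected = f.read().strip()
    return digest_tree(directory) == expected
```

At deploy time you would write digest_tree("venv") to venv.md5 once; at startup, refuse to run unless verify_venv("venv", "venv.md5") returns True. For a medical-grade integrity check, hashlib.sha256 may be preferable to md5, which is no longer collision-resistant.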
Not sure if this is a viable solution for you, but it was great fun doing this, so thanks for your question :)
Upvotes: 4