arash javanmard

Reputation: 1387

Python ZipFile module extracts password protected zips slowly

I am trying to write a Python script that extracts a zip file:

Board: BeagleBone Black, ~1 GHz ARM Cortex-A8, Debian Wheezy
Zipfile: /home/milo/my.zip, ~8 MB

>>> from zipfile import ZipFile
>>> zip = ZipFile("/home/milo/my.zip")
>>> zip.extractall(pwd="tst")

Other approaches, such as opening the archive and reading and writing each member myself, or extracting just one particular file, have the same effect: extraction takes about 3-4 minutes.

Extracting the same file with the unzip tool takes less than 2 seconds.

Does anyone know what is wrong with my code, or with the Python zipfile library?

Thanks Ajava

Upvotes: 6

Views: 3730

Answers (2)

Carl Cheung

Reputation: 548

Copied from my answer at https://stackoverflow.com/a/72513075/10860732

It's quite stupid that Python doesn't implement zip decryption in C.

So I implemented it in Cython, which is 17 times faster.

Just get dezip.pyx and setup.py from this gist:

https://gist.github.com/zylo117/cb2794c84b459eba301df7b82ddbc1ec

Then install Cython and build the extension:

pip3 install cython
python3 setup.py build_ext --inplace

Then run the original script with two extra lines:

import zipfile

# add these two lines
from dezip import _ZipDecrypter_C
setattr(zipfile, '_ZipDecrypter', _ZipDecrypter_C)

z = zipfile.ZipFile('./test.zip', 'r')
z.extractall('/tmp/123', None, b'password')

Upvotes: 1

Tanuj Mathur

Reputation: 1458

This is a documented limitation of the ZipFile module in Python 2.7. The documentation for ZipFile states:

Decryption is extremely slow as it is implemented in native Python rather than C.

If you need faster performance, you can either invoke an external program (like unzip or 7zip) from your code, or make sure the zip files you are working with are not password protected.
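A rough sketch of the first option, calling unzip through subprocess. This assumes Python 3 and that Info-ZIP's unzip is installed and on the PATH; the function names build_unzip_command and extract_with_unzip are made up for illustration and are not part of any library.

```python
import shutil
import subprocess


def build_unzip_command(archive, dest, password):
    """Return the argument list for Info-ZIP's unzip (hypothetical helper)."""
    # -o: overwrite existing files without prompting
    # -q: quiet output
    # -P: password for encrypted entries
    # -d: directory to extract into
    return ["unzip", "-o", "-q", "-P", password, archive, "-d", dest]


def extract_with_unzip(archive, dest, password):
    """Extract a password-protected zip by delegating to the unzip tool."""
    if shutil.which("unzip") is None:
        raise RuntimeError("unzip was not found on PATH")
    # check=True raises CalledProcessError if unzip exits with a non-zero status
    subprocess.run(build_unzip_command(archive, dest, password), check=True)


if __name__ == "__main__":
    # Paths and password taken from the question; adjust as needed.
    extract_with_unzip("/home/milo/my.zip", "/home/milo/extracted", "tst")
```

Passing the arguments as a list avoids shell quoting problems. Note that a password given with -P can be visible to other users in the process list, so this may not be appropriate on a shared machine.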

Upvotes: 7
