Reputation: 14052
Is there a Python module that lets me extract a variety of archives? I need to extract zip, tar.bz2, tar.z, rar and others. Right now it looks like I have to detect the archive type manually and also write a separate extraction routine for every single one.
Pure Python would be preferred.
Upvotes: 5
Views: 3613
Reputation: 56
This thread is old, but I ran into this problem again. I tried Patool and pyunpack (which relies on Patool), but I would strongly recommend against a Patool-based option in favour of the builtin shutil lib, as Patool seems to have been unsupported for some time now. I ran into a bug myself because of a changed lib name.
As crennie answered, I went for the high-level builtin shutil lib. Below is my code to add the 7z and rar formats to shutil's capabilities, which covers the full initial question.
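Note that this requires the p7zip-full and p7zip-rar packages. These are system packages, so they are installed with the OS package manager (for example apt on Debian/Ubuntu) rather than with pip: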
import logging
import shutil
from subprocess import PIPE, STDOUT, Popen

def extractfiles(zipname, output_dir):
    """Extract files with 7z utils.

    -aoa switch asks for automatic overwrite without prompting user.
    -bd switch disables the progress indicator.
    """
    logging.info(f"Extracting {zipname} to {output_dir}")
    # Merge stderr into stdout so all 7z output comes back in one stream.
    pipe = Popen(["7z", "x", "-aoa", "-bd", zipname, f"-o{output_dir}"], stderr=STDOUT, stdout=PIPE)
    return pipe.communicate()

def register_extensions():
    """Register additional archive formats supported by 7zip in shutil."""
    # shutil.unpack_archive() then picks the handler from the file extension.
    shutil.register_unpack_format('rar', ['.rar', '.RAR'], extractfiles)
    shutil.register_unpack_format('7z', ['.7z', '.7Z'], extractfiles)
Upvotes: 1
Reputation: 674
Since 3.2 it looks like shutil is adding more archiving functionality, but so far only gztar, bztar, tar and zip are supported.
You can add your own extraction handlers with shutil.register_unpack_format() (shutil.register_archive_format() is the counterpart for creating archives). This way you wouldn't have to detect the extension manually, but you'd still need to define the extraction yourself.
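As a rough sketch of that approach, the snippet below registers a handler for bare .gz files, which shutil does not unpack on its own, so that shutil.unpack_archive() can dispatch to it by extension. The format name "gz-single" and the helpers unpack_gz and register_gz are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def unpack_gz(filename, extract_dir):
    """Hypothetical handler: decompress a single .gz file into extract_dir."""
    source = Path(filename)
    target_dir = Path(extract_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Drop the final suffix: "data.txt.gz" becomes "data.txt".
    target = target_dir / source.stem
    with gzip.open(source, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)


def register_gz():
    """Register the handler once; shutil raises if the name is already taken."""
    known = {name for name, _exts, _desc in shutil.get_unpack_formats()}
    if "gz-single" not in known:
        shutil.register_unpack_format(
            "gz-single", [".gz"], unpack_gz,
            description="single gzip-compressed file (illustrative)",
        )
```

After calling register_gz(), shutil.unpack_archive("data.txt.gz", "out") should go through unpack_gz, while the built-in formats such as zip and the tar variants keep working as before.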
Upvotes: 1
Reputation: 114
In the standard library, you already have the modules zlib, gzip, bz2, zipfile and tarfile to work with compressed archives.
For rar archives, there is the rarfile module on PyPI, which has an interface similar to zipfile and works with Python 2 and 3.
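For the zip and tar families, the manual type detection from the question can be left to the standard library. The following is only a minimal sketch, assuming the archives are trusted; extract_any is a hypothetical helper name, and rar is not covered here.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_any(archive_path, output_dir):
    """Extract a zip or tar archive, detecting the type from the file content.

    Returns the detected type ("zip" or "tar"); raises ValueError otherwise.
    """
    archive_path = Path(archive_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(output_dir)
        return "zip"

    if tarfile.is_tarfile(archive_path):
        # Mode "r:*" lets tarfile choose gzip, bz2 or xz decompression itself.
        with tarfile.open(archive_path, "r:*") as tf:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions offer extraction filters; "data" is the safer one.
                tf.extractall(output_dir, filter="data")
            else:
                tf.extractall(output_dir)
        return "tar"

    raise ValueError(f"Unsupported or unrecognised archive: {archive_path}")
```

Because the check looks at the file content rather than the name, a tar.bz2 file with an unusual extension is still handled.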
Upvotes: 1
Reputation: 2553
Check out Patool. I can't attest to how well it works, but there are a few other modules based off of it, though it does depend on external applications for some formats.
patool supports 7z (.7z), ACE (.ace), ADF (.adf), ALZIP (.alz), APE (.ape), AR (.a), ARC (.arc), ARJ (.arj), BZIP2 (.bz2), CAB (.cab), COMPRESS (.Z), CPIO (.cpio), DEB (.deb), DMS (.dms), FLAC (.flac), GZIP (.gz), LRZIP (.lrz), LZH (.lha, .lzh), LZIP (.lz), LZMA (.lzma), LZOP (.lzo), RPM (.rpm), RAR (.rar), RZIP (.rz), SHN (.shn), TAR (.tar), XZ (.xz), ZIP (.zip, .jar) and ZOO (.zoo) formats. It relies on helper applications to handle those archive formats (for example bzip2 for BZIP2 archives).
Upvotes: 4