abergmeier

Reputation: 14052

Extracting archives of multiple types

Is there a Python module that lets me extract a variety of archive types? I need to extract zip, tar.bz2, tar.z, rar and others. Right now it looks like I have to detect the archive type manually and also write a separate extraction routine for every single one.

Pure Python would be preferred.

Upvotes: 5

Views: 3613

Answers (4)

Fuzl

Reputation: 56

This thread is old, but I ran into this problem again. I tried Patool and pyunpack (which relies on Patool), but I would strongly recommend against a Patool-based option in favor of the built-in shutil library, as Patool seems to have been unmaintained for some time now. I ran into a bug myself because of a changed library name.

As crennie answered, I went for the high-level built-in shutil library. Below is my code to add the 7z and rar formats to shutil's capabilities, which covers the original question. Note that this requires p7zip-full and p7zip-rar, which provide the 7z command-line tool and are system packages rather than pip packages (on Debian or Ubuntu, for example, they are installed through apt):

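import logging
import shutil
from subprocess import PIPE, STDOUT, Popen


def extractfiles(zipname, output_dir):
    """Extract files with the 7z command-line tool.

    The -aoa switch overwrites existing files without prompting the user;
    -bd disables the progress indicator.
    """
    logging.info(f"Extracting {zipname} to {output_dir}")
    # Merge stderr into stdout so all of 7z's output comes back in one stream.
    pipe = Popen(["7z", "x", "-aoa", "-bd", zipname, f"-o{output_dir}"], stderr=STDOUT, stdout=PIPE)
    return pipe.communicate()


def register_extensions():
    """Register additional archive formats supported by 7-Zip in shutil."""
    # shutil matches extensions case-sensitively, so both spellings are listed.
    shutil.register_unpack_format('rar', ['.rar', '.RAR'], extractfiles)
    shutil.register_unpack_format('7z', ['.7z', '.7Z'], extractfiles)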

Upvotes: 1

crennie

Reputation: 674

Since Python 3.2 it looks like shutil is adding more archiving functionality, but so far only gztar, bztar, tar and zip are supported.

You can add your own handlers with shutil.register_unpack_format() (shutil.register_archive_format() is its counterpart for creating archives). This way you wouldn't have to detect the extension manually, but you'd still need to define the extraction yourself.
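To illustrate the idea, here is a minimal sketch that registers a pure-Python handler for single-file .gz archives and then lets shutil.unpack_archive() pick the handler from the file extension. The names unpack_gz and register_gz and the format name "gz" are made up for this example; only the shutil and gzip calls come from the standard library.

```python
import gzip
import os
import shutil


def unpack_gz(filename, extract_dir):
    """Decompress a single-file .gz into extract_dir (hypothetical helper)."""
    os.makedirs(extract_dir, exist_ok=True)
    # Strip the trailing ".gz" to get the output name: "data.txt.gz" -> "data.txt".
    target = os.path.join(extract_dir, os.path.basename(filename)[:-len(".gz")])
    with gzip.open(filename, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)


def register_gz():
    """Register the handler once; registering the same name twice raises an error."""
    known = {name for name, _extensions, _description in shutil.get_unpack_formats()}
    if "gz" not in known:
        shutil.register_unpack_format("gz", [".gz"], unpack_gz,
                                      description="single gzip-compressed file")
```

After calling register_gz(), a call such as shutil.unpack_archive("data.txt.gz", "out") should dispatch to unpack_gz. Files ending in .tar.gz still go to the built-in gztar handler, since the built-in formats are checked first.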

Upvotes: 1

niroyb

Reputation: 114

In the standard library, you already have the zlib, gzip and bz2 modules for compressed data, and zipfile and tarfile for archives.

For rar archives, there is the rarfile module on PyPI, which has an interface similar to zipfile and works with Python 2 and 3.
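As a rough sketch of how these modules can replace manual type detection, the function below inspects the file content rather than the extension, using zipfile.is_zipfile() and tarfile.is_tarfile(). The name extract_any is invented for this example, and only zip and tar (plain or compressed) are covered; other formats would need their own branch.

```python
import tarfile
import zipfile


def extract_any(path, dest):
    """Extract a zip or tar archive, chosen by looking at the file content.

    Hypothetical helper: returns the detected kind, or raises ValueError.
    Only use it on archives you trust, since extractall() writes whatever
    paths the archive contains.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest)
        return "zip"
    if tarfile.is_tarfile(path):
        # Mode "r:*" lets tarfile work out gzip, bz2 or xz compression itself.
        with tarfile.open(path, "r:*") as archive:
            archive.extractall(dest)
        return "tar"
    raise ValueError(f"unsupported archive type: {path}")
```

Because detection is based on content, this works even when the file has a missing or misleading extension.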

Upvotes: 1

Sajjan Singh

Reputation: 2553

Check out Patool. I can't attest to how well it works, but a few other modules are built on top of it. Note that it depends on external applications for some formats.

patool supports 7z (.7z), ACE (.ace), ADF (.adf), ALZIP (.alz), APE (.ape), AR (.a), ARC (.arc), ARJ (.arj), BZIP2 (.bz2), CAB (.cab), COMPRESS (.Z), CPIO (.cpio), DEB (.deb), DMS (.dms), FLAC (.flac), GZIP (.gz), LRZIP (.lrz), LZH (.lha, .lzh), LZIP (.lz), LZMA (.lzma), LZOP (.lzo), RPM (.rpm), RAR (.rar), RZIP (.rz), SHN (.shn), TAR (.tar), XZ (.xz), ZIP (.zip, .jar) and ZOO (.zoo) formats. It relies on helper applications to handle those archive formats (for example bzip2 for BZIP2 archives).

Upvotes: 4
