shahjapan

Reputation: 14335

How to create full compressed tar file using Python?

How can I create a .tar.gz file with compression in Python?

Upvotes: 188

Views: 215273

Answers (11)

hau

Reputation: 11

I use this to generate a .tar.gz file that does not contain the top-level folder.

import tarfile
import os

source_location = r'C:\Users\username\Desktop\New folder'
output_name = r'C:\Users\username\Desktop\new.tar.gz'

# ---------------------------------------------------
#  --- output new.tar.gz with 'New folder' inside ---
#  -> new.tar.gz/New folder/aaaa/a.txt 
#  -> new.tar.gz/New folder/bbbb/b.txt
# ---------------------------------------------------
# def make_tarfile(output_filename, source_dir):
#     with tarfile.open(output_filename, "w:gz") as tar:
#         tar.add(source_dir, arcname=os.path.basename(source_dir))


# ---------------------------------------------------
#  --- output new.tar.gz without 'New folder' inside ---
#  -> new.tar.gz/aaaa/a.txt 
#  -> new.tar.gz/bbbb/b.txt
# ---------------------------------------------------
def make_tarfile(output_filename, source_dir):
    with tarfile.open(output_filename, "w:gz") as tar:
        for root, _, files in os.walk(source_dir):
            for file in files:
                file_path = os.path.join(root, file)
                arcname = os.path.relpath(file_path, source_dir)
                tar.add(file_path, arcname=arcname)

try:
    make_tarfile(output_name, source_location)

except Exception as e:
    print(f"Error: {e}")
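The os.walk loop above adds files only, so empty subdirectories are not recorded in the archive. If that matters, one possible variant is to add each top-level entry and let tar.add recurse by itself. This is a minimal sketch, not part of the original answer; the name make_flat_tarfile is hypothetical, and it assumes the output file is written outside source_dir.

```python
import os
import tarfile


def make_flat_tarfile(output_filename, source_dir):
    """Archive the contents of source_dir without the enclosing folder."""
    with tarfile.open(output_filename, "w:gz") as tar:
        # Adding each top-level entry lets tar.add() recurse on its own,
        # so empty subdirectories are stored as well.
        for entry in sorted(os.listdir(source_dir)):
            tar.add(os.path.join(source_dir, entry), arcname=entry)
```

It is called the same way as make_tarfile above: make_flat_tarfile(output_name, source_location).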

Upvotes: 1

Eduardo Lucio

Reputation: 2447

Just restating @George V. Reilly's excellent answer, but in a clearer form...

import tarfile


fd_path = "/some/folder/path/"
fl_name = "some_file_name.ext"
targz_fd_path_n_fl_name = "/some/folder/path/some_file_name.tar.gz"

with tarfile.open(targz_fd_path_n_fl_name, "w:gz") as tar:
    # The second argument (arcname) is the name stored inside the archive.
    tar.add(fd_path + fl_name, fl_name)

As @Brōtsyorfuzthrāx pointed out (in a different way), if you leave out the second argument of the add method, the archive will contain the entire path structure of fd_path + fl_name.

Of course you can use...

import tarfile
import os

fd_path_n_fl_name = "/some/folder/path/some_file_name.ext"
targz_fd_path_n_fl_name = "/some/folder/path/some_file_name.tar.gz"

with tarfile.open(targz_fd_path_n_fl_name, "w:gz") as tar:
    tar.add(fd_path_n_fl_name, os.path.basename(fd_path_n_fl_name))

... if you don't want to use or don't have the folder path and file name separated.

Thanks!🤗

Upvotes: -1

Yitzchak

Reputation: 3416

Best performance, and without the . and .. entries in the compressed file. See the vulnerability warning below:

NOTICE (thanks MaxTruxa):

This answer is vulnerable to shell injection. Please read the security considerations in the subprocess docs. Never pass unescaped strings to subprocess.run, subprocess.call, etc. when shell=True. Use shlex.quote to escape them (Unix shells only).

I only use it locally, so it is good enough for my needs.

import subprocess
subprocess.call(f'tar -cvzf {output_filename} *', cwd=source_dir, shell=True)

The cwd argument changes the working directory before compressing, which solves the issue with the dots.

shell=True allows the wildcard (*) to be used.

This also works recursively for a directory.
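For anyone who cannot accept the shell-injection risk, here is a rough sketch of the same idea without shell=True. It is not from the original answer: the function name make_tarfile_with_tar is hypothetical, it assumes a tar binary on PATH that accepts "--" as an end-of-options marker (GNU tar does), and it approximates the shell's * by listing the non-hidden entries of source_dir. Unlike the one-liner above, a relative output_filename is resolved against the caller's working directory, not source_dir.

```python
import os
import subprocess


def make_tarfile_with_tar(output_filename, source_dir):
    """Run the system tar on the contents of source_dir without a shell."""
    # Resolve the output path first, because tar runs with cwd=source_dir.
    output_path = os.path.abspath(output_filename)
    # Roughly what the shell's * would expand to: non-hidden entries, sorted.
    entries = sorted(e for e in os.listdir(source_dir) if not e.startswith("."))
    if not entries:
        raise ValueError(f"nothing to archive in {source_dir!r}")
    # Arguments are passed as a list, so nothing is interpreted by a shell;
    # "--" stops names that begin with "-" from being read as options.
    subprocess.run(
        ["tar", "-czf", output_path, "--", *entries],
        cwd=source_dir,
        check=True,
    )
    return output_path
```

check=True raises subprocess.CalledProcessError if tar exits with a non-zero status, so a failure is not silently ignored.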

Upvotes: -5

nyxgear

Reputation: 169

shutil.make_archive is very convenient for both files and directories (contents recursively added to the archive):

import shutil

compressed_file = shutil.make_archive(
        base_name='archive',   # archive file name w/o extension
        format='gztar',        # available formats: zip, gztar, bztar, xztar, tar
        root_dir='path/to/dir' # directory to compress
)
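With only root_dir set, the archive holds the contents of that directory at its top level. If you would rather have the directory itself appear as a single top-level folder, shutil.make_archive also accepts base_dir. The following is a small sketch under that assumption; archive_with_top_folder is a hypothetical helper name, not part of shutil.

```python
import os
import shutil


def archive_with_top_folder(source_dir, base_name):
    """Create base_name.tar.gz containing source_dir as one top-level folder."""
    source_dir = os.path.abspath(source_dir)
    return shutil.make_archive(
        base_name=base_name,                    # archive path without extension
        format="gztar",
        root_dir=os.path.dirname(source_dir),   # paths in the archive are relative to this
        base_dir=os.path.basename(source_dir),  # the folder that appears inside the archive
    )
```

The return value is the name of the archive that was written, for example the path ending in backup.tar.gz when base_name ends in backup.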

Upvotes: 9

THAVASI.T

Reputation: 181

To keep the files at the top level of the tar.gz archive, rather than nested inside their directories, use os.path.basename(file_directory).

import os
import tarfile

with tarfile.open("save.tar.gz", "w:gz") as tar:
    for file in ["a.txt", "b.log", "c.png"]:
        # arcname stores only the file name, without any directory part
        tar.add(file, arcname=os.path.basename(file))

Use this to compress files from a directory into a tar.gz file.

Upvotes: 2

ZappedC64

Reputation: 1

Minor correction to @THAVASI.T's answer which omits showing the import of the 'tarfile' library, and does not define the 'tar' object which is used in the third line.

import os
import tarfile

with tarfile.open("save.tar.gz", "w:gz") as tar:
    for file in ["a.txt", "b.log", "c.png"]:
        # arcname stores only the file name, without any directory part
        tar.add(file, arcname=os.path.basename(file))

Upvotes: 0

alper

Reputation: 3390

In addition to @Aleksandr Tukallo's answer, you can also capture the output and the error message (if one occurs). Compressing a folder using tar is explained pretty well in the following answer.

import traceback
import subprocess

try:
    # -c create, -z gzip, -f archive file; -z and -j (bzip2) cannot be combined
    cmd = ['tar', '-czf', output_filename, file_to_archive]
    output = subprocess.check_output(cmd).decode("utf-8").strip()
    print(output)
except Exception:
    print(f"E: {traceback.format_exc()}")

Upvotes: 3

George V. Reilly

Reputation: 16313

To build a .tar.gz (aka .tgz) for an entire directory tree:

import tarfile
import os.path

def make_tarfile(output_filename, source_dir):
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

This will create a gzipped tar archive containing a single top-level folder with the same name and contents as source_dir.
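One thing to watch for: os.path.basename returns an empty string when source_dir ends with a path separator, so the top-level folder name is lost. A possible guard, shown here only as a sketch, is to normalise the path first. The list_tarfile helper is a hypothetical addition for checking the result.

```python
import os
import tarfile


def make_tarfile(output_filename, source_dir):
    # normpath drops a trailing separator, so basename() is not empty
    source_dir = os.path.normpath(source_dir)
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))


def list_tarfile(filename):
    """Return the member names stored in a tar archive."""
    # "r:*" lets tarfile detect the compression on its own
    with tarfile.open(filename, "r:*") as tar:
        return tar.getnames()
```

Calling list_tarfile on the archive should show the folder name followed by its contents, each prefixed with that folder name.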

Upvotes: 320

Aleksandr Tukallo

Reputation: 1427

Previous answers advise using the tarfile Python module to create a .tar.gz file. That is a good, Pythonic solution, but it has a serious drawback: archiving speed. This question mentions that tarfile is approximately two times slower than the tar utility on Linux. In my experience this estimate is about right.

So for faster archiving you can run the tar command via the subprocess module:

import subprocess
subprocess.call(['tar', '-czf', output_filename, file_to_archive])

Upvotes: 26

CNBorn

Reputation: 1358

import tarfile

with tarfile.open("sample.tar.gz", "w:gz") as tar:
    for name in ["file1", "file2", "file3"]:
        tar.add(name)

If you want to create a tar.bz2 compressed file, just replace the file extension with ".tar.bz2" and "w:gz" with "w:bz2".
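Since the extension and the mode string have to be changed together, one option is to derive the mode from the file name. This is only an illustrative sketch: write_mode_for and create_archive are hypothetical names, and the suffix table is an assumption about which extensions you want to support ("w:xz" is another mode tarfile accepts).

```python
import tarfile

# Suffix -> tarfile write mode. ".tar" is listed last, although none of the
# longer suffixes actually end with it.
_WRITE_MODES = {
    ".tar.gz": "w:gz",
    ".tgz": "w:gz",
    ".tar.bz2": "w:bz2",
    ".tar.xz": "w:xz",
    ".tar": "w",
}


def write_mode_for(filename):
    """Pick a tarfile write mode from the archive's file extension."""
    for suffix, mode in _WRITE_MODES.items():
        if filename.lower().endswith(suffix):
            return mode
    raise ValueError(f"unrecognised archive extension: {filename!r}")


def create_archive(filename, names):
    """Add each path in names to a new archive called filename."""
    with tarfile.open(filename, write_mode_for(filename)) as tar:
        for name in names:
            tar.add(name)
```

With this, create_archive("sample.tar.bz2", ["file1", "file2", "file3"]) would write a bzip2-compressed archive without the mode being spelled out at the call site.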

Upvotes: 115

Alex Martelli

Reputation: 881497

You call tarfile.open with mode='w:gz', meaning "Open for gzip compressed writing."

You'll probably want to end the filename (the name argument to open) with .tar.gz, but that doesn't affect compression abilities.

BTW, you usually get better compression with a mode of 'w:bz2', just like tar can usually compress even better with bzip2 than it can compress with gzip.
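How much better depends on the data, so it may be worth measuring on your own files. The sketch below archives the same path once per mode and reports the resulting sizes. compare_compression is a hypothetical helper, and the default list of modes is an assumption; "w:xz" is included only as a further option that tarfile supports.

```python
import os
import tarfile
import tempfile


def compare_compression(path, modes=("w:gz", "w:bz2", "w:xz")):
    """Archive path once per mode and return {mode: archive size in bytes}."""
    path = os.path.normpath(path)
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "test.tar")
        for mode in modes:
            # The same scratch file is overwritten for every mode.
            with tarfile.open(archive, mode) as tar:
                tar.add(path, arcname=os.path.basename(path))
            sizes[mode] = os.path.getsize(archive)
    return sizes
```

For example, compare_compression("/some/folder") returns a dictionary you can print or sort by value to see which mode gave the smallest archive for that folder.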

Upvotes: 35
