drvenom5140

Reputation: 13

Compress Large Files with Python, really fast

I'm using this function to gzip a file:


import gzip
import os
import shutil
def zip_file(path_data, path_zip, File):
    with open(os.path.join(path_data, File), "rb") as f_in, gzip.open(os.path.join(path_zip, File) + ".gz", "wb") as f_out:
        shutil.copyfileobj(f_in, f_out, length=16 * 1024 * 1024)  # copy in 16 MiB chunks

But it takes 1604.954 seconds to gzip a single 14 GB file with 4 columns, and I have to process 96 files like this.

Upvotes: 0

Views: 2246

Answers (1)

Mark Adler

Reputation: 112239

Pass compresslevel=1 to your gzip.open call. Python's gzip.open defaults to level 9, the slowest setting (the gzip command-line tool defaults to 6), so a lower level should be noticeably faster. Experiment with levels from 1 upward and see where you prefer the trade-off between time and compression ratio.

By the way, you shouldn't call it "zip_file". It is not a zip file, which is an entirely different thing from a gzip file. Call it "gzip_file", or something else.
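
As a rough sketch of both suggestions, the function could look something like this. The name gzip_file, the parameter names, and the returned output path are illustrative choices of mine, not anything from the question; the only substantive change is the compresslevel argument, which gzip.open accepts.

```python
import gzip
import os
import shutil


def gzip_file(path_data, path_gz, filename, compresslevel=1):
    """Gzip path_data/filename into path_gz/filename.gz and return the output path.

    compresslevel=1 is the fastest setting; higher values trade speed for size.
    (Function and parameter names are illustrative assumptions.)
    """
    src = os.path.join(path_data, filename)
    dst = os.path.join(path_gz, filename + ".gz")
    with open(src, "rb") as f_in, gzip.open(dst, "wb", compresslevel=compresslevel) as f_out:
        # Stream the file in 16 MiB chunks, as in the question's original code.
        shutil.copyfileobj(f_in, f_out, length=16 * 1024 * 1024)
    return dst
```

Timing one representative file at a few levels (for example 1, 3 and 5) before committing to all 96 should show where the trade-off lands for your data.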

Upvotes: 1
