Reputation: 3106
I have compressed a file into several chunks using 7zip:
HAVE:
foo.txt.gz.001
foo.txt.gz.002
foo.txt.gz.003
foo.txt.gz.004
foo.txt.gz.005
WANT:
foo.txt
How do I unzip and combine these chunks to get a single file using python?
Upvotes: 0
Views: 5945
Reputation: 1
import os, gzip, shutil
dir_name = '/Users/username/Desktop/data'
def gz_extract(directory):
extension = ".gz"
os.chdir(directory)
for item in os.listdir(directory): # loop through items in dir
if item.endswith(extension): # check for ".gz" extension
gz_name = os.path.abspath(item) # get full path of files
file_name = (os.path.basename(gz_name)).rsplit('.',1)[0] #get file name for file within
with gzip.open(gz_name,"rb") as f_in, open(file_name,"wb") as f_out:
shutil.copyfileobj(f_in, f_out)
os.remove(gz_name) # delete zipped file
gz_extract(dir_name)
Upvotes: 0
Reputation: 6575
First, get the list of all files.
files = ['/path/to/foo.txt.gz.001', '/path/to/foo.txt.gz.002', '/path/to/foo.txt.gz.003']
Then iterate over each file and append to a result file.
with open('./result.gz', 'ab') as result: # append in binary mode
for f in files:
with open(f, 'rb') as tmpf: # open in binary mode also
result.write(tmpf.read())
Then extract is using zipfile lib. You could use tempfile to avoid handle with temporary zip file.
Upvotes: 3
Reputation: 1468
First you must extract all the zip files sequentially:
import zipfile
paths = ["path_to_1", "path_to_2" ]
extract_paths = ["path_to_extract1", "path_to_extrac2"]
for i in range(0, paths):
zip_ref = zipfile.ZipFile(paths[i], 'r')
zip_ref.extractall(extract_paths[i])
zip_ref.close()
Next you can go to the extracted location and read()
individual files with open
into a string
. Concatenate those strings and save to foo.txt
.
Upvotes: 0