Reputation: 188
I have .gz zipped files within multiple folders that are all within a main folder called "usa". I was able to extract an individual file using the code below.
import gzip
import shutil
source=r"C:\usauc300.dbf.gz"
output=r"C:\usauc300.dbf"
with gzip.open(source,"rb") as f_in, open(output,"wb") as f_out:
shutil.copyfileobj(f_in, f_out)
I have searched high and low but can't find an equivalent to the command line option gzip -dr.....
which means "decompress recursive" and will go through each folder and extract the contents to the same location while deleting the original zipped file. Does anyone know how I can use python to loop through folders within a folder, find any zipped files and unzip them to the same location while replacing the unzipped file with the zipped one?
Upvotes: 4
Views: 16383
Reputation: 1
Below is a code to extract multiple files in a folder and replace those with unzipped files. This works for .gz files
import os, gzip, shutil
dir_name = r'C:\Users\Desktop\log file working\New folder'
def gz_extract(directory):
extension = ".gz"
os.chdir(directory)
for item in os.listdir(directory): # loop through items in dir
if item.endswith(extension): # check for ".gz" extension
gz_name = os.path.abspath(item) # get full path of files
file_name = (os.path.basename(gz_name)).rsplit('.',1)[0] #get file name for file within
with gzip.open(gz_name,"rb") as f_in, open(file_name,"wb") as f_out:
shutil.copyfileobj(f_in, f_out)
os.remove(gz_name) # delete zipped file
gz_extract(dir_name)
Upvotes: 0
Reputation: 14255
It may not answer this specific question, but for those looking to extract a gzipped directory structure: that would be a job for shutil.unpack_archive.
For example:
import shutil
shutil.unpack_archive(
filename='path/to/archive.tar.gz', extract_dir='where/to/extract/to'
)
Upvotes: 3
Reputation: 181
You can use this format too.
import tarfile, glob
base_dir = '/home/user/pipelines/data_files/'
for name in glob.glob(base_dir + '*.gz'):
print(name)
tf = tarfile.open(name)
tf.extractall(base_dir + 'unzipped_files/')
print('-- Done')
Upvotes: 2
Reputation: 731
I believe that's because gzip never operates over directories, it acts as a compression algorithm unlike zip and tar where we could compress directories. python's implementation of gzip is to operate on files. However recursive traversal of a directory tree is easy if we look at the os.walk call.
(I haven't tested this)
def gunzip(file_path,output_path):
with gzip.open(file_path,"rb") as f_in, open(output_path,"wb") as f_out:
shutil.copyfileobj(f_in, f_out)
def recurse_and_gunzip(root):
walker = os.walk(root)
for root,dirs,files in walker:
for f in files:
if fnmatch.fnmatch(f,"*.gz"):
gunzip(f,f.replace(".gz",""))
Upvotes: 5