Reputation: 57
I would like to create 2 scripts. First would be responsible for traversing through all subdirectories in parent folder, looking for files with extension "*.mp4", "*.txt","*.jpg"
and if folder (for example testfolder
) with such three files is found, another scripts performs operation of creating archive testfolder.tar
.
Here is my directory tree for testing those scripts: https://i.sstatic.net/nplWD.jpg
rootDirectory
contains parentDirectory1
and parentDirectory2
. parentDirectories
contain childDirectories
.
Here is code of dirScanner.py
trying to print extensions of files in subdirs:
import os
rootdir = r'C:\Users\user\pythonprogram\rootDirectory'
for directory in os.walk(rootdir):
for subdirectory in directory:
extensions = []
if os.path.isfile(os.curdir):
extensions.append(os.path.splitext(os.curdir)[-1].lower())
print(extensions)
However it absolutely does not work as I expect it to work.
How should I traverse through parentDirectories
and childDirectiories
in rootDirectory
?
I would like to keep it simple, in the way "Okay I'm in this directory, the files of this directory are XXX, Should/Shouldn't pack them"
Also, this is my other script that should be responsible for packing files for specified path. I'm trying to learn how to use classes however I don't know if I understand it correctly.
import tarfile
class folderNeededToBePacked:
def __init__(self, name, path):
self.path = path
self.name = name
def pack(self):
tar = tarfile.open(r"{0}/{1}.tar".format(self.path, self.name), "w")
for file in self.path:
tar.add(file)
tar.close()
I'd be thankful for all tips and advices how to achieve the goal of this task.
Upvotes: 0
Views: 611
Reputation: 59426
It's a simple straight forward task without many complex concepts which would call for being implemented as a class, so I would not use one for this.
The idea is to walk through all directories (recursively) and if a matching directory is found, pack the three files of this directory into the archive.
To walk through the directory tree you need to fix your usage of 'os.walk()' according to its documentation:
tar = tarfile.open(...)
for dirpath, dirnames, filenames in os.walk(root):
found_files = dir_matching(root, dirpath)
for found_file in found_files:
tar.add(found_file)
tar.close()
And the function dir_matching()
should return a list of the three found files (or an empty list if the directory doesn't match, i.e. at least one of the three necessary files is missing):
def dir_matching(root, dirpath):
jpg = glob.glob(os.path.join(root, dirpath, '*.jpg')
mp4 = glob.glob(os.path.join(root, dirpath, '*.mp4')
txt = glob.glob(os.path.join(root, dirpath, '*.txt')
if jpg and mp4 and txt:
return [ jpg[0], mp4[0], txt[0] ]
else:
return []
Of course you could add more sophisticated checks e.g. whether exactly one jpg etc. is found, but that depends on your concrete specifications.
Upvotes: 1