Reputation: 717
I am new to Python and I am using it to do some data analysis.
My problem is the following: I have a directory with many subdirectories, each one of which contains a large number of data files.
I already wrote a Python script which, when executed in one of those subdirectories, performs the data analysis and writes the result to an output file. The script includes some shell commands that I call using os.system(), so I have to "be" in one of the subdirectories for it to work.
How can I write a function that automatically:
I guess that this could be done in some way using os.walk(), but I didn't really understand how it works.
PS I am aware of the existence of this post but it doesn't solve my problem.
PPS Maybe I should point out that my function does not take the directory name as argument. Actually it takes no argument.
Upvotes: 5
Views: 12771
Reputation: 2543
I was doing something similar: cd into every subdirectory and run git commands, etc. Shortened version:
import os
import subprocess

if __name__ == "__main__":
    # directory the script is run from; the subdirectories live here
    ROOT_PATH = os.getcwd()
    # all files and folders in that directory
    for name in os.listdir(ROOT_PATH):
        dir_path = os.path.join(ROOT_PATH, name)
        # only process subdirectories
        if os.path.isdir(dir_path):
            # cd to subdirectory
            os.chdir(dir_path)
            # could run a script: subprocess.run(["python", "my_script.py"])
            # or you could run all commands here one by one
            # (getoutput expects a single command string, not a list)
            git_log = subprocess.getoutput("git log -n1")
            print(git_log + "\n")
            # move back to the starting directory
            os.chdir(ROOT_PATH)
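The cd-in, run, cd-back pattern above can also be wrapped in a context manager so the original directory is restored even if a command fails. This is only a sketch: working_directory and run_in_each_subdirectory are hypothetical names, and action stands for any function that takes no arguments, as in the question. (Python 3.11 and later ship a similar helper, contextlib.chdir.)

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always return to where we started, even if the body raised.
        os.chdir(previous)


def run_in_each_subdirectory(root, action):
    """Call the no-argument `action` inside each immediate subdirectory of `root`."""
    visited = []
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir():
            with working_directory(entry.path):
                action()
            visited.append(entry.name)
    return visited
```

Calling run_in_each_subdirectory(".", my_analysis) would then run a hypothetical my_analysis() once per subdirectory and return the names it visited.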
Upvotes: 0
Reputation: 591
To change your working directory in Python you need:
os.chdir(your_path)
You can then recursively run your script.
Example Code:
import os

directory_to_check = "your_dir"  # Which directory do you want to start with?

def my_function(directory):
    print("Listing: " + directory)
    print("\t-" + "\n\t-".join(os.listdir(".")))  # List current working directory

# Get all the subdirectories of directory_to_check recursively and store them in a list:
directories = [os.path.abspath(x[0]) for x in os.walk(directory_to_check)]
directories.remove(os.path.abspath(directory_to_check))  # If you don't want your main directory included

for i in directories:
    os.chdir(i)  # Change working directory
    my_function(i)  # Run your function
I don't know how your script works because your question is quite general, so I can only give a general answer.
But I think what you need is:
os.walk alone won't work
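Since os.walk only lists directories and os.chdir only moves into them, the two can be combined. Here is a minimal sketch, assuming the analysis is a function that takes no arguments (called action here, a hypothetical name) and that the starting directory itself should be skipped:

```python
import os


def run_in_all_subdirectories(root, action):
    """Run the no-argument `action` with each nested subdirectory of `root` as the cwd."""
    root = os.path.abspath(root)  # absolute, so chdir calls do not confuse os.walk
    original = os.getcwd()
    visited = []
    try:
        for current, dirs, files in os.walk(root):
            if current == root:
                continue  # skip the starting directory itself
            os.chdir(current)
            action()
            visited.append(os.path.relpath(current, root))
    finally:
        # Go back to where we started, even if action() raised.
        os.chdir(original)
    return visited
```

The returned list holds the visited paths relative to root, which can help when checking that nothing was missed.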
I hope this helps! Good luck!
Upvotes: 3
Reputation: 1
If you want to do a certain action for every sub-folder of a folder, one way is to write a recursive function, processing each directory one at a time. I hope my example helps a little bit: http://pastebin.com/8G7JzcQ2
Upvotes: 0
Reputation: 849
os.walk should work perfectly for what you want to do. Get started with this code and you should see what you need to do:
import os

start_path = r'C:\mystartingpath'
for (path, dirs, files) in os.walk(start_path):
    print("Path:", path)
    print("\nDirs:")
    for d in dirs:
        print('\t' + d)
    print("\nFiles:")
    for f in files:
        print('\t' + f)
    print("----")
What this code will do is show you that os.walk will iterate through all subdirectories of your chosen starting path. Once in each directory, you can get the full path to each file name by concatenating the path and the file name. For example:
path_to_interesting_file = os.path.join(path, filename)
# (This assumes that you saved your filename into a variable called filename)
With the full path to each file, you can perform your analysis while in the os.walk for loop. Add your analysis code so that the for loop is doing more than just printing contents.
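As an illustration of doing real work inside the loop, the sketch below replaces the printing with a small stand-in analysis: counting the files in each directory and summing their sizes. The function name summarize_tree and the choice of analysis are assumptions made for the example, not part of the original script.

```python
import os


def summarize_tree(start_path):
    """Return {directory: (file_count, total_bytes)} for every directory under start_path."""
    summary = {}
    for path, dirs, files in os.walk(start_path):
        total = 0
        for filename in files:
            # Build the full path from the directory and the bare file name.
            full_path = os.path.join(path, filename)
            total += os.path.getsize(full_path)
        summary[path] = (len(files), total)
    return summary
```

Because every file is addressed by its full path, this approach does not need to change the working directory at all.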
Upvotes: 3
Reputation: 153
This can be done like this:
for dir in os.listdir(your_root_directory):
    yourFunction(dir)
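The os.listdir method returns the names of all entries (files as well as directories) directly inside the root directory; it does not descend into them, so filter with os.path.isdir if you only want directories. The os.walk method, however, traverses the directories recursively, which makes it useful for other things; for this task os.listdir might be the better fit.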
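A sketch of the os.listdir route that keeps only directories and returns full paths might look like this (immediate_subdirectories is a hypothetical helper name):

```python
import os


def immediate_subdirectories(root):
    """Return the full paths of the directories directly inside `root`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(root)):
        full_path = os.path.join(root, name)  # listdir returns bare names, not paths
        if os.path.isdir(full_path):
            paths.append(full_path)
    return paths
```

Joining each name with root matters: a bare name is only a valid path if the current working directory happens to be root.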
However, for the sake of completeness, here is an os.walk option:
for dir in next(os.walk(your_directory))[1]:
    yourFunction(dir)
Notice that os.walk is a generator, hence the next call. The first next call produces a tuple (root, dirs, files), and the root in this case is your directory. You are only interested in dirs, the list of subdirectories, so you index [1].
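To make the tuple explicit, here is a small sketch that unpacks the first item produced by os.walk instead of indexing it; first_level is a hypothetical name, and the sorting is only there to make the result predictable:

```python
import os


def first_level(directory):
    """Return (root, dirs, files) for `directory` only, without descending further."""
    root, dirs, files = next(os.walk(directory))
    return root, sorted(dirs), sorted(files)
```

One caveat: if the directory does not exist, os.walk yields nothing by default, so next raises StopIteration rather than a file-not-found error.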
Upvotes: 1